gpt-engineer/gpt_engineer/chat_to_files.py
UmerHA 19a4c10b6e Langchain integration (#512)
* Added LangChain integration

* Fixed issue created by git checkin process

* Added ':' to characters to remove from end of file path

* Tested initial migration to LangChain, removed comments and logging used for debugging

* Converted camelCase to snake_case

* Turns out we need the exception handling

* Testing Hugging Face Integrations via LangChain

* Added LangChain loadable models

* Renames "qa" prompt to "clarify", since it's used in the "clarify" step, asking for clarification

* Fixed loading model yaml files

* Fixed streaming

* Added modeldir cli option

* Fixed typing

* Fixed interaction with token logging

* Fix spelling + dependency issues + typing

* Fix spelling + tests

* Removed unneeded logging which caused test to fail

* Cleaned up code

* Incorporated feedback

- deleted unnecessary functions & logger.info
- used LangChain ChatLLM instead of LLM to naturally communicate with gpt-4
- deleted loading model from yaml file, as LC doesn't offer this for ChatModels

* Update gpt_engineer/steps.py

Co-authored-by: Anton Osika <anton.osika@gmail.com>

* Incorporated feedback

- Fixed failing test
- Removed parsing complexity by using # type: ignore
- Replaced every occurrence of ai.last_message_content with its content

* Fixed test

* Update gpt_engineer/steps.py

---------

Co-authored-by: H <holden.robbins@gmail.com>
Co-authored-by: Anton Osika <anton.osika@gmail.com>
2023-07-23 23:30:09 +02:00


import re


def parse_chat(chat):  # -> List[Tuple[str, str]]:
    # Get all ``` blocks and preceding filenames
    regex = r"(\S+)\n\s*```[^\n]*\n(.+?)```"
    matches = re.finditer(regex, chat, re.DOTALL)

    files = []
    for match in matches:
        # Strip characters that are not allowed in file names (: < > " | ? *)
        path = re.sub(r'[\:<>"|?*]', "", match.group(1))

        # Remove leading and trailing brackets
        path = re.sub(r"^\[(.*)\]$", r"\1", path)

        # Remove leading and trailing backticks
        path = re.sub(r"^`(.*)`$", r"\1", path)

        # Remove a trailing ] or :
        path = re.sub(r"[\]\:]$", "", path)

        # Get the code
        code = match.group(2)

        # Add the file to the list
        files.append((path, code))

    # Get all the text before the first ``` block
    readme = chat.split("```")[0]
    files.append(("README.md", readme))

    # Return the files
    return files


def to_files(chat, workspace):
    # Keep the raw model output alongside the extracted files
    workspace["all_output.txt"] = chat

    files = parse_chat(chat)
    for file_name, file_content in files:
        workspace[file_name] = file_content
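
The following is a minimal sketch of how the parsing above behaves on a small sample reply. It is not part of the file: `clean_path`, `extract_files`, and the sample text are hypothetical names introduced for illustration, and a plain dict is assumed to be an acceptable stand-in for the workspace object, whose real type is not shown here. The fence marker is built indirectly only so that the example can itself sit inside a fenced block.

```python
import re

# Three backticks, built indirectly so this example can live inside a fenced block.
FENCE = "`" * 3

# Same pattern as parse_chat: a non-space token, a newline, a fence line, then the body.
BLOCK_RE = re.compile(r"(\S+)\n\s*" + FENCE + r"[^\n]*\n(.+?)" + FENCE, re.DOTALL)


def clean_path(raw: str) -> str:
    """Hypothetical helper: the four substitutions from parse_chat, in the same order."""
    path = re.sub(r'[\:<>"|?*]', "", raw)       # drop characters not allowed in file names
    path = re.sub(r"^\[(.*)\]$", r"\1", path)   # unwrap [name]
    path = re.sub(r"^`(.*)`$", r"\1", path)     # unwrap `name`
    path = re.sub(r"[\]\:]$", "", path)         # drop a trailing ] or :
    return path


def extract_files(chat: str) -> dict:
    """Hypothetical stand-in for to_files, writing into a plain dict."""
    workspace = {"all_output.txt": chat}
    for match in BLOCK_RE.finditer(chat):
        workspace[clean_path(match.group(1))] = match.group(2)
    # Everything before the first fence becomes the README, as in parse_chat.
    workspace["README.md"] = chat.split(FENCE)[0]
    return workspace


sample = (
    "Here is the program.\n\n"
    "main.py\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\n\n"
    "[utils.py]:\n" + FENCE + "python\nX = 1\n" + FENCE + "\n"
)

result = extract_files(sample)
print(sorted(result))       # ['README.md', 'all_output.txt', 'main.py', 'utils.py']
print(result["utils.py"])   # X = 1
```

Note that the README text is everything before the first fence, so in this sample it also contains the `main.py` filename line that precedes the first block.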