Add support for directory paths in filenames and improve code splitting

- Enforce an explicit markdown code block format
- Add a token to split the output to clearly detect when the code blocks start
- Save all non-code output to a `README.md` file
- Update RegEx to extract and strip text more reliably and clean up the output
- Update the identify prompts appropriately
This commit is contained in:
Enzo Martin
2023-06-18 10:06:36 +02:00
parent e29e7bec2f
commit e7df947b98
3 changed files with 48 additions and 29 deletions

View File

@@ -1,28 +1,41 @@
import re
from typing import List, Tuple
from gpt_engineer.db import DB
def parse_chat(chat):# -> List[Tuple[str, str]]:
# Split the chat into sections by the '*CODEBLOCKSBELOW*' token
split_chat = chat.split('*CODEBLOCKSBELOW*')
def parse_chat(chat) -> List[Tuple[str, str]]:
# Get all ``` blocks
regex = r"```(.*?)```"
# Check if the '*CODEBLOCKSBELOW*' token was found
is_token_found = len(split_chat) > 1
matches = re.finditer(regex, chat, re.DOTALL)
# If the '*CODEBLOCKSBELOW*' token is found, use the first part as README and second part as code blocks.
# Otherwise, treat README as optional and proceed with empty README and the entire chat as code blocks
readme = split_chat[0].strip() if is_token_found else 'No readme'
code_blocks = split_chat[1] if is_token_found else chat
# Get all ``` blocks and preceding filenames
regex = r"\[(.*?)\]\s*```.*?\n(.*?)```"
matches = re.finditer(regex, code_blocks, re.DOTALL)
files = []
for match in matches:
path = match.group(1).split("\n")[0]
# Strip the filename of any non-allowed characters and convert / to \
path = re.sub(r'[<>"|?*]', '', match.group(1))
# Get the code
code = match.group(1).split("\n")[1:]
code = "\n".join(code)
code = match.group(2)
# Add the file to the list
files.append((path, code))
# Add README to the list
files.append(('README.txt', readme))
# Return the files
return files
def to_files(chat: str, workspace: DB):
workspace["all_output.txt"] = chat
def to_files(chat, workspace):
workspace['all_output.txt'] = chat
files = parse_chat(chat)
for file_name, file_content in files:

View File

@@ -1,15 +1,18 @@
You will get instructions for code to write.
You will write a very long answer. Make sure that every detail of the architecture is, in the end, implemented as code.
Following best practices and formatting for a README.md file, you will write a very long answer, make sure to provide the instructions on how to run the code.
Make sure that every detail of the architecture is, in the end, implemented as code.
You will first lay out the names of the core classes, functions, methods that will be necessary, As well as a quick comment on their purpose.
Then you will output the content of each file, with syntax below.
(You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.)
Before you start outputting the code, you will output a seperator in the form of a line containing "*CODEBLOCKSBELOW*"
Make sure to create any appropriate module dependency or package manager dependency definition file.
Then you will reformat and output the content of each file strictly following a markdown code block format, where the following tokens should be replaced such that [FILENAME] is the lowercase file name including the file extension, [LANG] is the markup code block language for the code's language, and [CODE] is the comments and code:
[FILENAME]
```[LANG]
[CODE]
```
You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.
Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.
Ensure to implement all code, if you are unsure, write a plausible implementation.
Before you finish, double check that all parts of the architecture is present in the files.
File syntax:
```file.py/ts/html
[ADD YOUR CODE HERE]
```

View File

@@ -1,13 +1,16 @@
Please now remember the steps:
First lay out the names of the core classes, functions, methods that will be necessary, As well as a quick comment on their purpose.
Then output the content of each file, with syntax below.
(You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.)
Make sure that files contain all imports, types, variables etc. The code should be fully functional. If anything is unclear, just make assumptions. Make sure that code in different files are compatible with each other.
Before you finish, double check that all parts of the architecture is present in the files.
File syntax:
```filename.py/ts/html
[ADD YOUR CODE HERE]
Make sure to provide instructions for running the code.
Before you start outputting the code, you will output a seperator in the form of a line containing "*CODEBLOCKSBELOW*"
Make sure to create any appropriate module dependency or package manager dependency definition file.
Then you will reformat and output the content of each file strictly following a markdown code block format, where the following tokens should be replaced such that [FILENAME] is the lowercase file name including the file extension, [LANG] is the markup code block language for the code's language, and [CODE] is the comments and code:
[FILENAME]
```[LANG]
[CODE]
```
You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.
Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. The code should be fully functional. Make sure that code in different files are compatible with each other.
Before you finish, double check that all parts of the architecture is present in the files.