mirror of
https://github.com/aljazceru/gpt-engineer.git
synced 2025-12-17 12:45:26 +01:00
Update ROADMAP.md
ROADMAP.md: 21 lines changed
@@ -7,40 +7,39 @@ There are three main milestones we believe will greatly increase gpt-engineer's
 
 ## Our current focus:
 
-- [x] Continuous evaluation of our progress 🎉
+- [x] **Continuous evaluation of our progress 🎉**
 - [x] Create a step that asks “did it run/work/perfect” in the end of each run [#240](https://github.com/AntonOsika/gpt-engineer/issues/240) 🎉
 - [x] Collect a dataset for gpt engineer to learn from, by storing code generation runs 🎉
 - [ ] Run the benchmark multiple times, and document the results for the different "step configs" [#239](https://github.com/AntonOsika/gpt-engineer/issues/239)
 - [ ] Document the best performing configs
-- [ ] Self healing code
+- [ ] **Self healing code**
 - [ ] Run the generated tests
 - [ ] Feed the results of failing tests back into LLM and ask it to fix the code
-- [ ] Let human give feedback
+- [ ] **Let human give feedback**
 - [ ] Ask human for what is not working as expected in a loop, and feed it into GPT4 to fix the code, until the human is happy
-- [ ] Improve existing projects
+- [ ] **Improve existing projects**
 - [ ] Decide on the "flow" for the CLI commands and where the project files are created
 - [ ] Add an "improve code" command
 - [ ] Design how gpt-engineer becomes a platform
 - [ ] Integrate Aider
 
 ## Experimental research
-This is not current focus, but if you are interested in experimenting, please
-create a thread in our discord share your intentions in Discord's #general, and your findings as you
-go along.
-- [ ] Make code generation become small, verifiable steps
+This is not our current focus, but if you are interested in experimenting: Please
+create a thread in Discord #general and share your intentions and your findings as you
+go along. High impact examples:
+- [ ] **Make code generation become small, verifiable steps**
 - [ ] Ask GPT4 to decide how to sequence the entire generation, and do one
 prompt for each subcomponent
 - [ ] For each small part, generate tests for that subpart, and do the loop of running the tests for each part, feeding
 results into GPT4, and let it edit the code until they pass
-- [ ] Ad hoc experiments
+- [ ] **Ad hoc experiments**
 - [ ] Try Microsoft guidance, and benchmark if this helps improve performance
 - [ ] Dynamic planning: Let gpt-engineer plan which "steps" to carry out itself, depending on the
 task, by giving it few shot example of what are usually "the right-sized steps" to carry
 out for such projects
 
 ## Codebase improvements
-By improving the codebase and developer ergonomics we accelerate development
-acroess the board. A lot can be done, here are some examples:
+By improving the codebase and developer ergonomics, we accelerate progress. Some examples:
 - [ ] Set up automatic PR review for all PRs with e.g. Codium pr-agent
 - [ ] LLM tests in CI: Run super small tests with GPT3.5 in CI, that check that simple code generation still works
 