From bdd2cb040f9e8d4b5acd29b047c11ca3cefea974 Mon Sep 17 00:00:00 2001
From: Anton Osika
Date: Thu, 13 Jul 2023 08:58:15 +0200
Subject: [PATCH] Update ROADMAP.md

---
 ROADMAP.md | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/ROADMAP.md b/ROADMAP.md
index 90f9f11..fd8994d 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -7,40 +7,39 @@ There are three main milestones we believe will greatly increase gpt-engineer's

 ## Our current focus:

-- [x] Continuous evaluation of our progress 🎉
+- [x] **Continuous evaluation of our progress 🎉**
   - [x] Create a step that asks “did it run/work/perfect” in the end of each run [#240](https://github.com/AntonOsika/gpt-engineer/issues/240) 🎉
   - [x] Collect a dataset for gpt engineer to learn from, by storing code generation runs 🎉
   - [ ] Run the benchmark multiple times, and document the results for the different "step configs" [#239](https://github.com/AntonOsika/gpt-engineer/issues/239)
   - [ ] Document the best performing configs
-- [ ] Self healing code
+- [ ] **Self healing code**
   - [ ] Run the generated tests
   - [ ] Feed the results of failing tests back into LLM and ask it to fix the code
-- [ ] Let human give feedback
+- [ ] **Let human give feedback**
   - [ ] Ask human for what is not working as expected in a loop, and feed it into GPT4 to fix the code, until the human is happy
-- [ ] Improve existing projects
+- [ ] **Improve existing projects**
   - [ ] Decide on the "flow" for the CLI commands and where the project files are created
   - [ ] Add an "improve code" command
   - [ ] Design how gpt-engineer becomes a platform
   - [ ] Integrate Aider

 ## Experimental research

-This is not current focus, but if you are interested in experimenting, please
-create a thread in our discord share your intentions in Discord's #general, and your findings as you
-go along.
-- [ ] Make code generation become small, verifiable steps
+This is not our current focus, but if you are interested in experimenting: Please
+create a thread in Discord #general and share your intentions and your findings as you
+go along. High impact examples:
+- [ ] **Make code generation become small, verifiable steps**
   - [ ] Ask GPT4 to decide how to sequence the entire generation, and do one prompt for each subcomponent
   - [ ] For each small part, generate tests for that subpart, and do the loop of running the tests for each part, feeding results into GPT4, and let it edit the code until they pass
-- [ ] Ad hoc experiments
+- [ ] **Ad hoc experiments**
   - [ ] Try Microsoft guidance, and benchmark if this helps improve performance
   - [ ] Dynamic planning: Let gpt-engineer plan which "steps" to carry out itself, depending on the task, by giving it few shot example of what are usually "the right-sized steps" to carry out for such projects

 ## Codebase improvements

-By improving the codebase and developer ergonomics we accelerate development
-acroess the board. A lot can be done, here are some examples:
+By improving the codebase and developer ergonomics, we accelerate progress. Some examples:

 - [ ] Set up automatic PR review for all PRs with e.g. Codium pr-agent
 - [ ] LLM tests in CI: Run super small tests with GPT3.5 in CI, that check that simple code generation still works