Capability improvement roadmap

  • Continuous capability measurements
    • Create a step that asks “did it run/work/perfect?” at the end #240 (sketched below)
    • Run the benchmark repeatedly and document the results for the different "step configs" (STEPS in steps.py) #239 (sketched below)
    • Document the best-performing configs and feed the findings back into this roadmap
  • Self healing code
    • Feed the results of failing tests back into GPT-4 and ask it to fix the code (sketched below)
  • Let human give feedback
    • In a loop, ask the human what is not working as expected, feed the answer into GPT-4 to fix the code, and repeat until the human is happy or gives up (sketched below)
  • Break down the code generation into much smaller parts
    • For each small part, generate tests for each subpart, then loop: run the tests, feed failures into GPT-4, and let it edit the code until the tests pass (sketched below)
  • LLM tests in CI
    • Run very small tests with GPT-3.5 in CI to make sure we don't worsen performance over time (sketched below)
  • Dynamic planning
    • Let gpt-engineer plan which "steps" to carry out itself, depending on the task, by giving it few-shot examples of "right-sized steps" used for other projects (sketched below)
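
A minimal sketch of the “did it run/work/perfect?” step: ask the human at the end of a run and append the answer to a log that the capability measurements can aggregate. The `(ai, dbs)` signature and the return value mirror the step functions in steps.py, and the log path is an assumption.

```python
# Sketch of a final feedback step; signature and log location are assumptions.
import json

def collect_feedback(ai, dbs):
    """Ask the human whether the result ran / worked / was perfect, and log it."""
    answer = input("Did the generated code run, work, or was it perfect? ")
    with open("feedback_log.jsonl", "a") as f:
        f.write(json.dumps({"verdict": answer}) + "\n")
    return []  # step functions return the messages they produced; none here
```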
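
For the repeated benchmarking, something along these lines could work; the config names and `run_benchmark` are placeholders to be wired to the real benchmark entry point, not gpt-engineer APIs.

```python
# Run the benchmark several times per step config and record the results.
import json
import statistics

STEP_CONFIGS = ["default", "benchmark", "tdd"]  # hypothetical config names

def run_benchmark(config: str) -> float:
    """Run the benchmark suite once; return a pass rate in [0, 1]."""
    raise NotImplementedError("wire this to the real benchmark entry point")

results = {}
for config in STEP_CONFIGS:
    scores = [run_benchmark(config) for _ in range(5)]  # repeat to reduce noise
    results[config] = {"mean": statistics.mean(scores), "runs": scores}

with open("benchmark_results.json", "w") as f:
    json.dump(results, f, indent=2)  # document the results for the roadmap
```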
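
The self-healing loop could look roughly like this. It assumes a pytest suite, a single-file project, and the 2023-era `openai` Python client, all of which are simplifications.

```python
# Self-healing sketch: run the tests, feed failures to GPT-4, apply the fix,
# repeat until the tests pass or we give up.
import subprocess
import openai

SRC = "main.py"  # illustrative; real projects span many files

for attempt in range(5):  # cap the number of repair attempts
    result = subprocess.run(["pytest", "-x"], capture_output=True, text=True)
    if result.returncode == 0:
        break  # tests pass: the code is healed
    code = open(SRC).read()
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content":
                   f"These tests failed:\n{result.stdout}\n\n"
                   f"Fix the code below and return only the full file:\n{code}"}],
    )
    open(SRC, "w").write(resp["choices"][0]["message"]["content"])
```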
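
The human-feedback loop has the same shape, with the test run replaced by a question to the user; again, the file layout and prompt are illustrative.

```python
# Human-in-the-loop repair: ask what is wrong, let GPT-4 fix it, repeat until
# the human is happy (or gives up by answering nothing).
import openai

SRC = "main.py"  # illustrative

while True:
    feedback = input("What is not working as expected? (empty if happy) ")
    if not feedback:
        break
    code = open(SRC).read()
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content":
                   f"The user reports: {feedback}\n\n"
                   f"Fix the code below and return only the full file:\n{code}"}],
    )
    open(SRC, "w").write(resp["choices"][0]["message"]["content"])
```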
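
Breaking generation into small parts could combine both ideas: generate tests per part first, then run a repair loop per part. The part names, file layout, and prompts are invented for illustration.

```python
# Per-part generation: tests first, then iterate the implementation against them.
import pathlib
import subprocess
import openai

def ask(prompt: str) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}])
    return resp["choices"][0]["message"]["content"]

pathlib.Path("tests").mkdir(exist_ok=True)
pathlib.Path("src").mkdir(exist_ok=True)

for part in ["parser", "scheduler", "renderer"]:  # hypothetical decomposition
    tests = pathlib.Path(f"tests/test_{part}.py")
    src = pathlib.Path(f"src/{part}.py")
    tests.write_text(ask(f"Write pytest tests for a {part} module. Code only."))
    src.write_text(ask(f"Implement the {part} module. Code only."))
    for _ in range(5):  # per-part repair loop
        run = subprocess.run(["pytest", str(tests)], capture_output=True, text=True)
        if run.returncode == 0:
            break
        src.write_text(ask(f"Tests failed:\n{run.stdout}\n"
                           f"Fix and return the full file:\n{src.read_text()}"))
```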
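
A CI test with GPT-3.5 could be as small as a single pytest case; the prompt and assertion here are only one possible smoke check.

```python
# Tiny CI smoke test: cheap enough to run on every commit with GPT-3.5.
import openai

def test_model_returns_python_function():
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content":
                   "Write a Python function add(a, b) returning a + b. Code only."}],
    )
    code = resp["choices"][0]["message"]["content"]
    # A real test would strip markdown fences and exec the snippet; a substring
    # check is enough to catch gross regressions.
    assert "def add" in code
```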
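
Dynamic planning reduces to a few-shot prompt mapping tasks to step lists. The step names below echo steps.py, but the examples themselves are invented.

```python
# Few-shot planning sketch: show task -> steps examples, then ask for the
# steps for the new task. Step names and examples are illustrative.
import openai

FEW_SHOT = """\
Task: small script with a clear spec
Steps: gen_code, gen_entrypoint

Task: larger app with a vague spec
Steps: clarify, gen_spec, gen_code, gen_entrypoint
"""

def plan_steps(task: str) -> list[str]:
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": FEW_SHOT + f"\nTask: {task}\nSteps:"}],
    )
    return [s.strip() for s in
            resp["choices"][0]["message"]["content"].split(",")]
```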

How you can help out

You can:

Experiments we can try

  • Try Microsoft's guidance library, and benchmark whether it improves the generated code (sketched below)
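
A possible starting point for the guidance experiment, based on the guidance 0.0.x API as of mid-2023 (the syntax may have changed since); the task and prompt are illustrative. The property worth benchmarking is whether constraining the output structure beats plain prompting.

```python
# Guidance sketch (0.0.x-era API): constrain the model's output structure.
import guidance

guidance.llm = guidance.llms.OpenAI("gpt-3.5-turbo")

program = guidance("""\
{{#system~}}
You write one-line shell commands.
{{~/system}}
{{#user~}}
Task: {{task}}
{{~/user}}
{{#assistant~}}
{{gen 'command' max_tokens=30}}
{{~/assistant}}""")

result = program(task="count lines in all Python files")
print(result["command"])
```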