# Auto-GPT Benchmark

A repo built to benchmark the performance of agents far and wide, regardless of how they are set up and how they work.

## Scores

Radar chart for each agent coming soon!

## Detailed results

:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

### Interface

| Task         | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|--------------|----------|--------------------|----------|--------------------|
| Write File   | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Read File    | :x:      | :x:                | tbd      | :x:                |
| Search File  | :x:      | :x:                | tbd      | :x:                |

### Code

| Task                               | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|------------------------------------|----------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance    | :x:      | :x:                | tbd      | :x:                |
| Debug Simple Typo Without Guidance | :x:      | :x:                | tbd      | :x:                |
| Basic Code Generation              | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Create Simple Web Server           | :x:      | :x:                | tbd      | :x:                |

### Memory

| Task                                       | Auto-GPT |
|--------------------------------------------|----------|
| Basic Memory                               | :x:      |
| Remember Multiple Ids                      | :x:      |
| Remember Multiple Ids With Noise           | :x:      |
| Remember Multiple Phrases With Noise       | :x:      |