merwanehamadi dab4e90e15 Update Auto-GPT score (#106)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-15 09:53:56 -07:00

# Auto-GPT Benchmark
A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.
## Scores:
Radar chart for each agent coming soon!
## Detailed results
:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.
### Interface
| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
|--------------|----------|--------------------|----------|--------------------|
| Write File | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Read File | :x: | :x: | tbd | :x: |
| Search File | :x: | :x: | tbd | :x: |
### Code
| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
|------------------------------------|----------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance | :x: | :x: | tbd | :x: |
| Debug Simple Typo Without Guidance | :x: | :x: | tbd | :x: |
| Basic Code Generation | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Create Simple Web Server | :x: | :x: | tbd | :x: |
### Memory
| Task | Auto-GPT |
|--------------------------------------------|----------|
| Basic Memory | :x: |
| Remember Multiple Ids | :x: |
| Remember Multiple Ids With Noise | :x: |
| Remember Multiple Phrases With Noise | :x: |
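
As a rough illustration of how per-agent scores could be tallied from result tables like the ones above, here is a minimal Python sketch. It is not part of the benchmark itself: the `tally` helper and the `PASS`/`FAIL` constants are hypothetical names introduced for this example, and it assumes that `tbd` cells should be left out of the count rather than treated as failures.

```python
# Hypothetical helper (not part of the benchmark repo): tally pass/fail
# marks from a markdown results table.
PASS, FAIL = ":white_check_mark:", ":x:"

# Copy of the Interface table from this README.
INTERFACE_TABLE = """\
| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
|--------------|----------|--------------------|----------|--------------------|
| Write File | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Read File | :x: | :x: | tbd | :x: |
| Search File | :x: | :x: | tbd | :x: |
"""


def tally(table: str) -> dict[str, tuple[int, int]]:
    """Return {agent: (passed, scored)}.

    Cells that are neither a pass nor a fail mark (e.g. "tbd") are skipped,
    which is an assumption made for this sketch.
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---| separator row
    agents = header[1:]               # first column holds the task name
    counts = {agent: [0, 0] for agent in agents}
    for row in body:
        for agent, cell in zip(agents, row[1:]):
            if cell in (PASS, FAIL):
                counts[agent][1] += 1
                counts[agent][0] += int(cell == PASS)
    return {agent: (passed, scored) for agent, (passed, scored) in counts.items()}


if __name__ == "__main__":
    for agent, (passed, scored) in tally(INTERFACE_TABLE).items():
        print(f"{agent}: {passed}/{scored}")
```

The same function can be applied to the Code and Memory tables by passing their markdown text instead.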