# Auto-GPT Benchmark

A repo built for benchmarking the performance of agents, regardless of how they are set up and how they work.

## Scores:

Radar chart for each agent coming soon!

## Detailed results

:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

### Interface

| Task        | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|-------------|----------|--------------------|----------|--------------------|
| Write File  | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Read File   | :x:      | :x:                | tbd      | :x:                |
| Search File | :x:      | :x:                | tbd      | :x:                |

### Code

| Task                               | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|------------------------------------|----------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance    | :x:      | :x:                | tbd      | :x:                |
| Debug Simple Typo Without Guidance | :x:      | :x:                | tbd      | :x:                |
| Basic Code Generation              | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Create Simple Web Server           | :x:      | :x:                | tbd      | :x:                |

### Memory

| Task                                 | Auto-GPT |
|--------------------------------------|----------|
| Basic Memory                         | :x:      |
| Remember Multiple Ids                | :x:      |
| Remember Multiple Ids With Noise     | :x:      |
| Remember Multiple Phrases With Noise | :x:      |