Auto-GPT Benchmark
A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.
Scores:
Radar chart for each agent coming soon!
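Until the charts are available, the per-category pass counts can be tallied directly from the detailed results tables in this README. The snippet below is a minimal sketch, not the benchmark's official scoring: the `RESULTS` dictionary and the `pass_counts` helper are hypothetical names introduced for illustration, and the values are transcribed by hand from the tables (`True` for ✅, `False` for ❌). mini-agi is omitted because no results are listed for it yet.

```python
# Hand-transcribed from the README tables; True = pass (✅), False = fail (❌).
# RESULTS and pass_counts are illustrative names, not part of the benchmark code.
INTERFACE_WRITE_ONLY = {"Write File": True, "Read File": False, "Search File": False}
CODE_GENERATION_ONLY = {
    "Debug Simple Typo With Guidance": False,
    "Debug Simple Typo Without Guidance": False,
    "Basic Code Generation": True,
    "Create Simple Web Server": False,
}

RESULTS = {
    "Auto-GPT": {
        "Interface": {"Write File": True, "Read File": True, "Search File": False},
        "Code": dict(CODE_GENERATION_ONLY),
        "Memory": {
            "Basic Memory": True,
            "Remember Multiple Ids": False,
            "Remember Multiple Ids With Noise": False,
            "Remember Multiple Phrases With Noise": False,
        },
    },
    "gpt-engineer": {
        "Interface": dict(INTERFACE_WRITE_ONLY),
        "Code": dict(CODE_GENERATION_ONLY),
    },
    "smol-developer": {
        "Interface": dict(INTERFACE_WRITE_ONLY),
        "Code": dict(CODE_GENERATION_ONLY),
    },
}


def pass_counts(agent_results):
    """Return {category: (passed, total)} for one agent's results."""
    return {
        category: (sum(tasks.values()), len(tasks))
        for category, tasks in agent_results.items()
    }


if __name__ == "__main__":
    for agent, categories in RESULTS.items():
        for category, (passed, total) in pass_counts(categories).items():
            print(f"{agent:<15} {category:<10} {passed}/{total}")
```

Since the results are still changing, the dictionary would need to be updated by hand whenever the tables change.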
Detailed results
⚠️ These results are constantly evolving at the moment. We will publish an official benchmark result very soon.
Auto-GPT
Interface
| Task | Results |
|---|---|
| Write File | ✅ |
| Read File | ✅ |
| Search File | ❌ |
Code
| Task | Results |
|---|---|
| Debug Simple Typo With Guidance | ❌ |
| Debug Simple Typo Without Guidance | ❌ |
| Basic Code Generation | ✅ |
| Create Simple Web Server | ❌ |
Memory
| Task | Results |
|---|---|
| Basic Memory | ✅ |
| Remember Multiple Ids | ❌ |
| Remember Multiple Ids With Noise | ❌ |
| Remember Multiple Phrases With Noise | ❌ |
gpt-engineer
Interface
| Task | Results |
|---|---|
| Write File | ✅ |
| Read File | ❌ |
| Search File | ❌ |
Code
| Task | Results |
|---|---|
| Debug Simple Typo With Guidance | ❌ |
| Debug Simple Typo Without Guidance | ❌ |
| Basic Code Generation | ✅ |
| Create Simple Web Server | ❌ |
mini-agi
Coming Soon!
smol-developer
Interface
| Task | Results |
|---|---|
| Write File | ✅ |
| Read File | ❌ |
| Search File | ❌ |
Code
| Task | Results |
|---|---|
| Debug Simple Typo With Guidance | ❌ |
| Debug Simple Typo Without Guidance | ❌ |
| Basic Code Generation | ✅ |
| Create Simple Web Server | ❌ |