Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are still changing. We will publish an official benchmark result very soon.

Interface

Task Auto-GPT gpt-engineer mini-agi smol-developer
Write File tbd
Read File tbd
Search File tbd

Code

Task Auto-GPT gpt-engineer mini-agi smol-developer
Debug Simple Typo With Guidance tbd
Debug Simple Typo Without Guidance tbd
Basic Code Generation tbd
Create Simple Web Server tbd

Memory

Task Auto-GPT
Basic Memory
Remember Multiple Ids
Remember Multiple Ids With Noise
Remember Multiple Phrases With Noise