# Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

## Scores
Radar chart for each agent coming soon!

## Detailed results
:warning: These results are still changing frequently. We will publish official benchmark results very soon.

Interface

| Task         | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|--------------|----------|--------------------|----------|--------------------|
| Write File   | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Read File    | :x:      | :x:                | tbd      | :x:                |
| Search File  | :x:      | :x:                | tbd      | :x:                |


Code

| Task                               | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|------------------------------------|----------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance    | :x:      | :x:                | tbd      | :x:                |
| Debug Simple Typo Without Guidance | :x:      | :x:                | tbd      | :x:                |
| Basic Code Generation              | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Create Simple Web Server           | :x:      | :x:                | tbd      | :x:                |


Memory

| Task                                       | Auto-GPT |
|--------------------------------------------|----------|
| Basic Memory                               | :x:      |
| Remember Multiple Ids                      | :x:      |
| Remember Multiple Ids With Noise           | :x:      |
| Remember Multiple Phrases With Noise       | :x:      |