aljaz/Auto-GPT

mirror of https://github.com/aljazceru/Auto-GPT.git synced 2026-02-10 00:34:30 +01:00


Auto-GPT-Bot 622e0a2d62 smol-developer-20230720081909

2023-07-20 08:19:09 +00:00

Integrate Beebot (#169)

2023-07-19 13:37:29 -07:00

init agbenchmark

2023-06-18 11:14:54 -04:00

Fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling (#166)

2023-07-17 19:41:58 -07:00

Change beebot submodule (#170)

2023-07-19 14:53:42 -07:00

gpt-engineer-20230716225908

2023-07-16 22:59:08 +00:00

smol-developer-20230720081909

2023-07-20 08:19:09 +00:00

.env.example

Dynamic home path for runs (#119)

2023-07-16 18:24:06 -07:00

.flake8

Add static linters ci (#45)

2023-07-02 16:14:49 -04:00

.gitignore

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

.gitmodules

Change beebot submodule (#170)

2023-07-19 14:53:42 -07:00

.python-version

Add static linters ci (#45)

2023-07-02 16:14:49 -04:00

json_to_base_64.py

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

LICENSE

init agbenchmark

2023-06-18 11:14:54 -04:00

mypy.ini

Added --test, consolidate files, reports working (#83)

2023-07-10 19:25:19 -07:00

poetry.lock

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

pyproject.toml

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

README.md

Update Auto-GPT score (#106)

2023-07-15 09:53:56 -07:00

send_to_googledrive.py

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00


Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are still changing. We will publish an official benchmark result very soon.

Interface

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Write File	❌	✅	tbd	✅
Read File	❌	❌	tbd	❌
Search File	❌	❌	tbd	❌

Code

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Debug Simple Typo With Guidance	❌	❌	tbd	❌
Debug Simple Typo Without Guidance	❌	❌	tbd	❌
Basic Code Generation	❌	✅	tbd	✅
Create Simple Web Server	❌	❌	tbd	❌
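
The Interface and Code tables above can be tallied per agent to get a quick pass/fail/pending summary. The following is a minimal sketch, not part of agbenchmark itself: the names `AGENTS`, `RESULTS`, and `tally` are hypothetical, and the rows are copied by hand from the tables, with `tbd` treated as "not yet run".

```python
from collections import Counter

# Column order as it appears in the Interface and Code tables.
AGENTS = ["Auto-GPT", "gpt-engineer", "mini-agi", "smol-developer"]

# Rows copied by hand from the tables above (hypothetical structure,
# not a format that agbenchmark produces).
RESULTS = {
    "Write File": ["❌", "✅", "tbd", "✅"],
    "Read File": ["❌", "❌", "tbd", "❌"],
    "Search File": ["❌", "❌", "tbd", "❌"],
    "Debug Simple Typo With Guidance": ["❌", "❌", "tbd", "❌"],
    "Debug Simple Typo Without Guidance": ["❌", "❌", "tbd", "❌"],
    "Basic Code Generation": ["❌", "✅", "tbd", "✅"],
    "Create Simple Web Server": ["❌", "❌", "tbd", "❌"],
}

# Map a table mark to a summary bucket; anything else counts as pending.
MARKS = {"✅": "passed", "❌": "failed"}


def tally(results, agents):
    """Count passed, failed, and pending tasks for each agent."""
    summary = {agent: Counter() for agent in agents}
    for marks in results.values():
        for agent, mark in zip(agents, marks):
            summary[agent][MARKS.get(mark, "pending")] += 1
    return summary


if __name__ == "__main__":
    for agent, counts in tally(RESULTS, AGENTS).items():
        print(
            f"{agent}: {counts['passed']} passed, "
            f"{counts['failed']} failed, {counts['pending']} pending"
        )
```

Because the results are still changing, any such tally only reflects the tables as they stand at the time of copying.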

Memory

Task	Auto-GPT
Basic Memory	❌
Remember Multiple Ids	❌
Remember Multiple Ids With Noise	❌
Remember Multiple Phrases With Noise	❌

Languages

JavaScript 68.5%

Python 18.3%

Jupyter Notebook 8.3%

Dart 3.4%

C++ 0.4%

Other 0.8%