mirror of https://github.com/aljazceru/Auto-GPT.git synced 2026-01-31 11:54:30 +01:00

merwanehamadi 295702867a Ability to run by categories (#5229)

* Ability to run by categories

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>

* always use Path.cwd()

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>

---------

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>

2023-09-15 20:04:12 -07:00

.vscode

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

agbenchmark

Ability to run by categories (#5229)

2023-09-15 20:04:12 -07:00

agbenchmark_config

Support agent protocol in benchmark (#5213)

2023-09-13 18:50:39 -07:00

agent_protocol_client

Ability to run by categories (#5229)

2023-09-15 20:04:12 -07:00

backend

benchmark-fix

2023-09-11 21:37:23 -07:00

frontend

small data changes

2023-09-11 18:20:03 -07:00

notebooks

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

reports

Merge branch 'master' into feat/monitor

2023-09-11 18:21:52 -07:00

.env.example

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

.flake8

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

.gitignore

Ability to run by categories (#5229)

2023-09-15 20:04:12 -07:00

.pre-commit-config.yaml

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

.python-version

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

agents_to_benchmark.json

Name agents like their github repos

2023-09-07 17:25:50 -07:00

LICENSE

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

mypy.ini

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

poetry.lock

add benchmark endpoints mock (#5221)

2023-09-15 08:48:12 -07:00

pyproject.toml

add benchmark endpoints mock (#5221)

2023-09-15 08:48:12 -07:00

README.md

Auto-GPT-20230905085638

2023-09-05 10:10:03 -07:00

run.sh

Fixing benchmarks

2023-09-11 17:41:27 -07:00

server.py

Add back api mode

2023-09-06 22:51:45 -07:00

README.md

Auto-GPT Benchmarks

Built to benchmark the performance of agents, regardless of how they work.

Objectively know how well your agent is performing in categories like code, retrieval, memory, and safety.

Save time and money while doing it through smart dependencies. The best part? It's all automated.
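
As a rough illustration of what "smart dependencies" could mean in practice, the sketch below skips any challenge whose prerequisite did not pass, so no time or API budget is spent on runs that are unlikely to succeed. This is not the benchmark's actual implementation: the function `run_with_dependencies`, the challenge names, and the result labels are all hypothetical, and the sketch assumes prerequisites are listed before the challenges that depend on them.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical challenge table: name -> (prerequisite names, check function).
# None of these names come from the benchmark itself.
Challenge = Tuple[List[str], Callable[[], bool]]


def run_with_dependencies(challenges: Dict[str, Challenge]) -> Dict[str, str]:
    """Run challenges in insertion order, skipping those whose prerequisites did not pass."""
    results: Dict[str, str] = {}
    for name, (deps, check) in challenges.items():
        # A prerequisite that failed, was skipped, or has not run yet blocks this challenge.
        if any(results.get(dep) != "passed" for dep in deps):
            results[name] = "skipped"
            continue
        results[name] = "passed" if check() else "failed"
    return results


# Illustrative challenges only; the lambdas stand in for real agent runs.
challenges: Dict[str, Challenge] = {
    "write_file": ([], lambda: True),
    "read_file": (["write_file"], lambda: False),
    "search_file": (["read_file"], lambda: True),
}

print(run_with_dependencies(challenges))
```

Here `search_file` is never executed because `read_file` failed, which is where the saving comes from.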

Scores:

Ranking overall:

Detailed results:

Click here to see the results and the raw data!

More agents coming soon!