Silen Naihin f07e7b60d4 Advanced LLM Evaluation Implementation (#205)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
2023-07-29 10:26:19 +01:00

Auto-GPT Benchmarks

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

[Image: Screenshot 2023-07-25 at 10 35 01 AM]

Ranking overall:

Detailed results:

[Image: Screenshot 2023-07-25 at 10 42 15 AM]

Click here to see the results and the raw data!

More agents coming soon!
