- Fixed `--mock` mode
- Moved the interrupt to the beginning of the step iterator pipeline (from `BuiltinChallenge` to `agent_api_interface.py:run_api_agent`). This ensures that any finish-up code is executed properly after a single step.
- Implemented mock mode in `WebArenaChallenge`
- Fixed `fixture 'i_attempt' not found` error when `--attempts`/`-N` is omitted
- Fixed handling of `python`/`pytest` evals in `BuiltinChallenge`
- Disabled left-over Helicone code (see 056163e)
- Fixed a couple of challenge definitions:
  - WebArena task 107: fixed the spelling of months ("Sepetember", "Octorbor")
  - synthesize/1_basic_content_gen (SynthesizeInfo): removed the empty string from the `should_contain` list
- Added some debug logging in `agent_api_interface.py` and `challenges/builtin.py`
This is the official challenge library for https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks.
The goal of this repo is to make it easy to create challenges for test-driven development with the Auto-GPT-Benchmarks package. It is essentially a library for crafting challenges using a DSL (JSON files, in this case).
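To give a concrete picture of that DSL, here is a minimal sketch of a single challenge definition. The field names follow the built-in challenges as far as I know them, but treat the exact schema as an assumption that may differ between versions:

```json
{
  "name": "WriteFile",
  "category": ["general"],
  "task": "Write the word 'Washington' to a .txt file",
  "dependencies": [],
  "cutoff": 60,
  "ground": {
    "answer": "The word 'Washington', printed to a .txt file named anything",
    "should_contain": ["Washington"],
    "should_not_contain": [],
    "files": [".txt"],
    "eval": { "type": "file" }
  },
  "info": {
    "difficulty": "interface",
    "description": "Tests whether the agent can write a file",
    "side_effects": []
  }
}
```

The `ground` block drives evaluation: `should_contain` lists strings that must appear in the agent's output files, which is exactly the field the 1_basic_content_gen fix above touches.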
An up-to-date dependency graph is available here: https://sapphire-denys-23.tiiny.site/
How to use
Make sure you have the package installed: `pip install agbenchmark`.
If you just want to use the default challenges, you don't need this repo: installing the package gives you access to them.
To add new challenges as you develop, add this repo as a submodule inside your project's `agbenchmark` folder, as shown in the sketch below. Any new challenges you add within the submodule will be registered automatically.
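A minimal sketch of that setup, run from your project root. The submodule destination path is illustrative, and `<url-of-this-repo>` is a placeholder for the actual repository URL:

```shell
# Install the benchmark package (the pip name comes from the instructions above)
pip install agbenchmark

# Vendor this challenge repo as a submodule inside your agbenchmark folder
# so the challenges it contains are registered automatically.
# <url-of-this-repo> and the destination path are placeholders; adjust to your layout.
git submodule add <url-of-this-repo> agbenchmark/challenges
```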