Commit Graph

1 Commits

Author SHA1 Message Date
Reinier van der Leer
488f40a20f feat(benchmark): JungleGym WebArena (#6691)
* feat(benchmark): Add JungleGym WebArena challenges
   - Add `WebArenaChallenge`, `WebArenaChallengeSpec`, and other logic to make these challenges work
   - Add WebArena challenges to Pytest collection endpoint generate_test.py

* feat(benchmark/webarena): Add hand-picked selection of WebArena challenges
2024-01-19 20:34:04 +01:00