Commit Graph

308 Commits

Author SHA1 Message Date
merwanehamadi
d5afbbee26 Add challenge name and level to pytest logs (#4661) 2023-06-12 08:03:14 -07:00
Reinier van der Leer
a9d177eeeb Remove unused function split_file from file_operations.py (#4658) 2023-06-12 02:20:39 +02:00
merwanehamadi
2ce6ae6707 Change memory challenge expectations (#4657) 2023-06-11 14:34:02 -07:00
Erik Peterson
fd04db12fa Use prompt_toolkit to enable keyboard navigation in CLI (#4649)
* Use prompt_toolkit to enable keyboard navigation in CLI

* Also update other tests

---------

Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-11 14:19:42 -07:00
Kinance
bc5dbb6692 Implement Batch Summarization in MessageHistory Class to manage context length under model's token limit (#4652)
* Implement Batch Running Summarization to avoid max token error

* Rename test func
2023-06-11 13:04:41 -07:00
merwanehamadi
9150f32f8b Fix benchmark logs (#4653) 2023-06-11 15:34:57 +01:00
merwanehamadi
6fb9b6d03b Retry regression tests (#4648) 2023-06-11 15:21:26 +01:00
Auto-GPT-Bot
4e621280bb Update cassette submodule 2023-06-10 22:51:23 +00:00
Erik Peterson
0594ba33a2 Pass agent to commands instead of config (#4645)
* Add config as attribute to Agent, rename old config to ai_config

* Code review: Pass ai_config

* Pass agent to commands instead of config

* Lint

* Fix merge error

* Fix memory challenge a

---------

Co-authored-by: Nicholas Tindle <nick@ntindle.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-10 15:48:50 -07:00
merwanehamadi
097ce08908 Create benchmarks.yml (#4647) 2023-06-10 15:11:24 -07:00
Erik Peterson
6b9e3b21d3 Add config as attribute to Agent, rename old config to ai_config (#4638)
* Add config as attribute to Agent, rename old config to ai_config

* Code review: Pass ai_config

---------

Co-authored-by: Nicholas Tindle <nick@ntindle.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-10 14:47:26 -07:00
Erik Peterson
15c6b0c1c3 Implement directory-based plugin system (#4548)
* Implement directory-based plugin system

* Fix Selenium test

---------

Co-authored-by: Nicholas Tindle <nick@ntindle.com>
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-06-10 13:16:00 -07:00
merwanehamadi
3c51ff501f dcrement memory challenge c (#4639) 2023-06-09 20:46:06 -07:00
Auto-GPT-Bot
f5a447308d Update cassette submodule 2023-06-10 00:29:33 +00:00
Auto-GPT-Bot
3f2547295f Update cassette submodule 2023-06-09 22:31:11 +00:00
Erik Peterson
5fe600af9d Clean up and fix issues with env configuration and .env.template (#4630)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-09 15:28:30 -07:00
merwanehamadi
3081f56ecb Quicker logs in pytest (#4486) 2023-06-09 15:18:56 -07:00
Auto-GPT-Bot
923c67e92a Update cassette submodule 2023-06-09 22:06:01 +00:00
javableu
474a9c4d95 False believes challenge based on sally anne test. (#4167)
* False believes challenge based on sally anne test.

* Update test_memory_challenge_d.py

* Update challenge_d.md

Some text appearing in bold

* Update test_memory_challenge_d.py

* Update test_memory_challenge_d.py

* Update test_memory_challenge_d.py

* Update test_memory_challenge_d.py

black  test_memory_challenge_d.py

* Update test_memory_challenge_d.py

replaced the dynamic time depending of the level to a fix time

* Update test_memory_challenge_d.py

isort command for the libraries

* Refactored memory challenge a

---------

Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-06-09 15:02:41 -07:00
Erik Peterson
ff4e53d0e6 Streamline / clarify shell command control configuration (#4628)
* Streamline / clarify shell command control configuration

* Fix lint
2023-06-09 11:48:20 -07:00
Erik Peterson
cce50bef50 Fix issues with information retrieval challenge a (#4622)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-09 11:02:52 -07:00
Auto-GPT-Bot
8215039785 Update cassette submodule 2023-06-09 17:46:06 +00:00
Erik Peterson
94280b2d14 Add command for directly executing python code (#4581)
* Add command for directly executing python code

* Fix docstring

* Clarify / update filename references

---------

Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-06-09 10:40:32 -07:00
merwanehamadi
3d06b2e4c0 Decrement information retrieval challenge a (#4637) 2023-06-09 10:02:03 -07:00
merwanehamadi
12ed5a957b Fix debug code challenge (#4632) 2023-06-09 08:40:06 -07:00
merwanehamadi
3b0d49a3e0 Make test write file hard (#4481) 2023-06-09 07:50:57 -07:00
merwanehamadi
bd2e26a20f Inform users that challenges can be flaky (#4616)
* Inform users that challenges can be flaky

* Update challenge_decorator.py
2023-06-09 07:43:56 -07:00
Stefan Ayala
1e851ba3ea Feat set token limits based on model (#4498)
* feat: set max token limits for better user experience

* fix: use OPEN_AI_CHAT_MODELS max limits

* fix: use the old default of 8000

* fix: formatting so isort/black checks pass

* fix: avoid circular dependencies

* fix: use better to avoid circular imports

* feat: introduce soft limits and use them

* fix: circular import issue and missing field

* fix: move import to avoid overriding doc comment

* feat: DRY things up and set token limit for fast llm models too

* tests: make linter tests happy

* test: use the max token limits in config.py test

* fix: remove fast token limit from config

* feat: remove smart token limit from config

* fix: remove unused soft_token_limit var

* fix: remove unneeded tests, settings aren't in config anymore

---------

Co-authored-by: k-boikov <64261260+k-boikov@users.noreply.github.com>
Co-authored-by: Reinier van der Leer <github@pwuts.nl>
2023-06-07 10:16:53 +02:00
Erik Peterson
fdc6e12945 Improve logic and error messages for file reading and writing with Python code (#4567)
* Fix issues with file reading and writing with Python code

* Change error message, use Workspace.get_path

---------

Co-authored-by: Reinier van der Leer <github@pwuts.nl>
2023-06-07 02:54:02 +02:00
Auto-GPT-Bot
835decc6c1 Update challenge scores 2023-06-06 21:48:57 +00:00
merwanehamadi
53efa8f6bf Update cassette submodule & fix current_score.json generation (#4601)
* Update cassette submodule

* add a new line when building current_score.json
2023-06-06 23:46:41 +02:00
Erik Peterson
055806e124 Fix inverted logic for deny_command (#4563) 2023-06-06 22:56:17 +02:00
Reinier van der Leer
dafbd11686 Rearrange tests & fix CI (#4596)
* Rearrange tests into unit/integration/challenge categories

* Fix linting + `tests.challenges` imports

* Fix obscured duplicate test in test_url_validation.py

* Move VCR conftest to tests.vcr

* Specify tests to run & their order (unit -> integration -> challenges) in CI

* Fail Docker CI when tests fail

* Fix import & linting errors in tests

* Fix `get_text_summary`

* Fix linting errors

* Clean up pytest args in CI

* Remove bogus tests from GoCodeo
2023-06-06 10:48:49 -07:00
Auto-GPT-Bot
8a881f70a3 Update cassette submodule 2023-06-06 14:35:43 +00:00
merwanehamadi
ee6b97ef5e Fix Python CI "update cassettes" step (#4591)
* Fix updated cassettes step

* Clarifications

* Use github.ref_name instead of github.ref

* Fix duplicate runs on `master`

---------

Co-authored-by: Reinier van der Leer <github@pwuts.nl>
2023-06-06 16:27:08 +02:00
Auto-GPT-Bot
60ac0c4da1 Update challenge scores 2023-06-05 14:39:45 +00:00
Reinier van der Leer
576f24e3ae Merge branch 'master' into release-0.4.0 2023-06-05 16:08:35 +02:00
Benny van der Lans
74e8a886e6 Add replace_in_file command (#4565)
Resubmission of #3643

---------

Co-authored-by: Reinier van der Leer <github@pwuts.nl>
2023-06-04 19:37:35 +02:00
merwanehamadi
59d31b021d Skip flaky challenges (#4573) 2023-06-04 18:20:13 +02:00
merwanehamadi
af28510aba Fix test_web_selenium (#4554) 2023-06-04 16:38:32 +02:00
Merwane Hamadi
02846fcf91 remove information retrieval challenge b from beaten challenges 2023-06-03 21:24:36 -07:00
Auto-GPT-Bot
378126822f Update submodule reference 2023-06-03 14:51:55 +00:00
Auto-GPT-Bot
55a8e242b0 Update current score 2023-06-03 14:51:53 +00:00
merwanehamadi
79ba85a22e Cache Python Packages in the CI pipeline (#4488) 2023-06-03 15:48:32 +01:00
Auto-GPT-Bot
64973bfe12 Update submodule reference 2023-05-30 23:33:40 +00:00
Auto-GPT-Bot
41df0204f3 Update current score 2023-05-30 23:33:38 +00:00
Douglas Schonholtz
f6ee61d607 create debug challenge (#4286)
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: symphony <john.tian31@gmail.com>
2023-05-30 16:28:32 -07:00
merwanehamadi
87776b2886 Make the information retrieval challenge a harder while still passing (#4468) 2023-05-30 15:56:58 -07:00
Auto-GPT-Bot
387f65c16c Update submodule reference 2023-05-30 19:15:33 +00:00
Auto-GPT-Bot
4c25fabec9 Update current score 2023-05-30 19:15:30 +00:00