Commit Graph

35 Commits

Author SHA1 Message Date
merwanehamadi
a6c3730ac8 Add timeout that allows teardown (#216)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-29 20:02:41 -07:00
merwanehamadi
31897e7892 Delete reports (#201) 2023-07-27 11:42:24 -07:00
Silen Naihin
0e6be16d07 helicone and llm eval fixes 2023-07-27 14:07:46 +01:00
merwanehamadi
eb57b15380 Add dynamic headers using environment variables (#200)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 21:26:03 -07:00
merwanehamadi
5df710fd35 Add helicone dynamic headers (#199)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 16:03:13 -07:00
Silen Naihin
80506e9a3b report # bug, adding submodule challenges (#193) 2023-07-26 13:53:10 +01:00
Silen Naihin
5e3bbb946f fix suite dependencies (#194) 2023-07-26 01:50:53 +01:00
Silen Naihin
b82277515f hotfix reports (#191) 2023-07-25 19:07:24 +01:00
Silen Naihin
d9b3d7da37 Safety challenges, adaptability challenges, suite same_task (#177) 2023-07-24 13:57:44 -07:00
Silen Naihin
2b3abeff4e Integrate baby-agi (#168)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-21 11:15:42 -07:00
Silen Naihin
12c5d54583 Fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling (#166) 2023-07-17 19:41:58 -07:00
Silen Naihin
dffc1dfd51 internal_info.json dynamic changes (#163) 2023-07-17 09:39:24 -04:00
Silen Naihin
9f3a2d4f05 Dynamic cutoff and other quality of life (#101) 2023-07-15 22:10:20 -04:00
Erik Peterson
cbd2e49d97 Clean up workspace between each test (#109) 2023-07-15 16:23:49 -07:00
merwanehamadi
78df4915cf Remove dependencies if a specific test is asked by the user (#95)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-12 14:35:12 -07:00
Silen Naihin
8d0c5179ed fixing backslashes, adding basic metrics (#89) 2023-07-12 01:37:59 -04:00
Silen Naihin
8df82909b2 Added --test, consolidate files, reports working (#83) 2023-07-10 19:25:19 -07:00
Silen Naihin
3d43117554 Just json, no test files (#77) 2023-07-09 17:27:21 -07:00
merwanehamadi
d89264998d Fix debug code challenge (#76)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
2023-07-08 21:46:37 -04:00
Silen Naihin
69bd41f741 Quality of life improvements & fixes (#75) 2023-07-08 18:43:38 -07:00
Silen Naihin
e56b112aab i/o workspace, adding superagi (#60) 2023-07-08 03:27:31 -04:00
merwanehamadi
74fc969dd6 Add basic memory challenge (#57) 2023-07-05 23:32:28 -04:00
Silen Naihin
bfc7dfdb29 Dynamic workspace path (#56) 2023-07-04 19:06:49 -07:00
merwanehamadi
838f72097c Add static linters ci (#45) 2023-07-02 16:14:49 -04:00
merwanehamadi
2062844fa6 Integrate one challenge to auto gpt (#44) 2023-07-02 10:38:30 -04:00
Silen Naihin
2987d71264 moving run agent to tests & agnostic run working 2023-06-30 10:50:54 -04:00
Silen Naihin
fce421fb33 moving logic to benchmark.py file 2023-06-29 20:51:23 -04:00
Silen Naihin
ac5af73696 trying to get kill process 2023-06-28 21:28:46 -04:00
Silen Naihin
f933717d8b mini-agi, simple challenge creation, --mock flag 2023-06-27 18:17:54 -04:00
Silen Naihin
fa0df12439 mini agi attempt 2023-06-27 13:26:28 -04:00
Silen Naihin
2411c35d0e update regression tests info 2023-06-27 13:26:28 -04:00
Silen Naihin
22458a04e8 file creation from within file before server :) 2023-06-27 13:26:28 -04:00
Silen Naihin
ffd1d15a0e MockManager, mock_func in data.json (#39) 2023-06-23 07:53:57 -04:00
Silen Naihin
15c5469bb1 Add automatic regression markers (#38) 2023-06-22 08:18:22 -04:00
Silen Naihin
b7deb984f7 start click, fixtures, types, challenge creation, mock run -stable (#37) 2023-06-21 11:43:18 -04:00