Commit Graph

48 Commits

Author SHA1 Message Date
merwanehamadi
afb59a0778 Support agent protocol (#337)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-30 19:44:39 -07:00
SwiftyOS
e1f82d1469 added the ability to run the benchmark back 2023-08-29 16:29:45 +02:00
merwanehamadi
6715b462fd remove warning (#332) 2023-08-28 22:20:07 -07:00
Silen Naihin
59655a8d96 adding backend and a basic ui (#309) 2023-08-27 03:18:30 -04:00
merwanehamadi
6b9a75f786 Only push to gdrive correct timestamps (#318)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-16 16:43:14 -07:00
merwanehamadi
277f3e4e4d Add endpoints to power dev tool (#310) 2023-08-16 09:00:05 -07:00
Silen Naihin
1a61c66898 mock flag, workspace io fixes, mark fixes 2023-08-11 13:22:21 +01:00
Jakub Novák
c2269397f1 Use agent protocol (#278)
Signed-off-by: Jakub Novak <jakub@e2b.dev>
2023-08-11 09:04:08 +02:00
merwanehamadi
1b20e45ec1 Implement the 'explore' mode (#284) 2023-08-09 17:59:48 -07:00
merwanehamadi
6afd962270 Remove baserun because api key issue (#282)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-09 11:24:54 -07:00
merwanehamadi
14e6d4968e Integrate with baserun (#274) 2023-08-08 14:04:43 -07:00
Swifty
e0a72b86c1 AUTO-25: Add the ability to run multiple categories and to skip categories (#270) 2023-08-07 12:29:00 +01:00
Luke
9326ef7826 Feat: --cutoff and "keep_workspace_files" options (#261)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-08-06 21:14:55 -07:00
merwanehamadi
db48e7849b Add product advisor tests (#267) 2023-08-06 20:59:53 -07:00
Silen Naihin
19848f362d remove pytest-depends, rerouting functions (#250) 2023-08-06 22:35:22 +01:00
merwanehamadi
e32713be68 Helicone Lock Manager fix (#263)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-06 11:30:03 -07:00
merwanehamadi
8fa67ea466 Correct agent and benchmark commit sha (#245)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-02 14:44:14 -07:00
merwanehamadi
f41533ce62 Fix reports and add commit sha (#233)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-01 17:54:23 -07:00
Silen Naihin
f9fea473f5 Refactoring for TDD (#222) 2023-07-31 21:59:47 +01:00
Silen Naihin
19db3151dd Feature: Visualize Test Results (#211) 2023-07-30 23:51:17 +01:00
merwanehamadi
6098b70408 Use beebot autopackai (#203)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-27 12:21:43 -07:00
merwanehamadi
31897e7892 Delete reports (#201) 2023-07-27 11:42:24 -07:00
Silen Naihin
71e0c598d6 forcing AGENT_NAME to be defined from repo 2023-07-27 14:28:11 +01:00
Silen Naihin
0e6be16d07 helicone and llm eval fixes 2023-07-27 14:07:46 +01:00
merwanehamadi
5df710fd35 Add helicone dynamic headers (#199)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 16:03:13 -07:00
Silen Naihin
80506e9a3b report # bug, adding submodule challenges (#193) 2023-07-26 13:53:10 +01:00
Silen Naihin
d9b3d7da37 Safety challenges, adaptability challenges, suite same_task (#177) 2023-07-24 13:57:44 -07:00
Silen Naihin
2b3abeff4e Integrate baby-agi (#168)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-21 11:15:42 -07:00
Silen Naihin
ce4cefe7e7 Dynamic home path for runs (#119) 2023-07-16 18:24:06 -07:00
Silen Naihin
9f3a2d4f05 Dynamic cutoff and other quality of life (#101) 2023-07-15 22:10:20 -04:00
merwanehamadi
78df4915cf Remove dependencies if a specific test is asked by the user (#95)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-12 14:35:12 -07:00
Silen Naihin
8df82909b2 Added --test, consolidate files, reports working (#83) 2023-07-10 19:25:19 -07:00
merwanehamadi
573130549f Add gpt engineer to ci (#78) 2023-07-09 13:31:31 -07:00
Silen Naihin
69bd41f741 Quality of life improvements & fixes (#75) 2023-07-08 18:43:38 -07:00
Silen Naihin
bfd0d5c826 Fix home_path, local mini-agi run works (#64)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-06 18:00:45 -07:00
merwanehamadi
7102fe1a18 Rename '--reg' flag to '--maintain' (#58) 2023-07-06 00:03:45 -04:00
merwanehamadi
74fc969dd6 Add basic memory challenge (#57) 2023-07-05 23:32:28 -04:00
Silen Naihin
e25f610344 local runs, home_path config, submodule miniagi (#50) 2023-07-04 10:23:00 -07:00
merwanehamadi
101ffdbce0 Integrate with gpt engineer (#47) 2023-07-03 14:53:28 -04:00
merwanehamadi
838f72097c Add static linters ci (#45) 2023-07-02 16:14:49 -04:00
merwanehamadi
2062844fa6 Integrate one challenge to auto gpt (#44) 2023-07-02 10:38:30 -04:00
Silen Naihin
7c352b745e integrate config, agent_interface just func, hook 2023-06-30 11:55:43 -04:00
Silen Naihin
76ee994d2c read mes, remove port and host from config, etc 2023-06-27 19:19:14 -04:00
Silen Naihin
f933717d8b mini-agi, simple challenge creation, --mock flag 2023-06-27 18:17:54 -04:00
Silen Naihin
a7972ad873 regression test creation 2023-06-27 13:25:47 -04:00
Silen Naihin
84f170c9e0 fixing relative imports 2023-06-26 09:36:13 -04:00
Silen Naihin
15c5469bb1 Add automatic regression markers (#38) 2023-06-22 08:18:22 -04:00
Silen Naihin
b7deb984f7 start click, fixtures, types, challenge creation, mock run -stable (#37) 2023-06-21 11:43:18 -04:00