Commit Graph

69 Commits

Author SHA1 Message Date
Silen Naihin
f8de706a15 removing data that didnt work 2023-07-31 13:41:45 +01:00
Silen Naihin
19db3151dd Feature: Visualize Test Results (#211) 2023-07-30 23:51:17 +01:00
merwanehamadi
a6c3730ac8 Add timeout that allows teardown (#216)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-29 20:02:41 -07:00
merwanehamadi
52b8d1af07 Add timeout to agbenchmark (#215)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-29 18:36:04 -07:00
Silen Naihin
f07e7b60d4 Advanced LLM Evaluation Implementation (#205)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
2023-07-29 10:26:19 +01:00
merwanehamadi
86f73dab68 Retry push until successful (#208)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-27 21:08:31 -07:00
merwanehamadi
88feef0f2a Benchmark all tests (#204) 2023-07-27 12:53:48 -07:00
merwanehamadi
6098b70408 Use beebot autopackai (#203)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-27 12:21:43 -07:00
Justin Torre
9fc50c25ae added new script to fix dynamic headers (#202)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
2023-07-27 14:35:31 +01:00
Silen Naihin
71e0c598d6 forcing AGENT_NAME to be defined from repo 2023-07-27 14:28:11 +01:00
merwanehamadi
eb57b15380 Add dynamic headers using environment variables (#200)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 21:26:03 -07:00
Silen Naihin
fe4bdd8f97 fixing previous 2023-07-26 23:38:25 +01:00
Silen Naihin
66d1fec07e attempting more logs 2023-07-26 23:36:45 +01:00
Silen Naihin
10c1803caa ci update (#198) 2023-07-26 23:02:38 +01:00
Silen Naihin
b778af156b verbose 2023-07-26 14:07:38 +01:00
Silen Naihin
6d806a7096 poetry install -vvv in ci 2023-07-26 14:04:55 +01:00
Silen Naihin
80506e9a3b report # bug, adding submodule challenges (#193) 2023-07-26 13:53:10 +01:00
merwanehamadi
a1e02f243c Add safety suite (#196)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-25 20:13:01 -07:00
Silen Naihin
bf863f7be2 adding Codium pr-agent 2023-07-25 19:09:08 +01:00
merwanehamadi
787c7c0b3a Add api keys (#190)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-24 20:11:48 -07:00
merwanehamadi
33f9ff86ee Fix helicone MITM (#189) 2023-07-24 18:02:37 -07:00
merwanehamadi
d385cc4941 Uninstall agbenchmark then reinstall (#188)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-24 16:48:45 -07:00
Silen Naihin
d9b3d7da37 Safety challenges, adaptability challenges, suite same_task (#177) 2023-07-24 13:57:44 -07:00
merwanehamadi
549d046dc2 Always send to google drive (#185) 2023-07-23 14:00:57 -07:00
merwanehamadi
fb8e051ec1 Update permission package (#183) 2023-07-23 12:32:23 -07:00
merwanehamadi
6713a3729f Update Helicone mitm to pin to a specific version (#182)
Co-authored-by: Justin Torre <justintorre75@gmail.com>
2023-07-23 12:24:12 -07:00
merwanehamadi
2314c72bd9 Make spreadsheet dynamic based on branch name (#181) 2023-07-23 12:05:45 -07:00
merwanehamadi
6407e258b5 Update publish_package.yml (#180) 2023-07-23 09:19:39 -07:00
merwanehamadi
b8c5c261b8 Publish pypi package (#179) 2023-07-22 08:08:03 -07:00
Silen Naihin
2b3abeff4e Integrate baby-agi (#168)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-21 11:15:42 -07:00
merwanehamadi
5746bfe806 Update submodules (#176) 2023-07-20 16:15:35 -07:00
Erik Peterson
5a3b4f3d1d Kill subprocesses when test ends (#172)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-20 15:41:59 -07:00
merwanehamadi
fd02a74b46 Disable cache (#174) 2023-07-20 13:08:48 -07:00
merwanehamadi
dcdc0c9727 Integrate Beebot (#169) 2023-07-19 13:37:29 -07:00
merwanehamadi
d46124a9d8 Push reports to google drive (#167)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-18 09:17:45 -07:00
merwanehamadi
2d8fa5ca6f Use report location (#165) 2023-07-17 20:15:10 -04:00
merwanehamadi
b904041ea1 Update reports when pushing to master (#162)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-16 15:49:36 -07:00
merwanehamadi
117e8c8dd1 Fix pipes issue (#117) 2023-07-16 08:10:53 -07:00
merwanehamadi
2704bcee5e Allow change location of reports (#115)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-16 07:26:36 -07:00
merwanehamadi
757baba3ff Remove cache true on pr (#111)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-15 18:09:29 -07:00
merwanehamadi
02dce41937 Fix ci (#110) 2023-07-15 18:00:37 -07:00
merwanehamadi
5886d75059 Add three sum challenge (#108)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
2023-07-15 19:52:42 -04:00
merwanehamadi
a9702e4629 Add basic code generation challenge (#98) 2023-07-14 13:27:48 -04:00
merwanehamadi
3a9dfa4c59 Update submodules and upload artifacts (#97)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-13 20:47:55 -07:00
merwanehamadi
48ac1c91cd Remove dependencies cache (#94)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-12 14:30:06 -07:00
merwanehamadi
e292ffebaf Enable cache (#92) 2023-07-11 21:37:49 -07:00
merwanehamadi
504634b4a6 Add custom properties to Helicone (#91) 2023-07-11 20:50:56 -07:00
merwanehamadi
22295350a6 All Agents log to helicone automatically (#85)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: Justin <justintorre75@gmail.com>
2023-07-11 09:57:53 -07:00
merwanehamadi
0799be7e28 Fix tests ci (#82) 2023-07-10 21:54:25 -07:00
merwanehamadi
30ba51593f Add Helicone (#81) 2023-07-10 12:19:12 -04:00