Commit Graph

27 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Silen Naihin | 4011cb228f | working bar and radar charts (#221) | 2023-07-31 12:22:38 +01:00 |
| Silen Naihin | 19db3151dd | Feature: Visualize Test Results (#211) | 2023-07-30 23:51:17 +01:00 |
| Silen Naihin | f07e7b60d4 | Advanced LLM Evaluation Implementation (#205); Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co> | 2023-07-29 10:26:19 +01:00 |
| merwanehamadi | 5df710fd35 | Add helicone dynamic headers (#199); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-26 16:03:13 -07:00 |
| Silen Naihin | 66d1fec07e | attempting more logs | 2023-07-26 23:36:45 +01:00 |
| Silen Naihin | d9b3d7da37 | Safety challenges, adaptability challenges, suite same_task (#177) | 2023-07-24 13:57:44 -07:00 |
| merwanehamadi | 7288d4ccc0 | Release 0.0.2 (#186) | 2023-07-23 14:03:21 -07:00 |
| merwanehamadi | 68445ae577 | Change package version (#184) | 2023-07-23 12:51:12 -07:00 |
| Erik Peterson | 5a3b4f3d1d | Kill subprocesses when test ends (#172); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>; Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-20 15:41:59 -07:00 |
| merwanehamadi | d46124a9d8 | Push reports to google drive (#167); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-18 09:17:45 -07:00 |
| merwanehamadi | a9702e4629 | Add basic code generation challenge (#98) | 2023-07-14 13:27:48 -04:00 |
| merwanehamadi | 0799be7e28 | Fix tests ci (#82) | 2023-07-10 21:54:25 -07:00 |
| merwanehamadi | 437e066a66 | Add "Simple web server" challenge (#74); Co-authored-by: Silen Naihin <silen.naihin@gmail.com> | 2023-07-10 20:46:03 -04:00 |
| Silen Naihin | 69bd41f741 | Quality of life improvements & fixes (#75) | 2023-07-08 18:43:38 -07:00 |
| merwanehamadi | 9ede17891b | Add 'Debug simple typo with guidance' challenge (#65); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-07 13:50:53 -07:00 |
| Silen Naihin | bfd0d5c826 | Fix home_path, local mini-agi run works (#64); Co-authored-by: merwanehamadi <merwanehamadi@gmail.com> | 2023-07-06 18:00:45 -07:00 |
| merwanehamadi | 101ffdbce0 | Integrate with gpt engineer (#47) | 2023-07-03 14:53:28 -04:00 |
| merwanehamadi | 838f72097c | Add static linters ci (#45) | 2023-07-02 16:14:49 -04:00 |
| Silen Naihin | f933717d8b | mini-agi, simple challenge creation, --mock flag | 2023-06-27 18:17:54 -04:00 |
| Silen Naihin | a2f79760ce | other was non solution, solution is pytest-depends | 2023-06-27 13:26:28 -04:00 |
| Silen Naihin | 06a6f08054 | finally figured out right way to do dependencies | 2023-06-27 13:26:28 -04:00 |
| Silen Naihin | 2f28a66591 | more elegant marking & dependency solution | 2023-06-27 13:26:28 -04:00 |
| Silen Naihin | 60a7ac2343 | adding dependencies on other challenges | 2023-06-27 13:26:28 -04:00 |
| Silen Naihin | 8c44b9eddf | basic challenges, more ChallengeData structure | 2023-06-27 13:26:28 -04:00 |
| Silen Naihin | 15c5469bb1 | Add automatic regression markers (#38) | 2023-06-22 08:18:22 -04:00 |
| Silen Naihin | b7deb984f7 | start click, fixtures, types, challenge creation, mock run -stable (#37) | 2023-06-21 11:43:18 -04:00 |
| Silen Naihin | 51f2295971 | init agbenchmark | 2023-06-18 11:14:54 -04:00 |