diff --git a/README.md b/README.md
index fa06317c..727fefa4 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 A repo built for the purpose of benchmarking the performance of agents far and wide, regardless of how they are set up and how they work
 
 ## Scores:
-Spider chart for each agent coming soon !
+Radio chart for each agent coming soon !
 
 ## Detailed results
 :warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.
@@ -42,7 +42,7 @@ Interface
 | Task | Results |
 |-------------|--------------------|
 | Write File | :white_check_mark: |
-| Read File | :white_check_mark: |
+| Read File | :x: |
 | Search File | :x: |
 
 Code
@@ -58,4 +58,19 @@ Code
 Coming Soon!
 
 ### smol-developer
-Coming Soon!
+Interface
+
+| Task | Results |
+|-------------|--------------------|
+| Write File | :white_check_mark: |
+| Read File | :x: |
+| Search File | :x: |
+
+Code
+
+| Task | Results |
+|-----------------------------------|----------------------|
+| Debug Simple Typo With Guidance | :x: |
+| Debug Simple Typo Without Guidance| :x: |
+| Basic Code Generation | :white_check_mark: |
+| Create Simple Web Server | :x: |
diff --git a/agent/smol-developer b/agent/smol-developer
index aa823392..f4f43955 160000
--- a/agent/smol-developer
+++ b/agent/smol-developer
@@ -1 +1 @@
-Subproject commit aa8233925090c0c9314ceef68397ab37baf17766
+Subproject commit f4f4395511ed6ba59ec09100d6596bf81d68a898