Commit Graph

4404 Commits

Author SHA1 Message Date
SwiftyOS
186508e75c Removed flutter and chrome from setup as not required 2023-09-21 08:06:26 +02:00
hunteraraujo
62efc6b07e Add Firebase Analytics dependency 2023-09-20 20:40:12 -07:00
hunteraraujo
22ea449850 Integrate LeaderboardService into SkillTreeViewModel
This commit integrates the `LeaderboardService` into `SkillTreeViewModel` to enable benchmark report submissions to the leaderboard. A `BenchmarkRun` object is created from the evaluation response and submitted using the `submitReport` method from `LeaderboardService`.
2023-09-20 19:36:25 -07:00
merwanehamadi
ff4c76ba00 Make agbenchmark a proxy of the evaluated agent (#5279)
Make agbenchmark a Proxy of the evaluated agent

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-20 16:06:00 -07:00
hunteraraujo
1a471b73cd Fix _leaderboardBaseUrl 2023-09-20 14:44:29 -07:00
hunteraraujo
7901c750b6 Extend RestApiUtility to Support Leaderboard Base URL
This commit extends the `RestApiUtility` class to include support for a new leaderboard base URL. A new `ApiType` enum value `ApiType.leaderboard` has been added, and the `_getEffectiveBaseUrl` method has been updated to handle this new type. The leaderboard base URL is "https://leaderboard.vercel.app/".
2023-09-20 14:43:45 -07:00
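
The base-URL dispatch described in this commit might look roughly like the sketch below. The committed code is Dart; this is a Python analog for illustration only. The `AGENT` and `BENCHMARK` members and the constructor arguments are hypothetical placeholders, and only the leaderboard URL is taken from the commit message. The next commit in the log (1a471b73cd) fixes `_leaderboardBaseUrl`, so the final value may differ.

```python
from enum import Enum


class ApiType(Enum):
    AGENT = "agent"              # hypothetical pre-existing member
    BENCHMARK = "benchmark"      # hypothetical pre-existing member
    LEADERBOARD = "leaderboard"  # the value this commit series adds


class RestApiUtility:
    # Only this URL comes from the commit message.
    _LEADERBOARD_BASE_URL = "https://leaderboard.vercel.app/"

    def __init__(self, agent_base_url: str, benchmark_base_url: str) -> None:
        # Both arguments are assumptions made for the sake of the sketch.
        self._agent_base_url = agent_base_url
        self._benchmark_base_url = benchmark_base_url

    def get_effective_base_url(self, api_type: ApiType) -> str:
        """Pick the base URL that matches the requested API type."""
        if api_type is ApiType.LEADERBOARD:
            return self._LEADERBOARD_BASE_URL
        if api_type is ApiType.BENCHMARK:
            return self._benchmark_base_url
        return self._agent_base_url
```

Keeping the choice in one method means callers pass an `ApiType` and never assemble host names themselves.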
hunteraraujo
a0512254ca Add LeaderboardService with submitReport Method
This commit adds a new `LeaderboardService` class featuring a `submitReport` method. This method allows for the submission of `BenchmarkRun` objects to the leaderboard via a POST request to the `/api/reports` endpoint. The new service uses the `ApiType.leaderboard` enum value.
2023-09-20 14:38:48 -07:00
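
As a rough illustration of the request that `submitReport` is described as making, the Python sketch below builds (but does not send) a POST to `/api/reports`. The committed code is Dart. The function name, the JSON content type, and the use of a plain dict for the `BenchmarkRun` payload are assumptions. The base URL is the one quoted in commit 7901c750b6.

```python
import json
import urllib.request

# Base URL as quoted in commit 7901c750b6.
LEADERBOARD_BASE_URL = "https://leaderboard.vercel.app/"


def build_submit_report_request(benchmark_run: dict) -> urllib.request.Request:
    """Build, without sending, the POST request for one benchmark run."""
    url = LEADERBOARD_BASE_URL.rstrip("/") + "/api/reports"
    body = json.dumps(benchmark_run).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual submission.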
hunteraraujo
fe96664afb Update ApiType Enum to Include Leaderboard 2023-09-20 14:31:27 -07:00
SwiftyOS
d4222519eb Added instructions about cloning and changing dir 2023-09-20 22:50:39 +02:00
SwiftyOS
88f0b04015 fixed grammar 2023-09-20 22:50:39 +02:00
hunteraraujo
cfc6180233 Add BenchmarkRun Class to Model Complete Benchmark Runs
This commit introduces the `BenchmarkRun` class, designed to model a complete benchmark run. The class encapsulates all data and sub-models related to a benchmark, providing a centralized object to handle various aspects of a benchmark run.

The `BenchmarkRun` class includes the following sub-models:
- `RepositoryInfo`: Information about the repository and team.
- `RunDetails`: Specific details like the run identifier, command, and timings.
- `TaskInfo`: Information about the task being benchmarked.
- `Metrics`: Performance metrics for the benchmark run.
- `Config`: Configuration settings for the benchmark run.

A `reachedCutoff` field is also included to indicate whether a certain cutoff was reached during the benchmark run.

Methods for serializing and deserializing the object to and from JSON are also provided.
2023-09-20 13:24:19 -07:00
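
The composition described above, a top-level object that delegates JSON handling to its sub-models, can be sketched as follows. This is a Python analog of the Dart class, not the committed code. The snake_case JSON keys are assumptions, and four of the five sub-models are left as plain dicts to keep the sketch short. The two `Config` fields are the ones named in commit 39f8ae515b.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Config:
    """One typed sub-model, used here to show the delegation."""
    agent_benchmark_config_path: str
    host: str

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "Config":
        return cls(
            agent_benchmark_config_path=data["agent_benchmark_config_path"],
            host=data["host"],
        )

    def to_json(self) -> Dict[str, Any]:
        return {
            "agent_benchmark_config_path": self.agent_benchmark_config_path,
            "host": self.host,
        }


@dataclass
class BenchmarkRun:
    # Plain dicts are stand-ins; the commit describes typed sub-models.
    repository_info: Dict[str, Any]
    run_details: Dict[str, Any]
    task_info: Dict[str, Any]
    metrics: Dict[str, Any]
    config: Config
    reached_cutoff: bool

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "BenchmarkRun":
        return cls(
            repository_info=data["repository_info"],
            run_details=data["run_details"],
            task_info=data["task_info"],
            metrics=data["metrics"],
            config=Config.from_json(data["config"]),  # delegate to the sub-model
            reached_cutoff=data["reached_cutoff"],
        )

    def to_json(self) -> Dict[str, Any]:
        return {
            "repository_info": self.repository_info,
            "run_details": self.run_details,
            "task_info": self.task_info,
            "metrics": self.metrics,
            "config": self.config.to_json(),
            "reached_cutoff": self.reached_cutoff,
        }
```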
hunteraraujo
311f69b7cf Add RepositoryInfo Class for Benchmark Repository and Team Details
This commit introduces the RepositoryInfo class, designed to encapsulate details about the repository and team associated with a benchmark run.

The class includes the following fields:
- repoUrl: The URL of the repository where the benchmark code resides.
- teamName: The name of the team responsible for the benchmark.
- benchmarkGitCommitSha: The Git commit SHA for the benchmark code.
- agentGitCommitSha: The Git commit SHA for the agent code.

The class supports JSON serialization and deserialization, making it easy to use with Flutter's JSON handling mechanisms.
2023-09-20 13:17:46 -07:00
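
A minimal Python analog of the four-field class described above, assuming the JSON keys are the snake_case forms of the listed field names (the committed Dart code may use different keys):

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class RepositoryInfo:
    repo_url: str                  # repoUrl in the commit message
    team_name: str                 # teamName
    benchmark_git_commit_sha: str  # benchmarkGitCommitSha
    agent_git_commit_sha: str      # agentGitCommitSha

    @classmethod
    def from_json(cls, data: Dict[str, str]) -> "RepositoryInfo":
        # Raises TypeError if a key is missing or an unexpected key is present.
        return cls(**data)

    def to_json(self) -> Dict[str, str]:
        return asdict(self)
```

Because the keys mirror the attribute names, `from_json` and `to_json` are exact inverses for well-formed input.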
hunteraraujo
fc193568b9 Add RunDetails class for encapsulating benchmark run information
Added a new Dart class called `RunDetails` to represent specific details related to a benchmark run.

The class includes fields for:
- The unique run identifier (`runId`)
- The command used to initiate the benchmark (`command`)
- The time the benchmark was completed (`completionTime`)
- The time the benchmark started (`benchmarkStartTime`)
- The name of the test being run (`testName`)

Serialization and deserialization methods are also provided for JSON compatibility.
2023-09-20 13:12:03 -07:00
hunteraraujo
afe77bbc4f Add TaskInfo class with serialization and documentation
Added a new TaskInfo class to encapsulate information related to a specific benchmark task.

- The TaskInfo class holds attributes like the data file path, regression status, task categories, task details, expected answer, and description.
- Included methods for JSON serialization and deserialization.
- Added comprehensive documentation to describe the purpose, properties, and methods of the TaskInfo class.
2023-09-20 13:07:54 -07:00
hunteraraujo
50ef7b31eb Add Metrics class with serialization and documentation
Added a new Metrics class to represent key performance metrics of a benchmark test run.

- The Metrics class encapsulates various data points like difficulty, success rate, attempted status, success percentage, cost, and runtime.
- Included serialization and deserialization methods for converting between Metrics objects and JSON.
- Added comprehensive documentation to describe the purpose, properties, and methods of the Metrics class.
2023-09-20 13:04:47 -07:00
hunteraraujo
39f8ae515b Add Config Class for Benchmark Configuration Management
This commit introduces a new `Config` class, designed to manage and store configuration settings related to the benchmark run. The class contains two key fields:

1. `agentBenchmarkConfigPath`: The path to the agent's benchmark configuration file.
2. `host`: The address of the host where the benchmark is running.

The class includes methods for serialization and deserialization, allowing easy conversion between `Config` objects and JSON maps.

Documentation comments have also been added for better code readability and understanding.
2023-09-20 13:00:22 -07:00
SwiftyOS
c72a35e92e Added blueprint of an agent tutorial 2023-09-20 17:29:14 +02:00
SwiftyOS
7e65df3f39 Changed repos stats to run daily 2023-09-20 16:46:03 +02:00
SwiftyOS
4d629960bb renamed skills -> abilities 2023-09-20 16:45:47 +02:00
SwiftyOS
9c4617eefa Added the getting started tutorial 2023-09-20 16:45:32 +02:00
SwiftyOS
b952d0d2e0 Updated server endpoint message 2023-09-20 16:24:18 +02:00
SwiftyOS
55bcb99e91 Edited the cron to run every 90mins 2023-09-20 13:23:35 +02:00
SwiftyOS
6dcee70eab Added repo stats 2023-09-20 12:53:29 +02:00
SwiftyOS
8fdccfa05a Added outline for memory tutorial 2023-09-20 12:41:04 +02:00
SwiftyOS
4f002d66be Added outline for skills tutorial 2023-09-20 12:40:52 +02:00
SwiftyOS
93be3f54e3 Adding outline of the planning tutorial 2023-09-20 11:48:42 +02:00
SwiftyOS
309a6af359 Added outline of benchmarking tutorial 2023-09-20 11:48:05 +02:00
SwiftyOS
585ba1a1fd Add outline of agent overview tutorial 2023-09-20 11:47:36 +02:00
SwiftyOS
c707cec362 Added outline of tutorial 1 2023-09-20 11:46:55 +02:00
SwiftyOS
edcd103958 Added llm functions 2023-09-20 09:57:10 +02:00
Swifty
8897e47691 Update frontend build (#5270)
Co-authored-by: GitHub Action <action@github.com>
2023-09-20 09:53:00 +02:00
hunteraraujo
377d0af228 Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status (#5269)
* Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status

* Notify UI when updating benchmark status
2023-09-19 23:30:22 -07:00
hunteraraujo
99035103e0 Rename benchmark_service directory to benchmark 2023-09-19 22:16:58 -07:00
hunteraraujo
525571c32e Enhance runBenchmark with TestSuite Tracking (#5268) 2023-09-19 21:31:02 -07:00
hunteraraujo
80682b41cb Add Early Termination to runBenchmark on Benchmark Failure (#5267) 2023-09-19 20:24:52 -07:00
hunteraraujo
a37b486227 Enhance SkillTreeViewModel to Manage Benchmark Status (#5266)
Enhance SkillTreeViewModel to Manage Benchmark Execution and Status
2023-09-19 20:20:31 -07:00
Reinier van der Leer
0ca003d858 AutoGPT: Deprecate MessageHistory 2023-09-20 02:40:35 +02:00
hunteraraujo
f130aa7972 Correct triggerEvaluation endpoint 2023-09-19 17:19:59 -07:00
hunteraraujo
5afab461ee Refactor Benchmarking Workflow and Introduce New Data Models (#5264)
* New benchmark data models

* Update _benchmarkBaseUrl

* Remove ReportRequestBody

* Update benchmark service methods for proxy approach

* Add eval id to SkillNodeData

* Refactor runBenchmark Method for proxy approach
2023-09-19 17:01:15 -07:00
SwiftyOS
2098e192da Removed additional refs to frontend 2023-09-19 15:09:51 +02:00
SwiftyOS
cc7476656f removed frontend command from the cli 2023-09-19 15:08:26 +02:00
SwiftyOS
fa265fdf25 Updated quickstart 2023-09-19 15:02:06 +02:00
SwiftyOS
08db74b8ee Updated the forge readme 2023-09-19 14:53:53 +02:00
SwiftyOS
aa1a65c59c Updated forge to serve the frontend again 2023-09-19 13:24:06 +02:00
Swifty
ccd0eb800b Update frontend build (#5258)
Co-authored-by: GitHub Action <action@github.com>
2023-09-19 13:06:20 +02:00
SwiftyOS
360ce60b83 commented out create PR bit 2023-09-19 13:04:57 +02:00
SwiftyOS
172d256e15 Switched pull request step 2023-09-19 12:57:49 +02:00
SwiftyOS
2c187b66b7 More messing with the action 2023-09-19 12:50:44 +02:00
SwiftyOS
9a94ce31d8 Testing PR creation 2023-09-19 12:44:21 +02:00
SwiftyOS
c7f4bd265d Changed to push to a branch and make a pr 2023-09-19 12:35:04 +02:00