32 Commits

Author SHA1 Message Date
hunteraraujo
ec03170e6e Refactor benchmark data models with placeholder values 2023-09-27 15:16:34 -07:00
hunteraraujo
3d4307a848 Added SkillTreeType enum and implemented dropdown selection in SkillTreeView
- Introduced a new `SkillTreeType` enum to represent different skill tree categories: General, Coding, Data, and Scrape/Synthesize.
- Extended the `SkillTreeType` enum to provide associated string values and JSON file names for each category.
- Refactored the `SkillTreeViewModel` to reload the skill tree data based on the selected category.
- Enhanced `SkillTreeView` by adding a positioned dropdown in the top-left corner to allow users to select and load different skill tree categories dynamically.
2023-09-25 23:08:24 -07:00
hunteraraujo
470cfa6c4e Fix issue with decoding metrics JSON 2023-09-24 22:30:37 -07:00
hunteraraujo
18333fbc7c Temporarily allow null values in benchmark data models 2023-09-22 13:48:03 -07:00
hunteraraujo
fe96664afb Update ApiType Enum to Include Leaderboard 2023-09-20 14:31:27 -07:00
hunteraraujo
cfc6180233 Add BenchmarkRun Class to Model Complete Benchmark Runs
This commit introduces the `BenchmarkRun` class, designed to model a complete benchmark run. The class encapsulates all data and sub-models related to a benchmark, providing a centralized object to handle various aspects of a benchmark run.

The `BenchmarkRun` class includes the following sub-models:
- `RepositoryInfo`: Information about the repository and team.
- `RunDetails`: Specific details like the run identifier, command, and timings.
- `TaskInfo`: Information about the task being benchmarked.
- `Metrics`: Performance metrics for the benchmark run.
- `Config`: Configuration settings for the benchmark run.

A `reachedCutoff` field is also included to indicate whether a certain cutoff was reached during the benchmark run.

Methods for serializing and deserializing the object to and from JSON are also provided.
2023-09-20 13:24:19 -07:00
hunteraraujo
311f69b7cf Add RepositoryInfo Class for Benchmark Repository and Team Details
This commit introduces the RepositoryInfo class, designed to encapsulate details about the repository and team associated with a benchmark run.

The class includes the following fields:
- repoUrl: The URL of the repository where the benchmark code resides.
- teamName: The name of the team responsible for the benchmark.
- benchmarkGitCommitSha: The Git commit SHA for the benchmark code.
- agentGitCommitSha: The Git commit SHA for the agent code.

The class supports JSON serialization and deserialization, making it easy to use with Flutter's JSON handling mechanisms.
2023-09-20 13:17:46 -07:00
hunteraraujo
fc193568b9 Add RunDetails class for encapsulating benchmark run information
Added a new Dart class called `RunDetails` to represent specific details related to a benchmark run.

The class includes fields for:
- The unique run identifier (`runId`)
- The command used to initiate the benchmark (`command`)
- The time the benchmark was completed (`completionTime`)
- The time the benchmark started (`benchmarkStartTime`)
- The name of the test being run (`testName`)

Serialization and deserialization methods are also provided for JSON compatibility.
2023-09-20 13:12:03 -07:00
hunteraraujo
afe77bbc4f Add TaskInfo class with serialization and documentation
Added a new TaskInfo class to encapsulate information related to a specific benchmark task.

- The TaskInfo class holds attributes like the data file path, regression status, task categories, task details, expected answer, and description.
- Included methods for JSON serialization and deserialization.
- Added comprehensive documentation to describe the purpose, properties, and methods of the TaskInfo class.
2023-09-20 13:07:54 -07:00
hunteraraujo
50ef7b31eb Add Metrics class with serialization and documentation
Added a new Metrics class to represent key performance metrics of a benchmark test run.

- The Metrics class encapsulates various data points like difficulty, success rate, attempted status, success percentage, cost, and runtime.
- Included serialization and deserialization methods for converting between Metrics objects and JSON.
- Added comprehensive documentation to describe the purpose, properties, and methods of the Metrics class.
2023-09-20 13:04:47 -07:00
hunteraraujo
39f8ae515b Add Config Class for Benchmark Configuration Management
This commit introduces a new `Config` class, designed to manage and store configuration settings related to the benchmark run. The class contains two key fields:

1. `agentBenchmarkConfigPath`: The path to the agent's benchmark configuration file.
2. `host`: The address of the host where the benchmark is running.

The class includes methods for serialization and deserialization, allowing easy conversion between `Config` objects and JSON maps.

Documentation comments have also been added for better code readability and understanding.
2023-09-20 13:00:22 -07:00
hunteraraujo
377d0af228 Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status (#5269)
* Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status

* Notify UI when updating benchmark status
2023-09-19 23:30:22 -07:00
hunteraraujo
99035103e0 Rename benchmark_service directory to benchmark 2023-09-19 22:16:58 -07:00
hunteraraujo
5afab461ee Refactor Benchmarking Workflow and Introduce New Data Models (#5264)
* New benchmark data models

* Update _benchmarkBaseUrl

* Remove ReportRequestBody

* Update benchmark service methods for proxy approach

* Add eval id to SkillNodeData

* Refactor runBenchmark Method for proxy approach
2023-09-19 17:01:15 -07:00
hunteraraujo
5814c5a365 Change mock property to be required in ReportRequestBody 2023-09-18 19:46:56 -07:00
hunteraraujo
da9fd926c8 Refactor ReportRequestBody for a single test 2023-09-18 17:09:23 -07:00
hunteraraujo
1d735caf40 Add TestSuite Model with Serialization and Deserialization Support
This commit introduces a new class, TestSuite, designed to encapsulate a collection of Task objects under a common timestamp. This will help in grouping tasks that belong to a particular test suite.

Key Features:
- Add a TestSuite class with fields for `timestamp` and a list of `tests` (Task objects).
- Implement `toJson` method for serializing TestSuite objects to JSON-compatible format.
- Implement `fromJson` factory method for deserializing JSON data back into a TestSuite object.

By providing serialization and deserialization support directly in the model, we facilitate easier storage and data exchange for test suites.
2023-09-18 14:41:25 -07:00
hunteraraujo
e446d723ee Extend Task Model to Include Serialization
This commit adds serialization support to the Task model by including a `toJson` method. This will allow easy conversion of Task objects to a JSON-compatible format, facilitating storage or network transmission.
2023-09-18 14:35:34 -07:00
hunteraraujo
7f5c50dfeb Extend ReportRequestBody to Include "mock" Boolean Field
This commit adds a new boolean field, "mock", to the `ReportRequestBody` class. This additional field is in line with the new requirements to specify whether the report is a mock or not.

The `toJson()` method is also updated to include this new field during serialization.
2023-09-16 10:39:33 -07:00
hunteraraujo
595a892f71 Introduce ApiType enum for API selection
This commit adds a new enum, `ApiType`, to allow dynamic selection between different base URLs for API calls. The enum has two values: `agent` and `benchmark`, corresponding to different services.

The `ApiType` enum is designed to be passed as a parameter to the `RestApiUtility` methods, enabling the utility to decide which base URL to use for HTTP requests.
2023-09-14 20:02:36 -07:00
hunteraraujo
8e25cd2391 Add ReportRequestBody model with JSON serialization
This commit introduces a new Dart class, `ReportRequestBody`, which represents the request body for generating reports in the `BenchmarkService`. The class includes a `toJson` method for easy serialization to JSON format.

- `category`: Specifies the category of the report (e.g., "coding").
- `tests`: A list of tests to be included in the report.
2023-09-14 20:01:05 -07:00
hunteraraujo
30934f400a Enhance SkillTreeNode Model to Include Additional Attributes
This commit extends the SkillTreeNode class to incorporate new attributes such as 'data', 'label', and 'shape', making the model more comprehensive. The JSON deserialization is also updated to handle optional or missing fields by providing default values, improving the robustness of the model.
2023-09-13 17:32:23 -07:00
hunteraraujo
774ccc4ed2 Refactor SkillNodeData model for robust JSON deserialization
This commit updates the SkillNodeData class to handle optional or missing JSON fields more robustly. Now, the model provides default values for each field, ensuring that the object can be instantiated successfully even if some JSON fields are missing or set to null.
2023-09-13 17:31:00 -07:00
hunteraraujo
3c35cab55e Enhance Info model to handle optional JSON fields gracefully
This commit updates the Info class to provide default values for optional or missing fields in the JSON payload. This ensures that the model can be successfully instantiated even when some JSON fields are absent or set to null.
2023-09-13 17:30:41 -07:00
hunteraraujo
5e2e7a11c3 Update Ground model to handle optional JSON fields
This commit modifies the Ground class to make it more robust against optional or missing fields in the incoming JSON data. Default values have been added to ensure that the model can be instantiated even if some JSON fields are missing or set to null.
2023-09-13 17:29:05 -07:00
hunteraraujo
a6b791c4f0 Update SkillTreeNode data model for skill tree
The SkillTreeNode model represents a single node in the skill tree.
It includes:
- Node ID
- Node color
2023-09-10 13:58:02 -07:00
hunteraraujo
e16e48f893 Add SkillTreeEdge data model for skill tree
The SkillTreeEdge model represents the relationship between different skill nodes.
It includes:
- Edge ID
- Source node ID
- Destination node ID
- Arrows property to indicate directionality
2023-09-10 13:57:25 -07:00
hunteraraujo
5726613dfb Add SkillNodeData data model for skill tree
The SkillNodeData model aggregates various data related to a skill node.
It includes:
- Node name
- Node category
- Associated task
- Dependencies
- Cutoff value
- Ground object for evaluation details
- Info object for metadata
2023-09-10 13:56:59 -07:00
hunteraraujo
5ed6a08c22 Add Info data model for skill tree
The Info data model holds metadata about a skill node.
It includes:
- The difficulty level of the skill node
- A description of the skill node
- A list of potential side effects related to the skill node
2023-09-10 13:52:02 -07:00
hunteraraujo
e13f7ca757 Add Ground data model for skill tree
The Ground data model stores evaluation information for each skill node.
It includes:
- The answer to be evaluated
- A list of terms that should be contained in the answer
- A list of terms that should not be contained in the answer
- A list of associated files
- A map for additional evaluation criteria
2023-09-10 13:51:36 -07:00
hunteraraujo
a7c37da713 Make input and additionalInput optional in StepRequestBody
Updated the StepRequestBody class to allow both 'input' and 'additionalInput' to be optional. Added logic in toJson() method to return an empty JSON object if both fields are null.
2023-09-06 11:41:23 -07:00
hunteraraujo
ef2d64513b Merge commit 'e5d30a9f6d0854e20049309333c2f637cd03025c' as 'frontend' 2023-09-06 11:22:37 -07:00