Added a new Dart class, `RunDetails`, that holds the details of a single benchmark run.
The class includes fields for:
- The unique run identifier (`runId`)
- The command used to initiate the benchmark (`command`)
- The time the benchmark was completed (`completionTime`)
- The time the benchmark started (`benchmarkStartTime`)
- The name of the test being run (`testName`)
Serialization and deserialization methods are also provided for JSON compatibility.
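For illustration, a minimal sketch of what such a model could look like. The field names come from the list above; the field types (in particular `DateTime` for the two timestamps) and the snake_case JSON keys are assumptions, not taken from the commit.

```dart
import 'dart:convert';

/// Sketch of a run-details model. Types and JSON key names are assumed.
class RunDetails {
  final String runId;
  final String command;
  final DateTime completionTime;
  final DateTime benchmarkStartTime;
  final String testName;

  RunDetails({
    required this.runId,
    required this.command,
    required this.completionTime,
    required this.benchmarkStartTime,
    required this.testName,
  });

  /// Builds a RunDetails from a decoded JSON map (keys are hypothetical).
  factory RunDetails.fromJson(Map<String, dynamic> json) => RunDetails(
        runId: json['run_id'] as String,
        command: json['command'] as String,
        completionTime: DateTime.parse(json['completion_time'] as String),
        benchmarkStartTime:
            DateTime.parse(json['benchmark_start_time'] as String),
        testName: json['test_name'] as String,
      );

  /// Converts this object back into a JSON-compatible map.
  Map<String, dynamic> toJson() => {
        'run_id': runId,
        'command': command,
        'completion_time': completionTime.toIso8601String(),
        'benchmark_start_time': benchmarkStartTime.toIso8601String(),
        'test_name': testName,
      };
}

void main() {
  // Example payload with made-up values.
  const raw = '{"run_id":"abc","command":"example-command",'
      '"completion_time":"2023-09-01T12:05:00Z",'
      '"benchmark_start_time":"2023-09-01T12:00:00Z",'
      '"test_name":"ExampleTest"}';
  final details = RunDetails.fromJson(jsonDecode(raw) as Map<String, dynamic>);
  print(jsonEncode(details.toJson()));
}
```

The `TaskInfo`, `Metrics`, and `Config` classes described below presumably follow the same `fromJson`/`toJson` pattern.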
Added a new TaskInfo class to encapsulate information related to a specific benchmark task.
- The TaskInfo class holds attributes like the data file path, regression status, task categories, task details, expected answer, and description.
- Included methods for JSON serialization and deserialization.
- Added comprehensive documentation to describe the purpose, properties, and methods of the TaskInfo class.
Added a new Metrics class to represent key performance metrics of a benchmark test run.
- The Metrics class encapsulates various data points like difficulty, success rate, attempted status, success percentage, cost, and runtime.
- Included serialization and deserialization methods for converting between Metrics objects and JSON.
- Added comprehensive documentation to describe the purpose, properties, and methods of the Metrics class.
This commit introduces a new `Config` class, designed to manage and store configuration settings related to the benchmark run. The class contains two key fields:
1. `agentBenchmarkConfigPath`: The path to the agent's benchmark configuration file.
2. `host`: The address of the host where the benchmark is running.
The class includes methods for serialization and deserialization, allowing easy conversion between `Config` objects and JSON maps.
Documentation comments have also been added for better code readability and understanding.
* New benchmark data models
* Update _benchmarkBaseUrl
* Remove ReportRequestBody
* Update benchmark service methods for proxy approach
* Add eval id to SkillNodeData
* Refactor runBenchmark Method for proxy approach
This commit updates the runBenchmark method in the SkillTreeViewModel class to align with the new report generation flow. The updated method does the following:
1. Checks if a benchmark is already running to prevent overlapping runs.
2. Sets a flag to indicate that the benchmark is running and notifies the UI.
3. Reverses the selected node hierarchy for report generation.
4. Loops through each node in the reversed hierarchy to:
   - Generate a unique UUID for each test run.
   - Create a ReportRequestBody object.
   - Call the generateSingleReport method in the BenchmarkService.
   - Update the UI after each single report is generated.
5. After all single reports are generated, it calls the generateCombinedReport method in the BenchmarkService, passing in all the generated UUIDs.
6. Finally, it sets the benchmark running flag to false and notifies the UI.
With this change, a benchmark run produces one report per node plus a combined report covering all of them.
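The six steps above can be sketched in plain Dart as follows. This is an illustration under stated assumptions, not the actual implementation: `FakeBenchmarkService`, `SkillTreeViewModelSketch`, `newRunId`, and the `ReportRequestBody` fields are hypothetical stand-ins, nodes are reduced to strings, a random hex string replaces a real UUID, and the `try`/`finally` is an addition so the running flag is always reset.

```dart
import 'dart:math';

/// Hypothetical stand-in for the request payload; field names are assumed.
class ReportRequestBody {
  final String test;
  final String testRunId;
  ReportRequestBody({required this.test, required this.testRunId});
}

/// Hypothetical stand-in for BenchmarkService; it records calls instead of
/// performing HTTP requests.
class FakeBenchmarkService {
  final List<String> calls = [];

  Future<void> generateSingleReport(ReportRequestBody body) async {
    calls.add('single:${body.test}');
  }

  Future<void> generateCombinedReport(List<String> testRunIds) async {
    calls.add('combined:${testRunIds.length}');
  }
}

/// Random 32-character hex id used in place of a real UUID (assumption).
String newRunId(Random random) =>
    List.generate(32, (_) => random.nextInt(16).toRadixString(16)).join();

class SkillTreeViewModelSketch {
  final FakeBenchmarkService service;
  final void Function() notifyListeners;
  final Random _random = Random();
  bool isBenchmarkRunning = false;

  SkillTreeViewModelSketch(this.service, this.notifyListeners);

  Future<void> runBenchmark(List<String> selectedNodeHierarchy) async {
    // 1. Prevent overlapping runs.
    if (isBenchmarkRunning) return;
    // 2. Mark the run as started and tell the UI.
    isBenchmarkRunning = true;
    notifyListeners();
    final runIds = <String>[];
    try {
      // 3 and 4. Walk the hierarchy in reverse, one report per node.
      for (final node in selectedNodeHierarchy.reversed) {
        final runId = newRunId(_random);
        runIds.add(runId);
        await service.generateSingleReport(
            ReportRequestBody(test: node, testRunId: runId));
        notifyListeners();
      }
      // 5. Combine all single reports.
      await service.generateCombinedReport(runIds);
    } finally {
      // 6. Clear the flag and tell the UI.
      isBenchmarkRunning = false;
      notifyListeners();
    }
  }
}

Future<void> main() async {
  final service = FakeBenchmarkService();
  final vm = SkillTreeViewModelSketch(service, () {});
  await vm.runBenchmark(['leaf', 'parent', 'root']);
  print(service.calls); // [single:root, single:parent, single:leaf, combined:3]
}
```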
This commit introduces two major updates to the BenchmarkService class:
1. Renamed the `generateReport` method to `generateSingleReport` for better clarity and specificity.
2. Added a new method called `generateCombinedReport` that takes a list of test run IDs and generates a combined report by posting to the `/reports/query` endpoint.
The rename makes the single-report path explicit, and the new method extends the service to handle combined reports.
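A rough sketch of how `generateCombinedReport` might build its request. Only the method name and the `/reports/query` endpoint come from the commit; the `test_run_ids` body key, the `PostJson` transport callback, the `BenchmarkServiceSketch` class, and the base URL are assumptions made so the example runs without a network or an HTTP package.

```dart
import 'dart:convert';

/// Hypothetical transport: posts a JSON string and returns the response body.
typedef PostJson = Future<String> Function(Uri url, String jsonBody);

class BenchmarkServiceSketch {
  final String baseUrl;
  final PostJson post;

  BenchmarkServiceSketch(this.baseUrl, this.post);

  /// Posts the collected run ids to `/reports/query` and decodes the reply.
  /// The `test_run_ids` key is an assumed name for the request field.
  Future<Map<String, dynamic>> generateCombinedReport(
      List<String> testRunIds) async {
    final url = Uri.parse('$baseUrl/reports/query');
    final body = jsonEncode({'test_run_ids': testRunIds});
    final response = await post(url, body);
    return jsonDecode(response) as Map<String, dynamic>;
  }
}

Future<void> main() async {
  // Fake transport that echoes the request back, for demonstration only.
  Future<String> echo(Uri url, String jsonBody) async =>
      jsonEncode({'path': url.path, 'request': jsonDecode(jsonBody)});

  final service = BenchmarkServiceSketch('http://localhost:8000', echo);
  final report = await service.generateCombinedReport(['id-1', 'id-2']);
  print(report);
}
```

Injecting the transport keeps the sketch testable; the real service presumably calls an HTTP client directly.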
This commit introduces substantial improvements to the TaskView class to accommodate both tasks and test suites in a unified view. It also integrates the TestSuiteDetailView to display test suite details when a test suite is selected.
Key Enhancements:
1. Modified the `initState` method to call `fetchAndCombineData()` from TaskViewModel, thereby populating the combined data source.
2. Replaced the ListView that was rendering tasks with a ListView that can render both tasks and test suites.
3. Introduced conditional rendering for TestSuiteDetailView when a test suite is selected.
4. Updated onTap actions to select and deselect tasks and test suites appropriately.
5. Moved to using a Stack layout to allow overlay of TestSuiteDetailView on top of the existing layout.
This refactor enhances the TaskView's capabilities to manage and display both tasks and test suites, offering a more integrated user experience.
This commit significantly expands the functionalities of TaskViewModel to manage both tasks and test suites in a unified manner. The view model now serves as the primary business logic class that interacts with the UI for task and test suite management.
Key Enhancements:
- Introduced `_testSuites` list to store TestSuite objects.
- Added `combinedDataSource` to hold both tasks and test suites.
- Introduced `selectTestSuite` and `deselectTestSuite` methods for TestSuite selection management.
- Added methods to add, fetch, and persist test suites (`addTestSuite`, `fetchTestSuites`, `_saveTestSuitesToPrefs`).
- Created `fetchAndCombineData` method to fetch and combine tasks and test suites into a single list, `combinedDataSource`.
With this update, tasks and test suites are managed through a single view model and exposed to the UI as one combined list.
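As a simplified sketch of the combined data source and selection methods: the `Task` and `TestSuite` shapes, the in-memory lists standing in for persisted storage, the `selectedTestSuite` field, and the ordering (tasks first, then suites) are all assumptions. The real view model also notifies listeners and reads from preferences, which is omitted here.

```dart
/// Hypothetical minimal models; the real classes carry more fields.
class Task {
  final String id;
  final String title;
  Task(this.id, this.title);
}

class TestSuite {
  final String name;
  final List<Task> tests;
  TestSuite(this.name, this.tests);
}

class TaskViewModelSketch {
  final List<Task> _tasks;
  final List<TestSuite> _testSuites;

  /// Holds both Task and TestSuite items for a single list view.
  List<Object> combinedDataSource = [];
  TestSuite? selectedTestSuite;

  TaskViewModelSketch(this._tasks, this._testSuites);

  /// Merges tasks and test suites into one list (ordering is assumed).
  void fetchAndCombineData() {
    combinedDataSource = [..._tasks, ..._testSuites];
  }

  void selectTestSuite(TestSuite suite) => selectedTestSuite = suite;

  void deselectTestSuite() => selectedTestSuite = null;
}

/// Shows how a list builder could branch on the item type.
String labelFor(Object item) {
  if (item is Task) return 'Task: ${item.title}';
  if (item is TestSuite) return 'Suite: ${item.name} (${item.tests.length})';
  return 'Unknown item';
}

void main() {
  final suite = TestSuite('Suite A', [Task('t2', 'Inner task')]);
  final vm = TaskViewModelSketch([Task('t1', 'Standalone task')], [suite]);
  vm.fetchAndCombineData();
  vm.combinedDataSource.map(labelFor).forEach(print);
  vm.selectTestSuite(suite);
  print(vm.selectedTestSuite?.name);
}
```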
This commit introduces a new StatefulWidget, TestSuiteDetailView, to offer a dedicated view for managing and interacting with individual Test Suites.
Key Features:
- Created a TestSuiteDetailView class that takes a TestSuite object and a TaskViewModel as parameters.
- Added an AppBar with a back button for easy navigation.
- Utilized ListView.builder to display a list of tasks that belong to the selected Test Suite.
- Integrated with existing TaskViewModel to select and delete tasks within the Test Suite.
- Included a Provider for the ChatViewModel to update the current task ID when a task is selected.
This new view gives users a focused interface for managing the tasks within an individual Test Suite, which makes them easier to organize and navigate.