This commit incorporates significant enhancements to the SkillTreeViewModel, introducing the ability to track current benchmark runs and submit results to the leaderboard. A new list, `currentBenchmarkRuns`, is introduced to store each benchmark run object during a specific benchmark session. This list is reset to an empty state when initiating a new benchmark.
Changes made:
- Introduced `currentBenchmarkRuns` to track ongoing benchmark runs, ensuring real-time data availability.
- Enhanced `runBenchmark` method to populate `currentBenchmarkRuns` with benchmark run objects as the benchmark progresses.
- Implemented `submitToLeaderboard` method, accepting parameters `teamName`, `repoUrl`, and `agentGitCommitSha`, and updating each run object with this information. All runs share a common UUID generated at the beginning of the submission process.
These enhancements ensure that benchmark run data is readily available and organized, facilitating a streamlined process for submitting well-structured data to the leaderboard. It fosters a more interactive and informative user experience, offering insights into each benchmark run's progress and outcomes.
- Introduced a new `SkillTreeType` enum to represent different skill tree categories: General, Coding, Data, and Scrape/Synthesize.
- Extended the `SkillTreeType` enum to provide associated string values and JSON file names for each category.
- Refactored the `SkillTreeViewModel` to reload the skill tree data based on the selected category.
- Enhanced `SkillTreeView` by adding a positioned dropdown in the top-left corner to allow users to select and load different skill tree categories dynamically.
This commit refines the `populateSelectedNodeHierarchy` method in the `SkillTreeViewModel` to accurately represent nodes that possess multiple parent nodes, ensuring a comprehensive and non-redundant representation of the entire hierarchy leading back to the root nodes.
Modifications and Features:
- The method now employs recursion to traverse all possible paths back to the root nodes from a selected node, capturing every unique node in the hierarchies.
- A `Set` is utilized to monitor and ensure that each node is only added once to the `_selectedNodeHierarchy` list, eliminating the possibility of duplicates.
- The finalized `_selectedNodeHierarchy` list is constructed such that the root of the tree is the last item in the list, providing a more logical representation of the hierarchy.
These enhancements ensure a more accurate and efficient representation of the skill tree structure, particularly in scenarios where nodes have multiple parents, facilitating better navigation and interaction within the skill tree.
This commit integrates the `LeaderboardService` into `SkillTreeViewModel` to enable benchmark report submissions to the leaderboard. A `BenchmarkRun` object is created from the evaluation response and submitted using the `submitReport` method from `LeaderboardService`.
* New benchmark data models
* Update _benchmarkBaseUrl
* Remove ReportRequestBody
* Update benchmark service methods for proxy approach
* Add eval id to SkillNodeData
* Refactor runBenchmark Method for proxy approach
This commit updates the runBenchmark method in the SkillTreeViewModel class to align with the new report generation flow. The updated method does the following:
1. Checks if a benchmark is already running to prevent overlapping runs.
2. Sets a flag to indicate that the benchmark is running and notifies the UI.
3. Reverses the selected node hierarchy for report generation.
4. Loops through each node in the reversed hierarchy to:
- Generate a unique UUID for each test run.
- Create a ReportRequestBody object.
- Call the generateSingleReport method in the BenchmarkService.
- Update the UI after each single report is generated.
5. After all single reports are generated, it calls the generateCombinedReport method in the BenchmarkService, passing in all the generated UUIDs.
6. Finally, it sets the benchmark running flag to false and notifies the UI.
This change improves the report generation flow and allows for both individual and combined reports.
Added a state variable isBenchmarkRunning in SkillTreeViewModel to track the status of benchmark execution. This state variable is used to conditionally disable specific UI components:
- The "Initiate test suite" button in TaskQueueView is disabled during the benchmark.
- All IconButtons in SideBarView are disabled during the benchmark.
- Node selection in SkillTreeView is disabled during the benchmark.
This ensures that the user cannot interact with these components while a benchmark test is running, thereby improving UX and preventing potential issues.
This commit extends the `SkillTreeViewModel` to include `BenchmarkService` as a dependency. This integration allows for leveraging benchmark-related API calls within the skill tree logic.
Two new methods have been added to `SkillTreeViewModel`:
1. `callGenerateReport`: This method attempts to call the `generateReport` function from the `BenchmarkService`. Currently, it only prints the API response and is incomplete in terms of full functionality.
2. `callPollUpdates`: Similar to `callGenerateReport`, this method aims to call `pollUpdates` from `BenchmarkService` and prints the API response. This is also incomplete and will require further development.
Both methods are preliminary and will require additional features to become fully functional.
This commit substantially upgrades the SkillTreeViewModel by incorporating asynchronous initialization from a JSON asset. Now, both nodes and edges of the skill tree are dynamically generated based on the JSON data. This not only enhances the modularity of the code but also simplifies the process of updating or modifying the skill tree.
Other improvements include:
- Changed node IDs from integers to strings for better flexibility.
- Added a function to get a node by its ID, improving code reusability.
- Introduced error handling for potential issues during JSON parsing or node retrieval.
- Updated the sibling, level, and subtree separation configurations for the graph view layout.
These changes make the skill tree more dynamic and maintainable, setting the stage for future extensions.
* Add TestQueueView to Main Layout
This commit integrates the TestQueueView into the main layout. The layout now conditionally displays the TestQueueView based on whether a node in the SkillTree is selected.
- TestQueueView appears when a SkillTree node is selected.
- Main layout adjusts to accommodate TestQueueView alongside SkillTreeView and ChatView.
- Implemented responsive layout logic to manage the widths of the different views based on the screen width and the state of the SkillTree.
* Extend SkillTreeViewModel to Track Selected Node Hierarchy
This commit enhances the SkillTreeViewModel to maintain a list of nodes that form a hierarchy from the currently selected node to the root. This allows for more interactive and informative views that can leverage this hierarchical data.
- Added a new property `selectedNodeHierarchy` to keep track of the node hierarchy.
- Modified the `toggleNodeSelection` method to populate or clear `selectedNodeHierarchy` based on node selection.
- Introduced a new method `populateSelectedNodeHierarchy` to build the hierarchy from the selected node to the root.
* Extract skill tree view model reset state to method
* Implement UI enhancements for TaskQueueView
This commit introduces several UI improvements to the TaskQueueView:
- Tiles are padded 20 units from both the leading and trailing edges.
- Tiles now have a white background.
- Added a thin black border to the tiles.
- Incorporated a slight corner radius for the tiles.
- Centered the title and subtitle horizontally within the tiles.
- Added a checkmark button with a tooltip at the bottom-right corner for running a suite of tests.
These changes aim to improve the user experience and visual appeal of the TaskQueueView.
* Make MainLayout a consumer of SkillTreeViewModel
The SkillTreeViewModel class serves as the view model for the skill tree and extends Flutter's ChangeNotifier for state management.
Features include:
- Storing and managing the list of SkillTreeNodes and SkillTreeEdges.
- Managing the state of the selected node.
- Initializing the skill tree with predefined nodes and edges.
- Methods for toggling node selection, allowing for only a single node to be selected at any given time.
The view model utilizes the GraphView package for visualization and layout.