Commit Graph

29 Commits

Author SHA1 Message Date
hunteraraujo
a7e27d1a64 Remove duplicate functionality from SkillTreeViewModel 2023-10-08 22:33:21 -07:00
hunteraraujo
b2d53d8d18 Introduce TestOption Enum for Enhanced Test Selection Clarity (#5586) 2023-10-06 16:13:33 -07:00
hunteraraujo
45c15fc8c6 Enhanced Test Execution Flexibility (#5521)
Refactor UI and Logic for Task Queue and Test Suite Button
2023-10-03 22:34:22 -07:00
hunteraraujo
d91236deda Fix bug where benchmarks are not showing chat convo 2023-09-28 23:59:14 -07:00
hunteraraujo
41f0b472c0 Remove dark mode toggle until implementation is completed 2023-09-28 00:00:07 -07:00
hunteraraujo
ee55b85945 Fix bug where we are not passing benchmark input 2023-09-27 23:43:44 -07:00
hunteraraujo
39e4b7e03f Fix bug where failed benchmark is not being added to test suite 2023-09-27 17:26:19 -07:00
hunteraraujo
122c996714 Add helpful print statements and fix isURL validation 2023-09-27 15:46:05 -07:00
hunteraraujo
e4d84dad0a Enhance SkillTreeViewModel with Benchmark Runs Tracking and Leaderboard Submission
This commit incorporates significant enhancements to the SkillTreeViewModel, introducing the ability to track current benchmark runs and submit results to the leaderboard. A new list, `currentBenchmarkRuns`, is introduced to store each benchmark run object during a specific benchmark session. This list is reset to an empty state when initiating a new benchmark.

Changes made:
- Introduced `currentBenchmarkRuns` to track ongoing benchmark runs, ensuring real-time data availability.
- Enhanced `runBenchmark` method to populate `currentBenchmarkRuns` with benchmark run objects as the benchmark progresses.
- Implemented `submitToLeaderboard` method, accepting parameters `teamName`, `repoUrl`, and `agentGitCommitSha`, and updating each run object with this information. All runs share a common UUID generated at the beginning of the submission process.

These changes keep benchmark run data available and organized, so that well-structured results can be submitted to the leaderboard and each benchmark run's progress and outcome can be inspected.
2023-09-27 15:14:48 -07:00
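The submission step described in e4d84dad0a can be sketched roughly as follows. This is a minimal Python illustration, not the Dart code from the commit (which this log does not show). It assumes each run is a plain dictionary, and the `submission_group_id` field name is hypothetical. Only `currentBenchmarkRuns`, `submitToLeaderboard`, `teamName`, `repoUrl`, and `agentGitCommitSha` come from the commit message.

```python
import uuid


def submit_to_leaderboard(current_benchmark_runs, team_name, repo_url, agent_git_commit_sha):
    """Stamp every run of the current session with the submission metadata."""
    # One UUID is generated up front and shared by all runs in this submission.
    submission_id = str(uuid.uuid4())
    for run in current_benchmark_runs:
        run["team_name"] = team_name
        run["repo_url"] = repo_url
        run["agent_git_commit_sha"] = agent_git_commit_sha
        run["submission_group_id"] = submission_id  # hypothetical field name
    return submission_id
```

Generating the UUID once, before the loop, is what lets the leaderboard group all runs from one session together.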
hunteraraujo
d056793309 Add placeholder submitToLeaderboard 2023-09-26 16:13:38 -07:00
hunteraraujo
3d4307a848 Added SkillTreeType enum and implemented dropdown selection in SkillTreeView
- Introduced a new `SkillTreeType` enum to represent different skill tree categories: General, Coding, Data, and Scrape/Synthesize.
- Extended the `SkillTreeType` enum to provide associated string values and JSON file names for each category.
- Refactored the `SkillTreeViewModel` to reload the skill tree data based on the selected category.
- Enhanced `SkillTreeView` by adding a positioned dropdown in the top-left corner to allow users to select and load different skill tree categories dynamically.
2023-09-25 23:08:24 -07:00
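An enum that carries both a display string and a JSON file name, as 3d4307a848 describes, might look like the Python sketch below. The four categories come from the commit message. The file names and the `label` / `json_file_name` attribute names are hypothetical placeholders, since the log does not show the real values.

```python
from enum import Enum


class SkillTreeType(Enum):
    # Each member holds (display label, JSON file name). File names are hypothetical.
    GENERAL = ("General", "general.json")
    CODING = ("Coding", "coding.json")
    DATA = ("Data", "data.json")
    SCRAPE_SYNTHESIZE = ("Scrape/Synthesize", "scrape_synthesize.json")

    def __init__(self, label, json_file_name):
        # Enum unpacks each member's tuple value into these arguments.
        self.label = label
        self.json_file_name = json_file_name
```

A dropdown can then iterate over `SkillTreeType` to build its entries from `label`, and the view model can reload from `json_file_name` when the selection changes.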
hunteraraujo
ecc8d9430c Enhance Hierarchy Population to Support Nodes with Multiple Parents
This commit refines the `populateSelectedNodeHierarchy` method in the `SkillTreeViewModel` to accurately represent nodes that possess multiple parent nodes, ensuring a comprehensive and non-redundant representation of the entire hierarchy leading back to the root nodes.

Modifications and Features:
- The method now employs recursion to traverse all possible paths back to the root nodes from a selected node, capturing every unique node in the hierarchies.
- A `Set` is utilized to monitor and ensure that each node is only added once to the `_selectedNodeHierarchy` list, eliminating the possibility of duplicates.
- The finalized `_selectedNodeHierarchy` list is constructed such that the root of the tree is the last item in the list, providing a more logical representation of the hierarchy.

These enhancements ensure a more accurate and efficient representation of the skill tree structure, particularly in scenarios where nodes have multiple parents, facilitating better navigation and interaction within the skill tree.
2023-09-22 17:26:47 -07:00
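The traversal described in ecc8d9430c can be sketched as below. This is a Python illustration under stated assumptions, not the Dart method itself: the `parents` mapping (node id to list of parent ids) is a hypothetical stand-in for the view model's edge list, and reversing a post-order walk is just one way to guarantee that the root ends up last.

```python
def populate_selected_node_hierarchy(selected_id, parents):
    """Return every unique ancestor of selected_id, selected node first, root last.

    `parents` maps a node id to the ids of its parent nodes (hypothetical shape).
    """
    ordered = []
    seen = set()  # guards against adding a node reached via several paths twice

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for parent_id in parents.get(node_id, []):
            visit(parent_id)
        # Post-order: a node is appended only after all of its ancestors.
        ordered.append(node_id)

    visit(selected_id)
    # Reversing puts each node before its ancestors, so the root comes last.
    ordered.reverse()
    return ordered
```

For a diamond-shaped tree where `D` has parents `B` and `C`, and both have parent `A`, the shared ancestor `A` appears exactly once, at the end of the list.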
hunteraraujo
45819e68d0 Use SugiyamaAlgorithm instead of BuchheimWalkerAlgorithm for skill tree 2023-09-22 16:20:04 -07:00
hunteraraujo
22ea449850 Integrate LeaderboardService into SkillTreeViewModel
This commit integrates the `LeaderboardService` into `SkillTreeViewModel` to enable benchmark report submissions to the leaderboard. A `BenchmarkRun` object is created from the evaluation response and submitted using the `submitReport` method from `LeaderboardService`.
2023-09-20 19:36:25 -07:00
hunteraraujo
377d0af228 Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status (#5269)
* Refactor SkillTreeViewModel and Update TaskQueueView UI for Task Status

* Notify UI when updating benchmark status
2023-09-19 23:30:22 -07:00
hunteraraujo
99035103e0 Rename benchmark_service directory to benchmark 2023-09-19 22:16:58 -07:00
hunteraraujo
525571c32e Enhance runBenchmark with TestSuite Tracking (#5268) 2023-09-19 21:31:02 -07:00
hunteraraujo
80682b41cb Add Early Termination to runBenchmark on Benchmark Failure (#5267) 2023-09-19 20:24:52 -07:00
hunteraraujo
a37b486227 Enhance SkillTreeViewModel to Manage Benchmark Status (#5266)
Enhance SkillTreeViewModel to Manage Benchmark Execution and Status
2023-09-19 20:20:31 -07:00
hunteraraujo
5afab461ee Refactor Benchmarking Workflow and Introduce New Data Models (#5264)
* New benchmark data models

* Update _benchmarkBaseUrl

* Remove ReportRequestBody

* Update benchmark service methods for proxy approach

* Add eval id to SkillNodeData

* Refactor runBenchmark Method for proxy approach
2023-09-19 17:01:15 -07:00
hunteraraujo
bf03dd8739 Refactor runBenchmark in SkillTreeViewModel for New Report Generation Flow
This commit updates the runBenchmark method in the SkillTreeViewModel class to align with the new report generation flow. The updated method does the following:

1. Checks if a benchmark is already running to prevent overlapping runs.
2. Sets a flag to indicate that the benchmark is running and notifies the UI.
3. Reverses the selected node hierarchy for report generation.
4. Loops through each node in the reversed hierarchy to:
  - Generate a unique UUID for each test run.
  - Create a ReportRequestBody object.
  - Call the generateSingleReport method in the BenchmarkService.
  - Update the UI after each single report is generated.

5. After all single reports are generated, it calls the generateCombinedReport method in the BenchmarkService, passing in all the generated UUIDs.

6. Finally, it sets the benchmark running flag to false and notifies the UI.

This change improves the report generation flow and allows for both individual and combined reports.
2023-09-18 19:55:01 -07:00
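The six steps listed in bf03dd8739 can be sketched as follows. This is a Python illustration, not the Dart implementation. The service object, the dictionary standing in for `ReportRequestBody`, and the `try`/`finally` cleanup are assumptions added for the example; the commit message only names `runBenchmark`, `generateSingleReport`, and `generateCombinedReport`.

```python
import uuid


class SkillTreeViewModel:
    def __init__(self, benchmark_service, selected_node_hierarchy, notify_listeners=lambda: None):
        self.benchmark_service = benchmark_service  # hypothetical service object
        self.selected_node_hierarchy = selected_node_hierarchy
        self.notify_listeners = notify_listeners
        self.is_benchmark_running = False

    def run_benchmark(self):
        # 1. Refuse to start while another benchmark is in progress.
        if self.is_benchmark_running:
            return None
        # 2. Flag the run and notify the UI.
        self.is_benchmark_running = True
        self.notify_listeners()
        try:
            run_ids = []
            # 3-4. Walk the hierarchy in reverse, generating one report per node.
            for node in reversed(self.selected_node_hierarchy):
                run_id = str(uuid.uuid4())
                request_body = {"test": node, "test_run_id": run_id}  # stands in for ReportRequestBody
                self.benchmark_service.generate_single_report(request_body)
                run_ids.append(run_id)
                self.notify_listeners()
            # 5. Combine all single reports by their UUIDs.
            return self.benchmark_service.generate_combined_report(run_ids)
        finally:
            # 6. Clear the flag and notify the UI, even if a report call failed.
            self.is_benchmark_running = False
            self.notify_listeners()
```

Because the stored hierarchy ends with the root, iterating in reverse runs the root's test first and the selected node's test last.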
hunteraraujo
4463f75756 Fix issue where side bar view is not disabled 2023-09-16 22:19:42 -07:00
hunteraraujo
60ae12dfd5 Implement UI Disable Feature During Benchmark Run
Added a state variable isBenchmarkRunning in SkillTreeViewModel to track the status of benchmark execution. This state variable is used to conditionally disable specific UI components:

- The "Initiate test suite" button in TaskQueueView is disabled during the benchmark.
- All IconButtons in SideBarView are disabled during the benchmark.
- Node selection in SkillTreeView is disabled during the benchmark.

This ensures that the user cannot interact with these components while a benchmark test is running, thereby improving UX and preventing potential issues.
2023-09-16 19:24:54 -07:00
hunteraraujo
6b921b5eda Refactor test suite button + rename method to runBenchmark 2023-09-16 18:56:42 -07:00
hunteraraujo
a97e0dbe62 Integrate BenchmarkService into SkillTreeViewModel with Incomplete Methods
This commit extends the `SkillTreeViewModel` to include `BenchmarkService` as a dependency. This integration allows for leveraging benchmark-related API calls within the skill tree logic.

Two new methods have been added to `SkillTreeViewModel`:

1. `callGenerateReport`: This method attempts to call the `generateReport` function from the `BenchmarkService`. Currently, it only prints the API response and is incomplete in terms of full functionality.

2. `callPollUpdates`: Similar to `callGenerateReport`, this method aims to call `pollUpdates` from `BenchmarkService` and prints the API response. This is also incomplete and will require further development.

Both methods are preliminary and will require additional features to become fully functional.
2023-09-14 20:07:18 -07:00
hunteraraujo
3bba27dd3c Integrate JSON-based Skill Tree Initialization in ViewModel
This commit substantially upgrades the SkillTreeViewModel by incorporating asynchronous initialization from a JSON asset. Now, both nodes and edges of the skill tree are dynamically generated based on the JSON data. This not only enhances the modularity of the code but also simplifies the process of updating or modifying the skill tree.

Other improvements include:
- Changed node IDs from integers to strings for better flexibility.
- Added a function to get a node by its ID, improving code reusability.
- Introduced error handling for potential issues during JSON parsing or node retrieval.
- Updated the sibling, level, and subtree separation configurations for the graph view layout.

These changes make the skill tree more dynamic and maintainable, setting the stage for future extensions.
2023-09-13 17:53:24 -07:00
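The JSON-driven initialization from 3bba27dd3c can be sketched as below. This is a Python illustration rather than the Dart code, and the JSON layout (top-level `nodes` and `edges` arrays with `id`, `label`, `from`, and `to` keys) is an assumption, since the log does not show the asset's schema.

```python
import json


def load_skill_tree(json_text):
    """Build node and edge lists from JSON; returns ([], []) if loading fails."""
    try:
        data = json.loads(json_text)
        # Node ids are normalized to strings, as the commit describes.
        nodes = [{"id": str(n["id"]), "label": n.get("label", "")} for n in data["nodes"]]
        edges = [{"from": str(e["from"]), "to": str(e["to"])} for e in data["edges"]]
    except (json.JSONDecodeError, KeyError, TypeError) as error:
        print(f"Failed to load skill tree: {error}")
        return [], []
    return nodes, edges


def get_node_by_id(nodes, node_id):
    """Return the node with the given string id, or None when it is missing."""
    return next((n for n in nodes if n["id"] == node_id), None)
```

Returning empty lists on a parse failure leaves the tree blank instead of crashing the view, which is one possible reading of the error handling the commit mentions.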
hunteraraujo
314cce75b5 Integrate TaskQueueView and Enhance SkillTree Functionality (#5206)
* Add TaskQueueView to Main Layout

This commit integrates the TaskQueueView into the main layout. The layout now conditionally displays the TaskQueueView based on whether a node in the SkillTree is selected.

- TaskQueueView appears when a SkillTree node is selected.
- Main layout adjusts to accommodate TaskQueueView alongside SkillTreeView and ChatView.
- Implemented responsive layout logic to manage the widths of the different views based on the screen width and the state of the SkillTree.

* Extend SkillTreeViewModel to Track Selected Node Hierarchy

This commit enhances the SkillTreeViewModel to maintain a list of nodes that form a hierarchy from the currently selected node to the root. This allows for more interactive and informative views that can leverage this hierarchical data.

- Added a new property `selectedNodeHierarchy` to keep track of the node hierarchy.
- Modified the `toggleNodeSelection` method to populate or clear `selectedNodeHierarchy` based on node selection.
- Introduced a new method `populateSelectedNodeHierarchy` to build the hierarchy from the selected node to the root.

* Extract skill tree view model reset state to method

* Implement UI enhancements for TaskQueueView

This commit introduces several UI improvements to the TaskQueueView:
- Tiles are padded 20 units from both the leading and trailing edges.
- Tiles now have a white background.
- Added a thin black border to the tiles.
- Incorporated a slight corner radius for the tiles.
- Centered the title and subtitle horizontally within the tiles.
- Added a checkmark button with a tooltip at the bottom-right corner for running a suite of tests.

These changes aim to improve the user experience and visual appeal of the TaskQueueView.

* Make MainLayout a consumer of SkillTreeViewModel
2023-09-12 14:01:32 -07:00
hunteraraujo
e1a5a2a481 Clear skill tree state when initializing tree 2023-09-10 15:12:52 -07:00
hunteraraujo
d6b0894c6b Add SkillTreeViewModel for managing skill tree state
The SkillTreeViewModel class serves as the view model for the skill tree and extends Flutter's ChangeNotifier for state management.

Features include:
- Storing and managing the list of SkillTreeNodes and SkillTreeEdges.
- Managing the state of the selected node.
- Initializing the skill tree with predefined nodes and edges.
- Methods for toggling node selection, allowing for only a single node to be selected at any given time.

The view model utilizes the GraphView package for visualization and layout.
2023-09-10 14:28:17 -07:00
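The single-selection toggle described in d6b0894c6b can be sketched as follows. This is a Python stand-in for the pattern, not the Dart class: the listener list imitates what Flutter's ChangeNotifier provides, and the attribute and method names here are assumptions.

```python
class SkillTreeViewModel:
    def __init__(self):
        self.selected_node_id = None
        self._listeners = []  # stand-in for ChangeNotifier's listener registry

    def add_listener(self, callback):
        self._listeners.append(callback)

    def toggle_node_selection(self, node_id):
        # Selecting the current node clears the selection; selecting any
        # other node replaces it, so at most one node is ever selected.
        if self.selected_node_id == node_id:
            self.selected_node_id = None
        else:
            self.selected_node_id = node_id
        for callback in self._listeners:
            callback()
```

Storing one id, instead of a per-node selected flag, makes the single-selection rule hold by construction.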