Changed the architecture of the SettingsView to align with the existing pattern in the application. Instead of creating its own instance of SettingsViewModel, the view now accepts an external viewModel as a parameter. This ensures consistency across the application and allows for better modularity and reusability of components.
This commit enriches the `SettingsViewModel` by integrating state persistence functionality and a sign-out method, thus ensuring a more robust and user-friendly settings feature.
Key Enhancements:
1. **State Persistence**:
- Leveraging the `shared_preferences` package, the app now stores and retrieves user preferences related to app settings, ensuring consistency across app restarts.
- Preferences like Dark Mode, Developer Mode, Base URL, and Continuous Mode Steps are persistently stored and loaded when the ViewModel is initialized.
2. **Sign Out Method**:
- A `signOut` method has been introduced in the ViewModel, utilizing the `AuthService` to facilitate user sign-out processes.
- This addition allows for seamless integration of sign-out functionality within the settings interface, granting users the ability to easily terminate sessions.
3. **SettingsView Enhancement**:
- The `SettingsView` has been adapted to incorporate a UI element invoking the `signOut` method, enhancing user interaction within the settings environment.
This enhancement not only bolsters the resilience and usability of the app’s settings but also lays down a structured approach for potential future additions to user preferences and settings-related functionalities.
This commit integrates the newly created `SettingsView` into the `MainLayout`, allowing users to access and interact with the settings through the main user interface of the app. When the user selects the settings tab from the `SideBarView`, the `SettingsView` is displayed alongside the `ChatView`, maintaining consistent layout aesthetics with other views like `TaskView`.
Key Changes:
- A new condition has been added in the `ValueListenableBuilder` within the `MainLayout` to check if the selected view is `'SettingsView'`.
- When `'SettingsView'` is selected, `SettingsView` is rendered with the same width as `TaskView`, and `ChatView` is displayed next to it.
- The state of the skill tree is reset when navigating to `SettingsView` to ensure a clean state.
This integration ensures a seamless user experience, allowing users to navigate and configure app settings easily and efficiently from the main layout of the app.
This commit augments the `SideBarView` by introducing a new `IconButton` that represents a tab for the newly created `SettingsView`. This enhancement allows users to navigate to the settings page directly from the sidebar, improving accessibility to app configuration options.
Key Changes:
- A new `IconButton` with a settings icon has been added to the sidebar’s `Column` widget.
- Selecting the settings tab updates the `selectedViewNotifier.value` to `'SettingsView'`, indicating the user's intent to navigate to the settings page.
- Visual feedback is provided by altering the color of the settings icon based on whether the settings tab is currently selected.
This addition ensures seamless integration of the `SettingsView` into the existing navigation structure, enabling users to easily configure app settings.
This commit refines the `populateSelectedNodeHierarchy` method in the `SkillTreeViewModel` to accurately represent nodes that possess multiple parent nodes, ensuring a comprehensive and non-redundant representation of the entire hierarchy leading back to the root nodes.
Modifications and Features:
- The method now employs recursion to traverse all possible paths back to the root nodes from a selected node, capturing every unique node in the hierarchies.
- A `Set` is utilized to monitor and ensure that each node is only added once to the `_selectedNodeHierarchy` list, eliminating the possibility of duplicates.
- The finalized `_selectedNodeHierarchy` list is constructed such that the root of the tree is the last item in the list, providing a more logical representation of the hierarchy.
These enhancements ensure a more accurate and efficient representation of the skill tree structure, particularly in scenarios where nodes have multiple parents, facilitating better navigation and interaction within the skill tree.
This commit introduces the `SettingsView` class, which is responsible for rendering the user interface for the settings feature of the app. This View is associated with the `SettingsViewModel` for state management and logic.
Features Represented in this View:
- Dark Mode: A switch to enable or disable Dark Mode.
- Developer Mode: A switch to enable or disable Developer Mode.
- Base URL: A text field to input and configure the Base URL.
- Continuous Mode Steps: User-friendly '+' and '-' buttons allowing users to configure the number of steps for continuous mode.
The `SettingsView` utilizes the Flutter `ListView` to display various settings options, providing an intuitive and user-friendly experience. It employs reactive state management using the `Provider` package, ensuring that any state changes in the associated ViewModel are reflected immediately in the View. Proper documentation and comments have been added for better code readability and maintenance.
This commit introduces the `SettingsViewModel` class, part of the MVVM architecture, responsible for managing the state and business logic for the settings feature of the app. This ViewModel is associated with the `SettingsView`.
Features Managed by this ViewModel:
- Dark Mode: Toggle the state of Dark Mode and notify listeners.
- Developer Mode: Toggle the state of Developer Mode and notify listeners.
- Base URL: Update the state of Base URL and notify listeners.
- Continuous Mode Steps: Increment and decrement the number of Continuous Mode Steps and notify listeners.
Each change in the state is followed by a notification to the listeners to rebuild the UI components that depend on this state, ensuring a reactive UI. To-do comments have been added where the state needs to be persisted or synchronized with a server.
This commit integrates the `LeaderboardService` into `SkillTreeViewModel` to enable benchmark report submissions to the leaderboard. A `BenchmarkRun` object is created from the evaluation response and submitted using the `submitReport` method from `LeaderboardService`.
This commit extends the `RestApiUtility` class to include support for a new leaderboard base URL. A new `ApiType` enum value `ApiType.leaderboard` has been added, and the `_getEffectiveBaseUrl` method has been updated to handle this new type. The leaderboard base URL is "https://leaderboard.vercel.app/".
This commit adds a new `LeaderboardService` class featuring a `submitReport` method. This method allows for the submission of `BenchmarkRun` objects to the leaderboard via a POST request to the `/api/reports` endpoint. The new service uses the `ApiType.leaderboard` enum value.
This commit introduces the `BenchmarkRun` class, designed to model a complete benchmark run. The class encapsulates all data and sub-models related to a benchmark, providing a centralized object to handle various aspects of a benchmark run.
The `BenchmarkRun` class includes the following sub-models:
- `RepositoryInfo`: Information about the repository and team.
- `RunDetails`: Specific details like the run identifier, command, and timings.
- `TaskInfo`: Information about the task being benchmarked.
- `Metrics`: Performance metrics for the benchmark run.
- `Config`: Configuration settings for the benchmark run.
A `reachedCutoff` field is also included to indicate whether a certain cutoff was reached during the benchmark run.
Methods for serializing and deserializing the object to and from JSON are also provided.
This commit introduces the RepositoryInfo class, designed to encapsulate details about the repository and team associated with a benchmark run.
The class includes the following fields:
- repoUrl: The URL of the repository where the benchmark code resides.
- teamName: The name of the team responsible for the benchmark.
- benchmarkGitCommitSha: The Git commit SHA for the benchmark code.
- agentGitCommitSha: The Git commit SHA for the agent code.
The class supports JSON serialization and deserialization, making it easy to use with Flutter's JSON handling mechanisms.
Added a new Dart class called `RunDetails` to represent specific details related to a benchmark run.
The class includes fields for:
- The unique run identifier (`runId`)
- The command used to initiate the benchmark (`command`)
- The time the benchmark was completed (`completionTime`)
- The time the benchmark started (`benchmarkStartTime`)
- The name of the test being run (`testName`)
Serialization and deserialization methods are also provided for JSON compatibility.
Added a new TaskInfo class to encapsulate information related to a specific benchmark task.
- The TaskInfo class holds attributes like the data file path, regression status, task categories, task details, expected answer, and description.
- Included methods for JSON serialization and deserialization.
- Added comprehensive documentation to describe the purpose, properties, and methods of the TaskInfo class.
Added a new Metrics class to represent key performance metrics of a benchmark test run.
- The Metrics class encapsulates various data points like difficulty, success rate, attempted status, success percentage, cost, and runtime.
- Included serialization and deserialization methods for converting between Metrics objects and JSON.
- Added comprehensive documentation to describe the purpose, properties, and methods of the Metrics class.
This commit introduces a new `Config` class, designed to manage and store configuration settings related to the benchmark run. The class contains two key fields:
1. `agentBenchmarkConfigPath`: The path to the agent's benchmark configuration file.
2. `host`: The address of the host where the benchmark is running.
The class includes methods for serialization and deserialization, allowing easy conversion between `Config` objects and JSON maps.
Documentation comments have also been added for better code readability and understanding.
* New benchmark data models
* Update _benchmarkBaseUrl
* Remove ReportRequestBody
* Update benchmark service methods for proxy approach
* Add eval id to SkillNodeData
* Refactor runBenchmark Method for proxy approach
This commit updates the runBenchmark method in the SkillTreeViewModel class to align with the new report generation flow. The updated method does the following:
1. Checks if a benchmark is already running to prevent overlapping runs.
2. Sets a flag to indicate that the benchmark is running and notifies the UI.
3. Reverses the selected node hierarchy for report generation.
4. Loops through each node in the reversed hierarchy to:
- Generate a unique UUID for each test run.
- Create a ReportRequestBody object.
- Call the generateSingleReport method in the BenchmarkService.
- Update the UI after each single report is generated.
5. After all single reports are generated, it calls the generateCombinedReport method in the BenchmarkService, passing in all the generated UUIDs.
6. Finally, it sets the benchmark running flag to false and notifies the UI.
This change improves the report generation flow and allows for both individual and combined reports.
This commit introduces two major updates to the BenchmarkService class:
1. Renamed the `generateReport` method to `generateSingleReport` for better clarity and specificity.
2. Added a new method called `generateCombinedReport` that takes a list of test run IDs and generates a combined report by posting to the `/reports/query` endpoint.
These changes aim to improve the modularity and readability of the code, while also extending its functionality to handle combined reports.
This commit introduces substantial improvements to the TaskView class to accommodate both tasks and test suites in a unified view. It also integrates the TestSuiteDetailView to display test suite details when a test suite is selected.
Key Enhancements:
1. Modified the `initState` method to call `fetchAndCombineData()` from TaskViewModel, thereby populating the combined data source.
2. Replaced the ListView that was rendering tasks with a ListView that can render both tasks and test suites.
3. Introduced conditional rendering for TestSuiteDetailView when a test suite is selected.
4. Updated onTap actions to select and deselect tasks and test suites appropriately.
5. Moved to using a Stack layout to allow overlay of TestSuiteDetailView on top of the existing layout.
This refactor enhances the TaskView's capabilities to manage and display both tasks and test suites, offering a more integrated user experience.
This commit significantly expands the functionalities of TaskViewModel to manage both tasks and test suites in a unified manner. The view model now serves as the primary business logic class that interacts with the UI for task and test suite management.
Key Enhancements:
- Introduced `_testSuites` list to store TestSuite objects.
- Added `combinedDataSource` to hold both tasks and test suites.
- Introduced `selectTestSuite` and `deselectTestSuite` methods for TestSuite selection management.
- Added methods for TestSuite CRUD operations (`addTestSuite`, `fetchTestSuites`, `_saveTestSuitesToPrefs`).
- Created `fetchAndCombineData` method to fetch and combine tasks and test suites into a single list, `combinedDataSource`.
This update provides a more robust and unified approach for managing tasks and test suites, thereby improving the application's modularity and scalability.
This commit introduces a new StatefulWidget, TestSuiteDetailView, to offer a dedicated view for managing and interacting with individual Test Suites.
Key Features:
- Created a TestSuiteDetailView class that takes a TestSuite object and a TaskViewModel as parameters.
- Added an AppBar with a back button for easy navigation.
- Utilized ListView.builder to display a list of tasks that belong to the selected Test Suite.
- Integrated with existing TaskViewModel to select and delete tasks within the Test Suite.
- Included a Provider for the ChatViewModel to update the current task ID when a task is selected.
This new view enhances the user experience by providing a focused interface for managing tasks within individual Test Suites. This facilitates better organization and navigation for the user.
This commit adds a new StatelessWidget, TestSuiteListTile, designed to display individual TestSuite items in a list.
Key Features:
- Created a TestSuiteListTile class that takes a TestSuite object and a VoidCallback for the onTap event as parameters.
- Utilized Material Design with custom styling to ensure the tile fits well within the application's UI.
- The tile displays the timestamp of the TestSuite, which serves as its title.
- Included a play arrow icon to indicate that the tile is actionable.
- Utilized MediaQuery to adapt the tile width based on the screen size, capped at a maximum width of 260.
By adding this widget, we improve the UX by providing a consistent and intuitive way to interact with TestSuite objects in the UI.
This commit introduces a new class, TestSuite, designed to encapsulate a collection of Task objects under a common timestamp. This will help in grouping tasks that belong to a particular test suite.
Key Features:
- Add a TestSuite class with fields for `timestamp` and a list of `tests` (Task objects).
- Implement `toJson` method for serializing TestSuite objects to JSON-compatible format.
- Implement `fromJson` factory method for deserializing JSON data back into a TestSuite object.
By providing serialization and deserialization support directly in the model, we facilitate easier storage and data exchange for test suites.
This commit adds serialization support to the Task model by including a `toJson` method. This will allow easy conversion of Task objects to a JSON-compatible format, facilitating storage or network transmission.
Added a state variable isBenchmarkRunning in SkillTreeViewModel to track the status of benchmark execution. This state variable is used to conditionally disable specific UI components:
- The "Initiate test suite" button in TaskQueueView is disabled during the benchmark.
- All IconButtons in SideBarView are disabled during the benchmark.
- Node selection in SkillTreeView is disabled during the benchmark.
This ensures that the user cannot interact with these components while a benchmark test is running, thereby improving UX and preventing potential issues.
This commit updates the `TaskQueueView` to pass real data from the `selectedNodeHierarchy` to the `callGenerateReport` method in `SkillTreeViewModel`. An array of test names is constructed from the reversed node hierarchy, and these names are used as the `tests` field in the `ReportRequestBody`.
Note: The `category` field is now an empty string as per the new requirement, and `mock` continues to be set to true.
This commit integrates the `callGenerateReport` method from `SkillTreeViewModel` into the `TaskQueueView`. Now, when the user clicks the green checkmark button, the `callGenerateReport` method is triggered with hardcoded values for testing purposes.
Note: The implementation is still temporary and will be updated for dynamic behavior in the future.
This commit adds a new boolean field, "mock", to the `ReportRequestBody` class. This additional field is in line with the new requirements to specify whether the report is a mock or not.
The `toJson()` method is also updated to include this new field during serialization.
This commit integrates Firebase Authentication into the application and refactors the `main.dart` file to streamline the user authentication flow. The application will now automatically switch between the main layout and the authentication UI based on the user's sign-in status.
- Added Firebase initialization to the `main()` function for Firebase Authentication.
- Replaced the home page in `MyApp` with a `StreamBuilder` that listens for Firebase auth state changes.
- Based on the auth state, the home page will either show `MainLayout` (if the user is signed in) or `FirebaseAuthView` (if the user is not signed in).
- Added necessary imports for the Firebase and authentication services.