Auto-GPT

mirror of https://github.com/aljazceru/Auto-GPT.git synced 2026-02-01 12:24:28 +01:00

Author	SHA1	Message	Date
Ethan Presberg	6cfe229332	feat(frontend): Allow sending a message with the enter key (#6378) This has not yet been tested due to an issue with compiling on WSL. This was the fix suggested by Pwuts.	2024-02-20 10:49:37 +01:00
Reinier van der Leer	1079d71699	fix(ci/benchmark): Unbreak "Push reports to data branch" step The `report_subfolder` variable was being populated with two identical lines, because there will be two untracked files in the folder, resulting in the same dirname. This caused later commands using that variable to fail. Fix is to `sort -u` before storing the value to `report_subfolder`.	2024-02-20 10:35:14 +01:00
Reinier van der Leer	e104427767	feat(ci/benchmark): Generate step summary from benchmark report	2024-02-19 17:13:41 +01:00
Reinier van der Leer	bfd479a50b	feat(benchmark): Add reports/format.py script to convert report.json to markdown	2024-02-19 17:13:05 +01:00
Reinier van der Leer	fb63bf4425	chore: Update `agbenchmark` dependency for agent and forge	2024-02-19 17:11:19 +01:00
Reinier van der Leer	3a17011129	feat(benchmark): Include Steps in Report	2024-02-19 17:08:24 +01:00
Reinier van der Leer	c339c6b54f	chore: Update `agbenchmark` dependency for agent and forge	2024-02-18 17:37:03 +01:00
Reinier van der Leer	7f71d6d9fd	debug(benchmark): Improve `TestResult` validation error output format	2024-02-18 17:10:14 +01:00
Reinier van der Leer	784e2bbb1c	fix(ci/benchmark): Mitigate VCS conflicts with files in data branch `agbenchmark` currently creates files like success_rate.json in the base REPORTS_FOLDER, which causes conflicts in the last step of the benchmark workflow. To prevent issues, these files must be removed prior to switching to the data branch.	2024-02-17 18:09:44 +01:00
Reinier van der Leer	959377f54c	fix(ci/benchmark): Add `set +e` because we expect (some) challenges to fail	2024-02-17 15:56:55 +01:00
Reinier van der Leer	6bc83e925c	chore: Update `agbenchmark` dependency for agent and forge	2024-02-17 15:56:33 +01:00
Reinier van der Leer	4ede773f5a	debug(benchmark): Add more debug code to pinpoint cause of rare crash Target: https://github.com/Significant-Gravitas/AutoGPT/actions/runs/7941977633/job/21684817491	2024-02-17 15:48:57 +01:00
Reinier van der Leer	d5ad719757	ci: Allow telemetry for non-push events, as long as it's on `master` Also disable telemetry for AutoGPT's unit/integration tests.	2024-02-17 15:12:43 +01:00
Reinier van der Leer	1ca9b9fa93	ci: Fix setting/passing `TELEMETRY_*` environment variables	2024-02-17 14:26:03 +01:00
Reinier van der Leer	15024fb5a1	chore: Update `agbenchmark` dependency for agent and forge	2024-02-17 14:18:02 +01:00
Reinier van der Leer	fa4bdef17c	ci: Update actions to newest versions - `actions/stale` -> `v9` - `actions/cache` -> `v4` - `actions/checkout` -> `v4` - `actions/setup-node` -> `v4` - `docker/login-action` -> `v3` - `actions/setup-python` -> `v5` - `codecov/codecov-action` -> `v4` - `actions/upload-artifact` -> `v4` - `subosito/flutter-action` -> `v2` - `docker/build-push-action` -> `v5` - `docker/setup-buildx-action` -> `v3`	2024-02-17 13:59:13 +01:00
Reinier van der Leer	e2b519ef3b	debug(benchmark): Make sure `TestResult` validator error output is sufficient to debug	2024-02-17 13:36:17 +01:00
Reinier van der Leer	09c307d679	debug(benchmark): Add log statement to validator on `TestResult` Validation errors don't mention the values causing the error, making it hard to debug. This happened a few times in autogpts-benchmark.yml, so let's put this log statement here until we figure out what makes it crash.	2024-02-17 13:32:22 +01:00
Reinier van der Leer	880c8e804c	fix(ci/benchmark): Allow workflow to continue regardless of challenge outcomes	2024-02-17 11:52:26 +01:00
Reinier van der Leer	5f0764b65c	chore: Update agbenchmark dependency for agent and forge	2024-02-16 19:07:37 +01:00
Reinier van der Leer	63e6014b27	fix(benchmark): Fix `TestResult.fail_reason` assignment condition The condition must be the same as for `success`, because otherwise it causes a crash when `call.excinfo` evaluates to `False` but is not `None`.	2024-02-16 19:05:00 +01:00
Reinier van der Leer	83fcd9ad16	chore: Update `agbenchmark` dependency for agent and forge	2024-02-16 18:44:58 +01:00
Reinier van der Leer	f9792ed7f3	fix(benchmark): Unbreak `-N`/`--attempts` option	2024-02-16 18:43:37 +01:00
Reinier van der Leer	d6ab470c58	Rename autogpts-benchmark-nightly.yml to autogpts-benchmark.yml	2024-02-16 18:32:50 +01:00
Reinier van der Leer	666a5a8777	feat(agent/serve): Report task cost through `Step.additional_output` - Added `task_cumulative_cost` and `task_total_cost` attributes to the `Step.additional_output` in the `AgentProtocolServer.execute_step` endpoint. - Updated `agbenchmark` dependency in Agent and Forge	2024-02-16 18:19:04 +01:00
Reinier van der Leer	21f1e64559	feat(benchmark): Get agent task cost from `Step.additional_output`	2024-02-16 18:10:46 +01:00
Reinier van der Leer	752bac099b	feat(benchmark/report): Add and record `TestResult.n_steps` - Added `n_steps` attribute to `TestResult` type - Added logic to record the number of steps to `BuiltinChallenge.test_method`, `WebArenaChallenge.test_method`, and `.reports.add_test_result_to_report`	2024-02-16 17:53:19 +01:00
Reinier van der Leer	a5de79beb6	ci(benchmark): Add nightly benchmark workflow Added autogpts-benchmark-nightly.yml, which will run every night at 02:00 UTC with a selection of challenges.	2024-02-16 17:41:58 +01:00
Reinier van der Leer	483c01b681	lint(benchmark): Remove unnecessary `pass` statement in __main__.py	2024-02-16 17:27:56 +01:00
Reinier van der Leer	992b8874fc	chore: Update `agbenchmark` dependency for agent and forge	2024-02-16 17:22:58 +01:00
Reinier van der Leer	2a55efb322	fix(benchmark): Include `WebArenaSiteInfo.additional_info` (e.g. credentials) in task input Without the `additional_info`, it is impossible to get past the login page on challenges where that is necessary.	2024-02-16 17:20:44 +01:00
Reinier van der Leer	23d58a3cc0	feat(benchmark/cli): Add `challenge list`, `challenge info` subcommands - Add `challenge list` command with options `--all`, `--names`, `--json` - Add `tabular` dependency - Add `.utils.utils.sorted_by_enum_index` function to easily sort lists by an enum value/property based on the order of the enum's definition - Add `challenge info [name]` command with option `--json` - Add `.utils.utils.pretty_print_model` routine to pretty-print Pydantic models - Refactor `config` subcommand to use `pretty_print_model`	2024-02-16 15:17:11 +01:00
Reinier van der Leer	70e345b2ce	refactor(benchmark): `load_webarena_challenges` - Reduce duplicate and nested statements - Add `skip_unavailable` parameter Related changes: - Add `available` and `unavailable_reason` attributes to `ChallengeInfo` and `WebArenaChallengeSpec` - Add `pytest.skip` statement to `WebArenaChallenge.test_method` to make sure unavailable challenges are not run	2024-02-16 15:11:48 +01:00
Reinier van der Leer	650a701317	chore: Update `agbenchmark` dependency for agent and forge	2024-02-15 18:19:06 +01:00
Reinier van der Leer	679339d00c	feat(benchmark): Make report output folder configurable - Make `AgentBenchmarkConfig.reports_folder` directly configurable (through `REPORTS_FOLDER` env variable). The default is still `./agbenchmark_config/reports`. - Change all mentions of `REPORT_LOCATION` (which fulfilled the same function at some point in the past) to `REPORTS_FOLDER`.	2024-02-15 18:07:45 +01:00
Reinier van der Leer	fd5730b04a	feat(agent/telemetry): Distinguish between `production` and `dev` environment based on VCS state - Added a helper function `.app.utils.vcs_state_diverges_from_master()`. This function determines whether the relevant part of the codebase diverges from our `master`. - Updated `.app.telemetry._setup_sentry()` to determine the default environment name using `vcs_state_diverges_from_master`.	2024-02-15 16:00:30 +01:00
Reinier van der Leer	b7f08cd0f7	feat(agent/telemetry): Enable performance tracing & update opt-in prompt accordingly	2024-02-15 14:46:36 +01:00
Reinier van der Leer	8762f7ab3d	fix(forge): Make `watchfiles` pattern more specific to prevent unwanted (breaking) reloads This fixes the issue of changes in artifacts triggering an application reload (which caused connection errors for in-progress requests).	2024-02-15 13:42:38 +01:00
Reinier van der Leer	a9b7b175ff	fix(agent/profile_generator): Improve robustness by leveraging `create_chat_completion`'s parse handling	2024-02-15 11:48:07 +01:00
Reinier van der Leer	52b93dd84e	fix(cli/agent start): Wait for applications to finish starting before returning - Added a helper function `wait_until_conn_ready(port)` to wait for the benchmark and agent applications to finish starting - Improved the CLI's own logging (within the `agent start` command)	2024-02-15 11:26:26 +01:00
Reinier van der Leer	6a09a44ef7	lint(agent): Fix telemetry.py linting error & formatting	2024-02-14 23:31:35 +01:00
Toran Bruce Richards	32a627eda9	Add Privacy Policy link to telemetry opt-in.	2024-02-14 16:42:34 +00:00
Reinier van der Leer	67bafa6302	fix(autogpt/llm): `AssistantChatMessage.tool_calls` default `[]` instead of `None` OpenAI ChatCompletion calls fail when `tool_calls = None`. This issue came to light after `22aba6d`.	2024-02-14 14:34:04 +01:00
Reinier van der Leer	6017eefb32	ci: Enable telemetry in CI runs on `master`	2024-02-14 12:03:54 +01:00
Reinier van der Leer	ae197fc85f	feat(agent/telemetry): Distinguish between users This allows us to get a much better sense of how many users actually experience issues, and how issue occurrence is distributed among users.	2024-02-14 11:50:45 +01:00
Reinier van der Leer	22aba6dd8a	fix(agent/llm): Include bad response in parse-fix prompt in `OpenAIProvider.create_chat_completion` Apparently I forgot to also append the response that caused the parse error before throwing it back to the LLM and letting it fix its mistake(s).	2024-02-14 11:20:31 +01:00
Reinier van der Leer	88bbdfc7fc	ci: Pick 3 challenges to run with `--mock` in smoke test CI	2024-02-14 02:30:03 +01:00
Reinier van der Leer	d0c9b7c405	lint(benchmark): Remove unused imports	2024-02-14 01:34:30 +01:00
Reinier van der Leer	e7698a4610	chore(agent): Update `forge` and `agbenchmark` dependencies	2024-02-14 01:32:28 +01:00
Reinier van der Leer	ab05b7ae70	chore(forge): Update `agbenchmark` dependency	2024-02-14 01:27:07 +01:00
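
Commit 23d58a3cc0 describes `.utils.utils.sorted_by_enum_index` as a way to sort lists by an enum value or property, following the order in which the enum's members are defined. The following is a minimal sketch of that idea, not the repository's actual code: the signature, the `key` and `reverse` parameters, and the `Difficulty` enum are assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Iterable, List, Optional, Type, TypeVar

T = TypeVar("T")
E = TypeVar("E", bound=Enum)


class Difficulty(Enum):
    # Hypothetical enum, used only for this illustration.
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"


def sorted_by_enum_index(
    items: Iterable[T],
    enum: Type[E],
    key: Callable[[T], Optional[E]] = lambda item: item,
    reverse: bool = False,
) -> List[T]:
    """Sort items by the definition order of the enum member that key() returns.

    Assumed behaviour: items whose key is None are placed after all others.
    """
    order = list(enum)  # iterating an Enum yields members in definition order

    def index(item: T) -> int:
        member = key(item)
        return order.index(member) if member is not None else len(order)

    return sorted(items, key=index, reverse=reverse)


# Sorting by definition order gives EASY, MEDIUM, HARD. Sorting by the string
# values instead would give "easy", "hard", "medium".
challenges = [("c", Difficulty.HARD), ("a", Difficulty.EASY), ("b", Difficulty.MEDIUM)]
ordered = sorted_by_enum_index(challenges, Difficulty, key=lambda c: c[1])
```

A helper like this keeps listings such as the `challenge list` output in a meaningful order without hard-coding a ranking separate from the enum definition.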

5154 Commits