Files
Auto-GPT/.github/workflows/benchmark-ci.yml
Reinier van der Leer b106a61352 Clean up & fix GitHub workflows (#6313)
* ci: Mitigate security issues in autogpt-ci.yml

- Remove unnecessary pull_request_target paths and related variables and config
- Set permissions for contents to read only

* ci: Simplify steps in autogpt-ci.yml workflow using GitHub CLI

- Simplify step in 'autogpt-ci.yml' by using GitHub CLI instead of API for adding label and comment functionality
- Replace curl command with 'gh issue edit' to add "behaviour change" label to the pull request
- Replace gh api command with 'gh issue comment' to leave a comment about the changed behavior of AutoGPT in the pull request

* ci: Fix issues in workflows

- Move environment variable definition to top level in benchmark-ci.yml (because the other job also needs it)
- Removed invalid 'branches: [hackathon]' restriction in hackathon.yml workflow
- Removed redundant 'ref' and 'repository' fields in the 'checkout' step of both workflows.

* ci: Delete legacy benchmarks.yml workflow

* ci: Add triggers for CI workflows

- Add triggers to run CI workflows when they are edited.
- Update the paths for the CI workflows in the trigger configuration.

* fix: Fix benchmark lint error

- Removed unnecessary blank lines in report_types.py
- Fixed string quotes in challenge.py to maintain consistency

* fix: Update task description in password generator data.json

- Update task description in `data.json` file for the password generator challenge to clarify the input requirements and error handling.
- This change is made in an attempt to make the Benchmark CI pass.

* fix: Fix PasswordGenerator challenge in CI

- Fix the behavior of the reference password_generator.py to align with the task description
- Use default password length 8 instead of a random length in the generate_password function
- Retrieve the password length from the command line arguments if "--length" is provided, else set it to 8
2023-11-21 10:58:54 +01:00

141 lines
4.6 KiB
YAML

name: Benchmark CI
on:
push:
branches: [ master, development, ci-test* ]
paths:
- 'benchmark/**'
- .github/workflows/benchmark-ci.yml
- '!benchmark/reports/**'
pull_request:
branches: [ master, development, release-* ]
paths:
- 'benchmark/**'
- '!benchmark/reports/**'
- .github/workflows/benchmark-ci.yml
env:
min-python-version: '3.10'
jobs:
lint:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Set up Python ${{ env.min-python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ env.min-python-version }}
- id: get_date
name: Get date
working-directory: ./benchmark/
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
- name: Install Poetry
working-directory: ./benchmark/
run: |
curl -sSL https://install.python-poetry.org | python -
- name: Install dependencies
working-directory: ./benchmark/
run: |
export POETRY_VIRTUALENVS_IN_PROJECT=true
poetry install -vvv
- name: Lint with flake8
working-directory: ./benchmark/
run: poetry run flake8
- name: Check black formatting
working-directory: ./benchmark/
run: poetry run black . --exclude test.py --check
if: success() || failure()
- name: Check isort formatting
working-directory: ./benchmark/
run: poetry run isort . --check
if: success() || failure()
- name: Check for unused imports and pass statements
working-directory: ./benchmark/
run: |
cmd="poetry run autoflake --remove-all-unused-imports --recursive --ignore-init-module-imports --ignore-pass-after-docstring agbenchmark"
$cmd --check || (echo "You have unused imports or pass statements, please run '${cmd} --in-place'" && exit 1)
if: success() || failure()
tests-agbenchmark:
runs-on: ubuntu-latest
strategy:
matrix:
agent-name: [ forge ]
fail-fast: false
timeout-minutes: 20
steps:
- name: Checkout repository
uses: actions/checkout@v3
with:
fetch-depth: 0
submodules: true
- name: Set up Python ${{ env.min-python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ env.min-python-version }}
- name: Install Poetry
working-directory: ./autogpts/${{ matrix.agent-name }}/
run: |
curl -sSL https://install.python-poetry.org | python -
- name: Run regression tests
run: |
./run agent start ${{ matrix.agent-name }}
sleep 10
cd autogpts/${{ matrix.agent-name }}
set +e # Ignore non-zero exit codes and continue execution
echo "Running the following command: poetry run agbenchmark --maintain --mock"
poetry run agbenchmark --maintain --mock
EXIT_CODE=$?
set -e # Stop ignoring non-zero exit codes
# Check if the exit code was 5, and if so, exit with 0 instead
if [ $EXIT_CODE -eq 5 ]; then
echo "regression_tests.json is empty."
fi
echo "Running the following command: poetry run agbenchmark --mock"
poetry run agbenchmark --mock
echo "Running the following command: poetry run agbenchmark --mock --category=data"
poetry run agbenchmark --mock --category=data
echo "Running the following command: poetry run agbenchmark --mock --category=coding"
poetry run agbenchmark --mock --category=coding
echo "Running the following command: poetry run agbenchmark --test=WriteFile"
poetry run agbenchmark --test=WriteFile
cd ../../benchmark
poetry install
echo "Adding the BUILD_SKILL_TREE environment variable. This will attempt to add new elements in the skill tree. If new elements are added, the CI fails because they should have been pushed"
export BUILD_SKILL_TREE=true
poetry run agbenchmark --mock
poetry run pytest -vv -s tests
CHANGED=$(git diff --name-only | grep -E '(agbenchmark/challenges)|(../frontend/assets)') || echo "No diffs"
if [ ! -z "$CHANGED" ]; then
echo "There are unstaged changes please run agbenchmark and commit those changes since they are needed."
echo "$CHANGED"
exit 1
else
echo "No unstaged changes."
fi
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}