Files
Auto-GPT/agbenchmark/challenges/README.md
2023-06-25 08:48:16 -04:00

48 lines
1.8 KiB
Markdown

# Challenges Data Schema of Benchmark
## General challenges
Input:
- **category** (str[]): Category of the challenge such as 'retrieval', 'comprehension', etc. _this is not currently used. for the future it may be needed_
- **task** (str): The task that the agent needs to solve.
- **dependencies** (str[]): The dependencies that the challenge needs to run. Needs to be the full node to the test function.
- **ground** (dict): The ground truth.
- **answer** (str): The raw text of the ground truth answer.
- **should_contain** (list): The exact strings that are required in the final answer.
- **should_not_contain** (list): The exact strings that should not be in the final answer.
- **files** (list): Files that are used for retrieval. Can specify file here or an extension.
- **mock_func** (str): Function to mock the agent's response. This is used for testing purposes.
- **info** (dict): Additional info about the challenge.
- **difficulty** (str): The difficulty of this query.
- **description** (str): Description of the challenge.
- **side_effects** (str[]): Describes the effects of the challenge.
Example:
```python
{
"category": ["basic"],
"task": "Write the string 'random string' before any existing text to the file called file_to_check.txt",
"dependencies": [
"agbenchmark/tests/basic_abilities/write_file/write_file_test.py::TestWriteFile::test_write_file"
],
"ground": {
"answer": "random string: this is how we're doing",
"should_contain": ["random string: this is how we're doing"],
"files": ["file_to_check.txt"]
},
"mock_func": "basic_read_file_mock",
"info": {
"description": "This reads the file quickly",
"difficulty": "basic",
"side_effects": [""]
}
}
```
Current Output:
- **score** (float): scores range from [0, 1]