
# Benchmark Challenge Data Schema

## General challenges
Input:
- **category** (str): the challenge category, e.g. `information-retrieval`.
- **difficulty_level** (str): the difficulty of this query. One of `"easy"`, `"medium"`, `"hard"`.


## Information-retrieval challenges
Input:
- **category** (str): information-retrieval
- **query** (str): the question to be solved.
- **ground** (dict): the ground truth.
    - **answer** (str): the raw text of the ground-truth answer.
    - **should_contain** (list): the exact strings that must appear in the final answer.
    - **should_not_contain** (list): the exact strings that must not appear in the final answer.
- **difficulty_level** (str): the difficulty of this query. One of `"easy"`, `"medium"`, `"hard"`.

Example:
```python
{
    "category": "information-retrieval",
    "query": "what is the capital of America",
    "ground": {
        "answer": "Washington",
        "should_contain": ["Washington"],
        "should_not_contain": ["New York", "Los Angeles", "San Francisco"]
    },
    "difficulty_level": "easy"
}
```
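A challenge record can be checked against the fields listed above before it is used. The following is a minimal sketch, not part of the benchmark itself: the function name `validate_challenge` and the error messages are hypothetical, and it assumes every listed field is required.

```python
# Hypothetical helper: checks a challenge dict against the schema in this README.
# Assumption: all listed fields are mandatory; the README does not say so explicitly.
ALLOWED_DIFFICULTY = {"easy", "medium", "hard"}


def validate_challenge(challenge: dict) -> list:
    """Return a list of human-readable schema errors (empty if the record is valid)."""
    errors = []

    # Fields shared by all challenges.
    if not isinstance(challenge.get("category"), str):
        errors.append("category must be a str")
    if challenge.get("difficulty_level") not in ALLOWED_DIFFICULTY:
        errors.append("difficulty_level must be one of easy, medium, hard")

    # Extra fields for information-retrieval challenges.
    if challenge.get("category") == "information-retrieval":
        if not isinstance(challenge.get("query"), str):
            errors.append("query must be a str")
        ground = challenge.get("ground")
        if not isinstance(ground, dict):
            errors.append("ground must be a dict")
        else:
            if not isinstance(ground.get("answer"), str):
                errors.append("ground.answer must be a str")
            for key in ("should_contain", "should_not_contain"):
                value = ground.get(key)
                if not isinstance(value, list) or not all(
                    isinstance(item, str) for item in value
                ):
                    errors.append(f"ground.{key} must be a list of str")
    return errors
```

Calling `validate_challenge` on the example record above returns an empty list; a record with a missing or misspelled field returns one message per problem.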


Output:
- **score** (float): the score for the answer, in the range [0, 1].
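
This README does not specify how the score is computed. As an illustration only, the sketch below shows one way the `should_contain` and `should_not_contain` fields could map to a value in [0, 1]. The function name `score_answer` and the scoring rule (zero if any forbidden string appears, otherwise the fraction of required strings found) are assumptions, not the benchmark's actual method.

```python
# Hypothetical scoring rule, assumed for illustration; the real benchmark may differ.
def score_answer(answer: str, ground: dict) -> float:
    """Score an answer in [0, 1] using exact substring checks against `ground`."""
    required = ground.get("should_contain", [])
    forbidden = ground.get("should_not_contain", [])

    # Any forbidden string in the answer gives the lowest score.
    if any(text in answer for text in forbidden):
        return 0.0

    # With nothing required, an answer that avoids forbidden strings passes.
    if not required:
        return 1.0

    # Otherwise, score is the fraction of required strings present.
    hits = sum(1 for text in required if text in answer)
    return hits / len(required)
```

Matching here is case-sensitive because the schema describes the entries as exact strings.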