Challenges Data Schema of Benchmark
General challenges
Input:
- category (str): the challenge category, e.g. information-retrieval
- difficulty_level (str): the difficulty of this query; one of ["easy", "medium", "hard"]
Information-retrieval challenges
Input:
- category (str): information-retrieval
- query (str): the question that needs to be solved.
- ground (dict): the ground truth.
  - answer (str): the raw text of the ground-truth answer
  - should_contain (list): the exact strings that are required in the final answer
  - should_not_contain (list): the exact strings that must not appear in the final answer
- difficulty_level (str): the difficulty of this query; one of ["easy", "medium", "hard"]
Example:
{
  "category": "information-retrieval",
  "query": "what is the capital of America",
  "ground": {
    "answer": "Washington",
    "should_contain": ["Washington"],
    "should_not_contain": ["New York", "Los Angeles", "San Francisco"]
  },
  "difficulty_level": "easy"
}
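A record like the example above can be checked against the listed fields before it is used. The following is a minimal sketch, not part of the benchmark itself: the function name `validate_challenge` and the error messages are hypothetical, and it assumes every field in the schema is required.

```python
import json

# Allowed values taken from the schema's difficulty_level description.
ALLOWED_DIFFICULTIES = {"easy", "medium", "hard"}


def validate_challenge(record: dict) -> list:
    """Return a list of schema problems; an empty list means the record looks valid.

    Hypothetical helper: assumes all fields in the schema are mandatory.
    """
    errors = []

    # Top-level string fields.
    for key in ("category", "query"):
        if not isinstance(record.get(key), str):
            errors.append(f"{key} must be a str")

    difficulty = record.get("difficulty_level")
    if not isinstance(difficulty, str):
        errors.append("difficulty_level must be a str")
    elif difficulty not in ALLOWED_DIFFICULTIES:
        errors.append("difficulty_level must be one of easy, medium, hard")

    # Nested ground-truth object.
    ground = record.get("ground")
    if not isinstance(ground, dict):
        errors.append("ground must be a dict")
        return errors

    if not isinstance(ground.get("answer"), str):
        errors.append("ground.answer must be a str")
    for key in ("should_contain", "should_not_contain"):
        value = ground.get(key)
        if not (isinstance(value, list) and all(isinstance(s, str) for s in value)):
            errors.append(f"ground.{key} must be a list of str")

    return errors


example = json.loads("""
{
  "category": "information-retrieval",
  "query": "what is the capital of America",
  "ground": {
    "answer": "Washington",
    "should_contain": ["Washington"],
    "should_not_contain": ["New York", "Los Angeles", "San Francisco"]
  },
  "difficulty_level": "easy"
}
""")

print(validate_challenge(example))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a challenge file at once.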
Output:
- score (float): a score in the range [0, 1]
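The schema does not say how the score is derived from the ground truth. As one possible reading, the sketch below gives 0 when any `should_not_contain` string appears in the answer and otherwise returns the fraction of `should_contain` strings found. The function name `score_answer` and this scoring rule are assumptions for illustration, not the benchmark's actual method.

```python
def score_answer(answer: str, ground: dict) -> float:
    """Score an answer in [0, 1] against a ground-truth dict (assumed rule).

    - Any forbidden string present -> 0.0
    - Otherwise -> fraction of required strings present (1.0 if none are required)
    Matching is exact and case-sensitive, following the "exact strings" wording.
    """
    should_contain = ground.get("should_contain", [])
    should_not_contain = ground.get("should_not_contain", [])

    # A single forbidden string disqualifies the answer.
    if any(s in answer for s in should_not_contain):
        return 0.0

    # Nothing required: the answer passes by default.
    if not should_contain:
        return 1.0

    hits = sum(1 for s in should_contain if s in answer)
    return hits / len(should_contain)


ground = {
    "answer": "Washington",
    "should_contain": ["Washington"],
    "should_not_contain": ["New York", "Los Angeles", "San Francisco"],
}

print(score_answer("The capital is Washington, D.C.", ground))  # prints 1.0
print(score_answer("It is New York, not Washington", ground))   # prints 0.0
```

A stricter variant could require every `should_contain` string and return only 0 or 1; the fractional form is used here only because the output is declared as a float.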