* Clean up prompt generation
* Rename Performance Evaluations to Best Practices
* Move specification of response format from system prompt to Agent.construct_base_prompt
* Clean up PromptGenerator class
* Add debug logging to AIConfig autogeneration
* Clarify prompting and add support for multiple thought processes to Agent
* Correct and clean up JSON handling
* Use ast.literal_eval for message history too
* Lint
* Add comments explaining why we use literal_eval
* Add descriptions to llm_response_format schema
* Parse responses in code blocks
* Be more careful when parsing in code blocks
* Lint
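
For reviewers, the code-block parsing and `literal_eval` items above can be illustrated with a minimal sketch. It is not the code from this change set: `extract_payload` and `parse_response` are hypothetical names, and the fallback order (strict JSON first, then `ast.literal_eval`) is an assumption about one reasonable way to do it.

```python
import ast
import json
import re

# Hypothetical helpers for illustration only; names are not from the change set.
# Matches the first fenced block, with an optional "json" language tag.
_FENCE_RE = re.compile(r"```(?:json)?\s*\n(.*?)\n\s*```", re.DOTALL)


def extract_payload(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    match = _FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()


def parse_response(text: str) -> dict:
    """Parse a model response into a dict, tolerating fences and Python-style dicts."""
    payload = extract_payload(text)
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        # Fallback for Python-style literals (single quotes, True/False/None).
        # literal_eval only evaluates literals, so it cannot execute code.
        try:
            parsed = ast.literal_eval(payload)
        except (ValueError, SyntaxError) as exc:
            raise ValueError("response is neither JSON nor a Python literal") from exc
    if not isinstance(parsed, dict):
        raise ValueError("expected an object at the top level")
    return parsed
```

Trying strict JSON first keeps well-formed responses on the fast path; the `literal_eval` fallback only runs when the payload is not valid JSON.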