* Clean up prompt generation
* Rename Performance Evaluations to Best Practices
* Move specification of response format from system prompt to `Agent.construct_base_prompt`
* Clean up `PromptGenerator` class
* Add debug logging to `AIConfig` autogeneration
* Clarify prompting and add support for multiple thought processes to `Agent`
* Rearrange tests into unit/integration/challenge categories
* Fix linting + `tests.challenges` imports
* Fix obscured duplicate test in `test_url_validation.py`
* Move VCR conftest to `tests.vcr`
* Specify tests to run & their order (unit -> integration -> challenges) in CI
* Fail Docker CI when tests fail
* Fix import & linting errors in tests
* Fix `get_text_summary`
* Fix linting errors
* Clean up pytest args in CI
* Remove bogus tests from GoCodeo