From f00ced6612896c0489eb83017777bc3e3652cc33 Mon Sep 17 00:00:00 2001
From: Douglas Schonholtz <15002691+dschonholtz@users.noreply.github.com>
Date: Tue, 18 Apr 2023 11:59:42 -0400
Subject: [PATCH] Update README.md

---
 README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 26aec3d3..52eb9445 100644
--- a/README.md
+++ b/README.md
@@ -9,9 +9,8 @@ A set of standardised benchmarks to assess the performance of Auto-GPTs.
 - [ ] Lower priority, but put this in a webserver backend so we have a good API rather than doing container and file management for our interface between evals and our agent.
 - [ ] Get token counting data from the model
 Add scores to result files based on pricing associated with tokens and models used
 - [ ] Think about how this can be applied to other projects besides AutoGPT so we can be THE agent evaluation framework.
-- [ ] Figure our how the OpenAI Evals results are saved...
+- [ ] Copy the OpenAI Eval files from the tmp file they are saved to somewhere we can track the results
 - [ ] Support multi-threaded evals. OpenAI has great support for this. The docker system built here doesn't.
-- [ ] Make the file logger/duplicate op checker more robust. It's not great right now.
 
 ## Understanding OpenAI Evals
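
The TODO item added by this patch describes copying OpenAI Eval result files out of the tmp location they are written to and into a directory the repo can track. Below is a minimal sketch of that step, assuming the run logs land as `.jsonl` files under `/tmp/evallogs` and that a `results/` directory in this repo is the destination; both paths and the helper name are assumptions for illustration, not part of the patch.

```python
import shutil
from pathlib import Path

# Assumed locations (not confirmed by the patch): OpenAI Evals writes run logs
# as .jsonl files under a tmp directory; we copy them into a tracked results/
# directory so they survive container teardown and can be versioned.
TMP_EVAL_LOGS = Path("/tmp/evallogs")   # assumed default log location
TRACKED_RESULTS = Path("results")       # assumed destination in this repo


def copy_eval_results() -> list[Path]:
    """Copy eval result files from the tmp directory into results/."""
    TRACKED_RESULTS.mkdir(parents=True, exist_ok=True)
    copied = []
    for log_file in TMP_EVAL_LOGS.glob("*.jsonl"):
        destination = TRACKED_RESULTS / log_file.name
        shutil.copy2(log_file, destination)  # keep timestamps for later scoring
        copied.append(destination)
    return copied


if __name__ == "__main__":
    for path in copy_eval_results():
        print(f"copied {path}")
```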