## 🧠 Memory pre-seeding
Memory pre-seeding allows you to ingest files into memory and pre-seed it before running Auto-GPT.
```bash
# python data_ingestion.py -h
options:
  -h, --help               show this help message and exit
  --file FILE              The file to ingest
  --dir DIR                The directory containing the files to ingest
  --init                   Init the memory and wipe its content (default: False)
  --overlap OVERLAP        The overlap size between chunks when ingesting files (default: 200)
  --max_length MAX_LENGTH  The max_length of each chunk when ingesting files (default: 4000)

# python data_ingestion.py --dir DataFolder --init --overlap 100 --max_length 2000
```
In the example above, the script initializes the memory and ingests all files within the `Auto-Gpt/autogpt/auto_gpt_workspace/DataFolder` directory into memory, with an overlap of 100 between chunks and a maximum chunk length of 2000.
Note that you can also use the `--file` argument to ingest a single file into memory, and that `data_ingestion.py` will only ingest files within the `/auto_gpt_workspace` directory.

The DIR path is relative to the `auto_gpt_workspace` directory, so `python data_ingestion.py --dir . --init` will ingest everything in the `auto_gpt_workspace` directory.

Memory pre-seeding is a technique that involves ingesting relevant documents or data into the AI's memory so that it can use this information to generate more informed and accurate responses.
To pre-seed the memory, the content of each document is split into chunks of a specified maximum length with a specified overlap between chunks, and then each chunk is added to the memory backend set in the .env file. When the AI is prompted to recall information, it can then access those pre-seeded memories to generate more informed and accurate responses.
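That splitting step might look something like the following minimal sketch, assuming simple character-based chunks (`chunk_text` is a hypothetical helper for illustration, not the actual code in `data_ingestion.py`):

```python
# Hypothetical sketch of chunking with overlap; not the actual
# implementation in data_ingestion.py.
def chunk_text(text: str, max_length: int = 4000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most max_length characters,
    with consecutive chunks sharing `overlap` characters of context."""
    chunks = []
    start = 0
    while True:
        end = start + max_length
        chunks.append(text[start:end])
        if end >= len(text):
            break  # the last chunk reached the end of the document
        start = end - overlap  # step back by `overlap` to share context
    return chunks

# A 10,000-character document with max_length=2000 and overlap=100
# yields 6 chunks.
print(len(chunk_text("a" * 10_000, max_length=2000, overlap=100)))  # 6
```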
This technique is particularly useful when working with large amounts of data or when there is specific information that the AI needs to be able to access quickly.
By pre-seeding the memory, the AI can retrieve and use this information more efficiently, saving time and API calls and improving the accuracy of its responses.
You could, for example, download the documentation of an API, a GitHub repository, etc., and ingest it into memory before running Auto-GPT.
⚠️ If you use Redis as your memory, make sure to run Auto-GPT with `WIPE_REDIS_ON_START` set to `False` in your `.env` file.
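For example, the relevant `.env` entries might look like this minimal sketch (assuming Redis is already set up as your memory backend):

```
MEMORY_BACKEND=redis
WIPE_REDIS_ON_START=False
```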
Memories will be available to the AI immediately as they are ingested, even if ingested while Auto-GPT is running.
You can adjust the `max_length` and `overlap` parameters to fine-tune the way documents are presented to the AI when it "recalls" that memory:
- Adjusting the overlap value allows the AI to access more contextual information from each chunk when recalling information, but will result in more chunks being created, and will therefore increase memory backend usage and OpenAI API requests; the short calculation below illustrates this trade-off.
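For a rough sense of this trade-off, you can estimate how many chunks a document will produce. This assumes the character-based chunking sketched earlier; `num_chunks` is a hypothetical estimate, and the real script may count differently:

```python
import math

# Estimate how many chunks a document of n_chars characters produces
# under the character-based chunking sketched above (an approximation).
def num_chunks(n_chars: int, max_length: int, overlap: int) -> int:
    step = max_length - overlap  # each new chunk advances by this many characters
    return max(1, math.ceil((n_chars - overlap) / step))

print(num_chunks(10_000, 2000, 100))  # 6 chunks
print(num_chunks(10_000, 2000, 500))  # 7 chunks: a larger overlap creates more chunks
```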