enable milvus as memory backend

chyezh
2023-04-11 19:36:41 +08:00
parent 1073954fb7
commit 395d9d0481
6 changed files with 145 additions and 9 deletions


@@ -77,6 +77,13 @@ REDIS_PASSWORD=
WIPE_REDIS_ON_START=False WIPE_REDIS_ON_START=False
MEMORY_INDEX=auto-gpt MEMORY_INDEX=auto-gpt
### MILVUS
# MILVUS_ADDR - Milvus remote address (e.g. localhost:19530)
# MILVUS_COLLECTION - Milvus collection,
# change it if you want to start a new memory and retain the old memory.
MILVUS_ADDR=your-milvus-cluster-host-port
MILVUS_COLLECTION=autogpt
################################################################################ ################################################################################
### IMAGE GENERATION PROVIDER ### IMAGE GENERATION PROVIDER
################################################################################ ################################################################################


@@ -35,22 +35,27 @@ Your support is greatly appreciated
## Table of Contents ## Table of Contents
- [Auto-GPT: An Autonomous GPT-4 Experiment](#auto-gpt-an-autonomous-gpt-4-experiment) - [Auto-GPT: An Autonomous GPT-4 Experiment](#auto-gpt-an-autonomous-gpt-4-experiment)
- [Demo (30/03/2023):](#demo-30032023) - [🔴 🔴 🔴 Urgent: USE `stable` not `master` 🔴 🔴 🔴](#----urgent-use-stable-not-master----)
- [Demo (30/03/2023):](#demo-30032023)
- [Table of Contents](#table-of-contents) - [Table of Contents](#table-of-contents)
- [🚀 Features](#-features) - [🚀 Features](#-features)
- [📋 Requirements](#-requirements) - [📋 Requirements](#-requirements)
- [💾 Installation](#-installation) - [💾 Installation](#-installation)
- [🔧 Usage](#-usage) - [🔧 Usage](#-usage)
- [Logs](#logs) - [Logs](#logs)
- [Docker](#docker)
- [Command Line Arguments](#command-line-arguments)
- [🗣️ Speech Mode](#-speech-mode) - [🗣️ Speech Mode](#-speech-mode)
- [🔍 Google API Keys Configuration](#-google-api-keys-configuration) - [🔍 Google API Keys Configuration](#-google-api-keys-configuration)
- [Setting up environment variables](#setting-up-environment-variables) - [Setting up environment variables](#setting-up-environment-variables)
- [Redis Setup](#redis-setup) - [Memory Backend Setup](#memory-backend-setup)
- [🌲 Pinecone API Key Setup](#-pinecone-api-key-setup) - [Redis Setup](#redis-setup)
- [🌲 Pinecone API Key Setup](#-pinecone-api-key-setup)
- [Milvus Setup](#milvus-setup)
- [Setting up environment variables](#setting-up-environment-variables-1) - [Setting up environment variables](#setting-up-environment-variables-1)
- [Setting Your Cache Type](#setting-your-cache-type) - [Setting Your Cache Type](#setting-your-cache-type)
- [View Memory Usage](#view-memory-usage) - [View Memory Usage](#view-memory-usage)
- [🧠 Memory pre-seeding](#memory-pre-seeding) - [🧠 Memory pre-seeding](#-memory-pre-seeding)
- [💀 Continuous Mode ⚠️](#-continuous-mode-) - [💀 Continuous Mode ⚠️](#-continuous-mode-)
- [GPT3.5 ONLY Mode](#gpt35-only-mode) - [GPT3.5 ONLY Mode](#gpt35-only-mode)
- [🖼 Image Generation](#-image-generation) - [🖼 Image Generation](#-image-generation)
@@ -75,10 +80,11 @@ Your support is greatly appreciated
- [Python 3.8 or later](https://www.tutorialspoint.com/how-to-install-python-in-windows) - [Python 3.8 or later](https://www.tutorialspoint.com/how-to-install-python-in-windows)
- [OpenAI API key](https://platform.openai.com/account/api-keys) - [OpenAI API key](https://platform.openai.com/account/api-keys)
Optional: Optional:
- [PINECONE API key](https://www.pinecone.io/) (If you want Pinecone backed memory) - Memory backend
- [PINECONE API key](https://www.pinecone.io/) (If you want Pinecone backed memory)
- [Milvus](https://milvus.io/) (If you want Milvus as memory backend)
- ElevenLabs Key (If you want the AI to speak) - ElevenLabs Key (If you want the AI to speak)
## 💾 Installation ## 💾 Installation
@@ -209,7 +215,11 @@ export CUSTOM_SEARCH_ENGINE_ID="YOUR_CUSTOM_SEARCH_ENGINE_ID"
``` ```
## Redis Setup ## Memory Backend Setup
Set up any one of the following backends to persist memory.
### Redis Setup
Install docker desktop. Install docker desktop.
@@ -246,7 +256,7 @@ You can specify the memory index for redis using the following:
MEMORY_INDEX=whatever MEMORY_INDEX=whatever
``` ```
## 🌲 Pinecone API Key Setup ### 🌲 Pinecone API Key Setup
Pinecone enables the storage of vast amounts of vector-based memory, allowing for only relevant memories to be loaded for the agent at any given time. Pinecone enables the storage of vast amounts of vector-based memory, allowing for only relevant memories to be loaded for the agent at any given time.
@@ -254,6 +264,18 @@ Pinecone enables the storage of vast amounts of vector-based memory, allowing fo
2. Choose the `Starter` plan to avoid being charged. 2. Choose the `Starter` plan to avoid being charged.
3. Find your API key and region under the default project in the left sidebar. 3. Find your API key and region under the default project in the left sidebar.
### Milvus Setup
[Milvus](https://milvus.io/) is an open-source, highly scalable vector database that stores large amounts of vector-based memory and provides fast relevance search.
- Set up a Milvus database, and keep your pymilvus version and Milvus version the same to avoid compatibility issues.
  - Install the open-source version: [Install Milvus](https://milvus.io/docs/install_standalone-operator.md)
  - Or use the hosted [Zilliz Cloud](https://zilliz.com/cloud)
- Set `MILVUS_ADDR` in `.env` to your Milvus address in `host:port` form (e.g. `localhost:19530`).
- Set `MEMORY_BACKEND` in `.env` to `milvus` to enable Milvus as the memory backend.
- Optional:
  - Set `MILVUS_COLLECTION` in `.env` to change the Milvus collection name; `autogpt` is the default.
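
The settings above can be sanity-checked before starting Auto-GPT. The following is a minimal sketch, not part of this commit: `read_milvus_settings` is a hypothetical helper that reads the same two variables with the same defaults as `Config` and rejects an address that is not in `host:port` form (for example the placeholder value left in the `.env` template).

```python
import os


def read_milvus_settings(env=os.environ):
    """Hypothetical helper: return (host, port, collection) from the environment."""
    # Same defaults as the Config class in this commit.
    addr = env.get("MILVUS_ADDR", "localhost:19530")
    collection = env.get("MILVUS_COLLECTION", "autogpt")

    # Split on the last colon so the host part may itself contain dots or dashes.
    host, sep, port = addr.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"MILVUS_ADDR must look like host:port, got {addr!r}")
    return host, int(port), collection
```

Calling `read_milvus_settings()` with no arguments inspects the real process environment; passing a plain dict makes the check easy to try without touching `.env`.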
### Setting up environment variables ### Setting up environment variables
In the `.env` file set: In the `.env` file set:


@@ -62,6 +62,10 @@ class Config(metaclass=Singleton):
self.pinecone_api_key = os.getenv("PINECONE_API_KEY") self.pinecone_api_key = os.getenv("PINECONE_API_KEY")
self.pinecone_region = os.getenv("PINECONE_ENV") self.pinecone_region = os.getenv("PINECONE_ENV")
# milvus configuration, e.g., localhost:19530.
self.milvus_addr = os.getenv("MILVUS_ADDR", "localhost:19530")
self.milvus_collection = os.getenv("MILVUS_COLLECTION", "autogpt")
self.image_provider = os.getenv("IMAGE_PROVIDER") self.image_provider = os.getenv("IMAGE_PROVIDER")
self.huggingface_api_token = os.getenv("HUGGINGFACE_API_TOKEN") self.huggingface_api_token = os.getenv("HUGGINGFACE_API_TOKEN")


@@ -21,6 +21,12 @@ except ImportError:
print("Pinecone not installed. Skipping import.") print("Pinecone not installed. Skipping import.")
PineconeMemory = None PineconeMemory = None
try:
from memory.milvus import MilvusMemory
except ImportError:
print("pymilvus not installed. Skipping import.")
MilvusMemory = None
def get_memory(cfg, init=False): def get_memory(cfg, init=False):
memory = None memory = None
@@ -44,6 +50,12 @@ def get_memory(cfg, init=False):
memory = RedisMemory(cfg) memory = RedisMemory(cfg)
elif cfg.memory_backend == "no_memory": elif cfg.memory_backend == "no_memory":
memory = NoMemory(cfg) memory = NoMemory(cfg)
elif cfg.memory_backend == "milvus":
if not MilvusMemory:
print("Error: Milvus sdk is not installed. "
"Please install pymilvus to use Milvus as memory backend.")
else:
memory = MilvusMemory(cfg)
if memory is None: if memory is None:
memory = LocalCache(cfg) memory = LocalCache(cfg)
@@ -56,4 +68,4 @@ def get_supported_memory_backends():
return supported_memory return supported_memory
__all__ = ["get_memory", "LocalCache", "RedisMemory", "PineconeMemory", "NoMemory"] __all__ = ["get_memory", "LocalCache", "RedisMemory", "PineconeMemory", "NoMemory", "MilvusMemory"]
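
The hunks above follow an optional-import pattern: a backend whose SDK is missing is bound to `None`, and `get_memory` falls back to `LocalCache` when the requested backend is unavailable. A self-contained sketch of that pattern is shown below; `load_optional_backend` and `choose_backend` are hypothetical names used only for illustration and do not exist in Auto-GPT.

```python
import importlib


def load_optional_backend(module_name, class_name):
    """Return the named class, or None when its module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        print(f"{module_name} not installed. Skipping import.")
        return None
    return getattr(module, class_name, None)


def choose_backend(requested, available, fallback):
    """Return the requested backend class if it was importable, else the fallback."""
    backend = available.get(requested)
    if backend is None:
        print(f"Error: backend {requested!r} is unavailable; "
              f"falling back to {fallback.__name__}.")
        return fallback
    return backend
```

Keeping the import check separate from the selection step means a missing SDK only costs a warning at import time, and the agent still starts with the local cache.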


@@ -12,6 +12,7 @@ docker
duckduckgo-search duckduckgo-search
google-api-python-client #(https://developers.google.com/custom-search/v1/overview) google-api-python-client #(https://developers.google.com/custom-search/v1/overview)
pinecone-client==2.2.1 pinecone-client==2.2.1
pymilvus==2.2.4
redis redis
orjson orjson
Pillow Pillow

scripts/memory/milvus.py

@@ -0,0 +1,90 @@
from pymilvus import (
    connections,
    FieldSchema,
    CollectionSchema,
    DataType,
    Collection,
)

from memory.base import MemoryProviderSingleton, get_ada_embedding


class MilvusMemory(MemoryProviderSingleton):
    def __init__(self, cfg):
        """ Construct a milvus memory storage connection.

        Args:
            cfg (Config): Auto-GPT global config.
        """
        # connect to milvus server.
        connections.connect(address=cfg.milvus_addr)
        fields = [
            FieldSchema(name="pk", dtype=DataType.INT64,
                        is_primary=True, auto_id=True),
            FieldSchema(name="embeddings",
                        dtype=DataType.FLOAT_VECTOR, dim=1536),
            FieldSchema(name="raw_text", dtype=DataType.VARCHAR,
                        max_length=65535)
        ]

        # create collection if not exist and load it.
        schema = CollectionSchema(fields, "auto-gpt memory storage")
        self.collection = Collection(cfg.milvus_collection, schema)
        # create index if not exist.
        if not self.collection.has_index(index_name="embeddings"):
            self.collection.release()
            self.collection.create_index("embeddings", {
                "index_type": "IVF_FLAT",
                "metric_type": "IP",
                "params": {"nlist": 128},
            }, index_name="embeddings")
        self.collection.load()

    def add(self, data):
        """ Add an embedding of data into memory.

        Args:
            data (str): The raw text to construct embedding index.

        Returns:
            str: log.
        """
        embedding = get_ada_embedding(data)
        # column order follows the schema; "pk" is auto-generated.
        result = self.collection.insert([[embedding], [data]])
        _text = f"Inserting data into memory at primary key: {result.primary_keys[0]}:\n data: {data}"
        return _text

    def get(self, data):
        """ Return the most relevant data in memory.

        Args:
            data: The data to compare to.
        """
        return self.get_relevant(data, 1)

    def clear(self):
        """ Drop the collection that backs this memory.
        """
        # the collection is not recreated here.
        self.collection.drop()
        return "Obliviated"

    def get_relevant(self, data, num_relevant=5):
        """ Return the top-k relevant data in memory.

        Args:
            data: The data to compare to.
            num_relevant (int, optional): The max number of relevant data. Defaults to 5.
        """
        # search the embedding and return the most relevant text.
        embedding = get_ada_embedding(data)
        search_params = {
            # the key is "metric_type"; it must match the metric of the index.
            "metric_type": "IP",
            "params": {"nprobe": 8},
        }
        result = self.collection.search(
            [embedding], "embeddings", search_params, num_relevant, output_fields=["raw_text"])
        return [item.entity.value_of_field("raw_text") for item in result[0]]

    def get_stats(self):
        """
        Returns: The stats of the milvus cache.
        """
        return f"Entities num: {self.collection.num_entities}"
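
For readers without a Milvus server at hand, the ranking that `get_relevant` asks Milvus to perform can be illustrated in plain Python. This is only a sketch under stated assumptions: `top_k_by_inner_product` is a hypothetical function, it scans every stored vector exactly, whereas the `IVF_FLAT` index with `nprobe` 8 used in this commit performs an approximate search, and the two-dimensional vectors in the usage note stand in for the 1536-dimensional embeddings.

```python
def top_k_by_inner_product(query, memories, k=5):
    """Return the texts of the k memories whose embeddings have the largest
    inner product with the query embedding.

    memories is a list of (embedding, raw_text) pairs.
    """
    def inner_product(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Larger inner product means more relevant, so sort in descending order.
    ranked = sorted(memories,
                    key=lambda pair: inner_product(query, pair[0]),
                    reverse=True)
    return [text for _, text in ranked[:k]]
```

With `memories = [([1.0, 0.0], "a"), ([0.0, 1.0], "b"), ([0.7, 0.7], "c")]` and the query `[1.0, 0.0]`, the inner products are 1.0, 0.0 and 0.7, so asking for two results returns `["a", "c"]`.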