How to Use AutoGen Memory: Managing State in Microsoft's Agent Framework
AutoGen memory lets agents remember conversation history and learned facts across runs, creating "teachable" agents. By default, Large Language Models (LLMs) are stateless. They forget everything once a session ends. AutoGen solves this with the `TeachableAgent` class and vector database integration, so agents can recall user preferences, past decisions, and specific instructions indefinitely.
What Is AutoGen Memory?
AutoGen memory lets agents retain information beyond a single interaction. Standard agents rely on the limited context window of an LLM. AutoGen adds persistent storage layers that act as long-term memory. The framework handles two types of memory:
- Short-Term Memory: The immediate conversation history (chat context) maintained in the `ListMemory` buffer.
- Long-Term Memory: Persistent facts and user learnings stored in a vector database via the `TeachableAgent` capability.

Microsoft Research's `TeachableAgent` class lets users teach agents new skills and facts that persist across chat sessions, connecting temporary context with permanent knowledge. This matters when building personalized AI assistants that grow with the user. Without persistent memory, every session starts from scratch. Users have to repeat basic instructions and preferences, which makes complex applications frustrating to use.
How to Set Up a Teachable Agent
To give an AutoGen agent persistent memory, you need to use the `TeachableAgent` class. This wrapper adds a memory store (typically a vector database like ChromaDB) to the standard `ConversableAgent`.
1. Install the required dependencies
First, ensure you have the teachable features installed:
pip install "pyautogen[teachable]"
2. Initialize the Teachable Agent
Here is a basic implementation that creates an agent capable of remembering user instructions:
from autogen import UserProxyAgent, config_list_from_json
from autogen.agentchat.contrib.teachable_agent import TeachableAgent

# Load your LLM config
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")

# Create the teachable agent
memory_agent = TeachableAgent(
    name="memory_assistant",
    llm_config={"config_list": config_list},
    teach_config={
        "reset_db": False  # Keep memory between runs (uses defaults for others)
    },
)

# Create a user proxy to interact with it
user = UserProxyAgent("user", human_input_mode="ALWAYS")

# Start chatting
user.initiate_chat(memory_agent, message="My favorite color is blue. Please remember that.")
When you run this script again and ask "What is my favorite color?", the agent will query its local vector database, retrieve the relevant fact, and answer correctly.
Managing State with Vector Databases
AutoGen's memory uses vector databases to store text embeddings. By default, it runs a local instance of ChromaDB. This works for local development but causes problems for production deployments or team collaboration.
The Challenge with Local Memory
When you run an AutoGen agent locally, its "brain" (the .chroma folder) lives on your hard drive. Move your code to a server or share it with a colleague, and the agent loses its memory. This makes it hard to build persistent, multi-user agent systems. Local storage also lacks the redundancy and accessibility you need for production applications. If the local machine fails or someone deletes the directory, all the gathered intelligence and historical context are gone. That's a real data risk.
Using Remote Storage for Agent State
To solve this, developers separate the compute (the agent script) from the storage (the memory). You can configure AutoGen to use a remote vector store or sync the local database directory to cloud storage. For file-based memory (like JSON logs or chat histories), a cloud storage solution like Fast.io lets multiple agents read and write to the same shared directory. This gives your agents a shared "hive mind" where one agent can learn a fact and write it to a file that others can access.
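One low-effort version of this separation is to point the agent's database directory at a shared mounted volume instead of the default local folder. Here is a minimal sketch, assuming `path_to_db_dir` as the `teach_config` key the contrib `TeachableAgent` uses for its ChromaDB location; the `/mnt/agent_memory` mount path and the `AGENT_MEMORY_DIR` environment variable are hypothetical examples:

```python
import os

# Hypothetical shared mount (e.g., an NFS volume or a directory
# synced to cloud storage) where the agent's ChromaDB files live
SHARED_DB_DIR = os.environ.get("AGENT_MEMORY_DIR", "/mnt/agent_memory/chroma")

# teach_config pointing the TeachableAgent at the shared directory
# instead of its default local folder
teach_config = {
    "reset_db": False,              # never wipe shared memory on startup
    "path_to_db_dir": SHARED_DB_DIR,
}
```

Any machine that mounts the same directory then sees the same memory, though concurrent writers still need coordination at the storage layer.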
Give Your AutoGen Agents a Cloud Brain
Stop relying on local files. Connect your AutoGen agents to Fast.io's MCP server for 50GB of free, persistent cloud storage.
Sharing Memory Between Agents
In complex workflows, you often need multiple agents to share context. For example, a "Research Agent" might gather data that a "Writing Agent" needs to access later.
Method 1: Shared Context Injection

You can explicitly pass the conversation history from one chat to another:
# Agent A chats with User
chat_result = user.initiate_chat(agent_a, message="Find info on X")

# Pass summary to Agent B
user.initiate_chat(agent_b, message=f"Here is what Agent A found: {chat_result.summary}")
Method 2: Shared Cloud Storage (Fast.io MCP)

For a solution that scales, agents can use the Model Context Protocol (MCP) to interact with a shared file system. Connect your AutoGen agents to Fast.io's MCP server, and they can:
- Read/Write Shared Memory: Store long-term facts in structured JSON files in the cloud.
- Access Project Context: Retrieve documentation or guidelines uploaded by humans.
- Persist State: Save their internal state to a location that survives restarts.

Fast.io offers a free agent tier with 50GB of storage and a dedicated MCP server. This gives you a backend for agent memory without running your own database server.
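As a sketch of the shared-file pattern, two agents can append facts to one JSON file in a shared directory. The file path, schema, and the example fact below are illustrative assumptions, not a Fast.io API:

```python
import json
import os
from datetime import datetime, timezone


def read_facts(memory_path: str) -> list:
    """Load all shared facts; empty list if no agent has written yet."""
    if not os.path.exists(memory_path):
        return []
    with open(memory_path) as f:
        return json.load(f)


def write_fact(memory_path: str, agent: str, fact: str) -> None:
    """Append one fact to the shared JSON memory file."""
    facts = read_facts(memory_path)
    facts.append({
        "agent": agent,
        "fact": fact,
        "written_at": datetime.now(timezone.utc).isoformat(),
    })
    with open(memory_path, "w") as f:
        json.dump(facts, f, indent=2)


# The research agent writes; the writing agent later reads the same file
write_fact("shared_memory.json", "research_agent", "X was founded in 2010.")
facts = read_facts("shared_memory.json")
```

With the file on a shared mount or synced cloud directory, this becomes the "hive mind" described above; production setups would add file locking or move to a real database to handle concurrent writes.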
Best Practices for RAG Memory
Retrieval-Augmented Generation (RAG) powers AutoGen's memory. To make sure your agents recall the right information, follow these strategies:
- Chunking Strategy: Break long documents into smaller segments (300-500 tokens). This keeps the retrieved context precise and prevents overflow of the LLM's context window.
- Metadata Filtering: Tag memories with metadata (e.g., `user_id`, `topic`, `date`). This lets the agent filter searches, so it doesn't confuse User A's preferences with User B's.
- Hybrid Search: Combine keyword search (BM25) with vector search. Vector search works well for concepts, but keyword search is better for specific identifiers like part numbers or names.
- Regular Pruning: Agent memory gets cluttered, just like human memory. Set up a process to archive or delete outdated memories. This keeps retrieval accurate and costs down.
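The chunking advice above can be sketched with a simple word-based splitter. Words are a rough stand-in for tokens here (a real pipeline would measure chunks with an actual tokenizer), and the overlap parameter is an assumption, added so retrieval doesn't cut facts in half at chunk boundaries:

```python
def chunk_text(text: str, max_words: int = 300, overlap: int = 30) -> list[str]:
    """Split text into overlapping word-window chunks.

    max_words approximates the 300-500 token target; overlap carries
    a little context across boundaries so no fact is split cleanly in two.
    """
    words = text.split()
    if len(words) <= max_words:
        return [text]
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks


# A long document becomes several bounded, overlapping chunks
doc = ("lorem " * 1000).strip()
chunks = chunk_text(doc, max_words=300, overlap=30)
```

Each chunk is then embedded and stored individually, so a retrieval hit pulls in only the relevant segment rather than the whole document.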
Frequently Asked Questions
Does AutoGen have long-term memory?
Yes, AutoGen supports long-term memory through the `TeachableAgent` class. It uses a vector database (like ChromaDB) to store text embeddings of user interactions, allowing the agent to recall facts and instructions across different sessions.
How do I save AutoGen state?
You can save AutoGen state by configuring a persistent directory for the vector database in the `teach_config` dictionary. You can also serialize the agent's conversation history to a JSON file and reload it when initializing the agent in a new session.
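For the conversation-history half, here is a hedged sketch of serializing and restoring a message list. The file name is arbitrary, and the flat role/content list is a simplification (AutoGen agents actually keep `chat_messages` as a dict keyed by the peer agent):

```python
import json


def save_history(messages: list[dict], path: str = "chat_history.json") -> None:
    """Persist a list of role/content message dicts to disk."""
    with open(path, "w") as f:
        json.dump(messages, f, indent=2)


def load_history(path: str = "chat_history.json") -> list[dict]:
    """Reload saved messages, e.g., to seed a new session's context."""
    with open(path) as f:
        return json.load(f)


history = [
    {"role": "user", "content": "My favorite color is blue."},
    {"role": "assistant", "content": "Noted! I'll remember that."},
]
save_history(history)
restored = load_history()
```

On the next run, the restored list can be passed back into the chat as initial context so the agent resumes where it left off.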
Can multiple AutoGen agents share the same memory?
Yes, but it requires configuration. Agents can share a read-only vector store, or use a shared cloud file system (via Fast.io MCP) to read and write common memory files. For real-time shared memory, a centralized vector database instance (like Pinecone or Weaviate) is recommended over local files.