How to Set Up a RAG Knowledge Base in OpenClaw
Connect your agents to your proprietary data with automated indexing and semantic retrieval. Enable Intelligence Mode on a Fastio workspace to let agents run RAG queries without managing external vector databases or complex ETL pipelines.
What Is an OpenClaw Knowledge Base?
An OpenClaw Knowledge Base is a Fastio workspace with Intelligence Mode enabled.
Traditional RAG (Retrieval-Augmented Generation) setups require you to chunk documents, embed them with a separate model, and store vectors in a database like Pinecone or Milvus. Fastio handles this entire pipeline natively.
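For context, the chunk, embed, and store steps of a traditional pipeline can be sketched in a few lines. This is a toy illustration only: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for a vector database, and none of it reflects how Fastio implements indexing internally.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, store: list[tuple[str, Counter]], k: int = 1) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

doc = ("Remote employees accrue vacation monthly. "
       "Expense reports are due on the fifth business day.")
store = [(c, embed(c)) for c in chunk(doc, size=5)]  # the "vector database"
top = retrieve("When are expense reports due?", store)
```

Every stage here (chunk size, embedding model, storage, ranking) is something you would otherwise have to choose, host, and keep in sync with your files.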
When you upload files, such as PDFs, spreadsheets, or code, Fastio automatically indexes them for semantic search. The system manages the embeddings and vector storage invisibly in the background.
Your agents then use the ClawHub skill to access this data. Instead of matching simple keywords, agents can ask natural language questions like "What are the compliance requirements for the Q3 project?" and receive accurate answers with source citations. This turns your storage from a passive hard drive into active memory for your AI workforce.
Developers don't need to manage infrastructure. You don't need to build ETL pipelines to sync your files with a vector database. If a file is in the workspace, it's indexed and ready for retrieval.
Why Use Fastio for Agent Memory?
Building a knowledge base for agents often involves connecting multiple services: object storage (S3), an embedding model (OpenAI), and a vector database. Fastio combines these into one layer.
Native Intelligence
Files are queued for indexing as soon as you upload them, and standard documents are usually searchable within seconds, so there is little lag between adding a document and your agent being able to read it. The search engine understands context, synonyms, and intent, allowing for better retrieval than keyword matching alone.
File Support
You aren't limited to text files. The system processes many formats, including:
- Documents: PDF, DOCX, TXT, MD, RTF
- Data: CSV, JSON, XLS, XLSX
- Code: PY, JS, TS, HTML, CSS
- Media: Video and audio transcripts (automatically generated), with HLS streaming that is 50-60% faster than standard progressive downloads.
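If you want to sort a local folder before uploading, the format list above can be turned into a small lookup. This is a sketch under assumptions: the list is described as non-exhaustive, so a `None` result only means the extension is not named in this guide, and media is left out because the guide names no specific media extensions.

```python
from pathlib import Path
from typing import Optional

# Extensions named in this guide; the real supported set may be larger.
LISTED_FORMATS = {
    "documents": {".pdf", ".docx", ".txt", ".md", ".rtf"},
    "data": {".csv", ".json", ".xls", ".xlsx"},
    "code": {".py", ".js", ".ts", ".html", ".css"},
}

def categorize(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if its extension is not listed."""
    ext = Path(filename).suffix.lower()
    for category, extensions in LISTED_FORMATS.items():
        if ext in extensions:
            return category
    return None
```

For example, `categorize("Q1-Report.PDF")` returns `"documents"`, while a file with an unlisted extension returns `None` and may deserve a manual check.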
Ownership and Handoff
Fastio lets you transfer ownership. An agent can create a workspace, build the knowledge base, organize the files, and then transfer the workspace to a human client or team member. The agent keeps admin access, but the human owner gets full control. This works well for agencies building "data rooms" or delivered projects.
Build Your Agent's Knowledge Base for Free
Get 50GB of storage and 5,000 monthly credits to power your OpenClaw agents. No credit card required.
Prerequisites for RAG Setup
Make sure you have the necessary components in place before you start. The setup is lightweight and requires no credit card.
- Fastio Agent Account: You need a dedicated agent account. Sign up for free at /storage-for-agents/. This account is separate from your personal user account and is optimized for API usage.
- OpenClaw Environment: Your agents should be running in an OpenClaw environment, whether that's on your local machine, a server, or a cloud container.
- The ClawHub Skill: This connects your agent to the Fastio cloud.
The free agent tier includes 50GB of storage and 5,000 monthly credits, which is enough for most development and pilot production use cases. It also includes 5 workspaces and 50 shares, giving you plenty of room to keep data separate for different projects or clients.
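When planning a project, it can help to compare your expected footprint against these limits up front. The sketch below uses only the four figures quoted above; the dictionary keys and the `over_limit` helper are illustrative names, not part of any Fastio API.

```python
# Free agent tier limits as quoted in this guide.
FREE_TIER = {"storage_gb": 50, "monthly_credits": 5000, "workspaces": 5, "shares": 50}

def over_limit(planned: dict[str, float]) -> list[str]:
    """Return the names of the limits that the planned usage would exceed."""
    return [name for name, cap in FREE_TIER.items() if planned.get(name, 0) > cap]

# Example: six client workspaces holding 12 GB in total.
exceeded = over_limit({"storage_gb": 12, "workspaces": 6})
```

Here `exceeded` names only the workspace limit, which suggests grouping two clients into one workspace with separate folders, or upgrading.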
Step 1: Create a Dedicated Workspace
First, create a place for your knowledge base. In Fastio, this is called a Workspace.
- Log in to your Fastio dashboard using your agent credentials.
- Click the "New Workspace" button in the sidebar.
- Give your workspace a descriptive name. Good examples include company-knowledge-base, project-alpha-docs, or legal-reference-library.
- Go to the workspace settings tab.
- Toggle "Intelligence Mode" to ON.
Important: Turning Intelligence Mode on triggers the indexing pipeline. Without this, the workspace acts as normal cloud storage. Once enabled, any existing files will be queued for indexing, and new uploads will be processed in real time.
Step 2: Install the ClawHub Skill
Your OpenClaw agent needs to communicate with Fastio. The ClawHub skill provides this through a set of specialized tools.
In your agent's terminal or configuration interface, run the installation command:
clawhub install dbalve/fast-io
The installation process will trigger an OAuth flow. Follow the prompts to authenticate your agent with your Fastio account.
Once installed, your agent gets access to 14 specialized tools via ClawHub (part of the 251 MCP tools available on the platform). These tools allow the agent to:
- list_files: See what's in the workspace.
- search_files: Perform semantic RAG queries.
- read_file: Read the full content of specific documents.
- upload_file: Add new knowledge.
- create_folder: Organize data.
Check the installation by asking your agent a simple status question, such as "List the files in my new workspace."
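Under the hood, MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below builds such a request for `search_files`. The `query` argument name is an assumption made for illustration; check the tool's published input schema for the actual parameter names, and note that your OpenClaw runtime normally constructs these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids

def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# "query" is a hypothetical argument name, not confirmed by this guide.
request = tool_call("search_files", {"query": "Q3 compliance requirements"})
```

Seeing the raw message shape is mostly useful when debugging: if a tool call fails, logging the serialized request makes it easy to spot a wrong tool name or a missing argument.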
Step 3: Add Your Data
Now you need to add data to your knowledge base. You have several ways to get data into the system, depending on where your files are.
Direct Upload
You can upload files directly through the Fastio web interface. This is often the easiest way to bulk-upload a folder of PDFs or existing documentation.
Agent Upload
Your agent can upload files itself using the upload_file tool. This is useful if the agent is generating reports or code that need to be saved to long-term memory.
URL Import (Recommended for Cloud Migration)
If your data is in Google Drive, Dropbox, Box, or OneDrive, you don't need to download it to your local machine first. Use the URL Import feature to pull files directly cloud-to-cloud. This is faster and saves your local bandwidth.
File Limits
The system is built for large files. You can upload individual files up to 1GB in size. For very large datasets, split the files across folders so the workspace stays easy to navigate.
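A quick local check before a bulk upload can catch files that exceed the per-file limit. This is a minimal sketch that assumes "1GB" means 10^9 bytes; if the platform counts in binary units (2^30 bytes), adjust `MAX_FILE_BYTES` accordingly.

```python
import os

# Assumption: the 1GB limit is interpreted as 10**9 bytes.
MAX_FILE_BYTES = 1_000_000_000

def fits_limit(size_bytes: int, limit: int = MAX_FILE_BYTES) -> bool:
    """Return True if a file of this size is within the per-file limit."""
    return size_bytes <= limit

def oversized(paths: list[str]) -> list[str]:
    """Return the local paths whose on-disk size exceeds the limit."""
    return [p for p in paths if not fits_limit(os.path.getsize(p))]
```

Running `oversized(...)` over a folder listing gives you the files to split or compress before you start the upload.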
Step 4: Test and Optimize Retrieval
With data indexed, you can test the RAG capabilities. The goal is to make sure your agent can retrieve specific facts accurately.
Testing Queries
Try asking natural language questions that require combining information from multiple documents:
- "Based on the Q1-Report.pdf and Q2-Forecast.xlsx, is the project on track?"
- "Summarize the vacation policy for remote employees."
- "Find the error code ERR-505 in the technical logs and explain the fix."
Verifying Citations
When the agent answers, it should provide citations. Fastio's search_files tool returns the specific text chunks along with the source file name. Verify that the agent is correctly attributing information to the right document.
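Citation checks can be partly automated. The sketch below assumes a hypothetical result shape (a list of dicts with `file` and `text` keys) and a list of `(file, quote)` pairs extracted from the agent's answer; the real `search_files` response format may differ, so adapt the field names to what you observe.

```python
def unsupported_citations(
    citations: list[tuple[str, str]],
    chunks: list[dict[str, str]],
) -> list[tuple[str, str]]:
    """Return citations whose quote is not found in any chunk from the cited file."""
    bad = []
    for file_name, quote in citations:
        supported = any(
            c["file"] == file_name and quote.lower() in c["text"].lower()
            for c in chunks
        )
        if not supported:
            bad.append((file_name, quote))
    return bad

# Hypothetical retrieved chunks and the citations the agent gave.
chunks = [{"file": "Q1-Report.pdf", "text": "Milestone 2 shipped on schedule."}]
citations = [
    ("Q1-Report.pdf", "shipped on schedule"),
    ("Q2-Forecast.xlsx", "budget is flat"),
]
problems = unsupported_citations(citations, chunks)
```

An empty `problems` list does not prove the answer is right, but a non-empty one is a clear signal that the agent attributed a claim to a source that was never retrieved.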
Optimizing File Names
Semantic search is powerful, but file names still matter. A file named Policy_2025_Final_v2.pdf helps the model understand context better than doc_scan_001.pdf. Clear, descriptive filenames help the agent confirm it has found the correct source before it even reads the content.
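One way to find files worth renaming is a simple heuristic that flags names made up only of generic words and digits. The token list below is an illustrative assumption, not a Fastio rule; tune it to the naming habits in your own archive.

```python
import re
from pathlib import Path

# Illustrative set of words that carry no information about a file's content.
GENERIC_TOKENS = {"doc", "document", "scan", "file", "img", "image", "untitled", "new", "copy"}

def is_generic(filename: str) -> bool:
    """Return True if the name contains only generic words and non-letter characters."""
    stem = Path(filename).stem.lower()
    tokens = [t for t in re.split(r"[^a-z]+", stem) if t]
    return all(t in GENERIC_TOKENS for t in tokens)
```

With this heuristic, `doc_scan_001.pdf` is flagged for renaming, while `Policy_2025_Final_v2.pdf` passes because "policy" is a descriptive word.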
Using Folders for Scoped Search
As your knowledge base grows, you may want to limit searches to specific topics. Organize your files into folders like /financials, /legal, and /technical.
Agents can then scope their searches to just one folder, reducing noise and improving accuracy. For example, "Search the /legal folder for liability clauses."
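If you need to enforce a folder scope on your side, for instance when post-processing results, a path-aware filter is safer than a plain string prefix check. The sketch assumes each result carries a `path` field, which is a hypothetical shape; whether `search_files` accepts a folder parameter directly is not stated in this guide.

```python
from pathlib import PurePosixPath

def in_folder(path: str, folder: str) -> bool:
    """Return True if `path` lies anywhere beneath `folder`."""
    return PurePosixPath(folder) in PurePosixPath(path).parents

def scope(results: list[dict[str, str]], folder: str) -> list[dict[str, str]]:
    """Keep only the results located under the given folder."""
    return [r for r in results if in_folder(r["path"], folder)]

results = [
    {"path": "/legal/contracts/msa.pdf"},
    {"path": "/legal-archive/old-terms.pdf"},
    {"path": "/financials/q1.xlsx"},
]
legal_only = scope(results, "/legal")
```

Comparing parent directories rather than string prefixes keeps `/legal-archive` out of a `/legal` search, a mistake that `startswith("/legal")` would make.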
Advanced Configuration & Best Practices
To maintain a high-quality knowledge base over time, consider these advanced strategies.
Real-Time Updates via Webhooks
Static knowledge bases go stale. Use Fastio Webhooks to notify your agent whenever a file is updated or a new file is uploaded. This allows your agent to re-read changed documents or alert the team to new information.
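On the receiving side, the handler only needs to decide which files to re-read. The payload shape and event names below (`event`, `path`, `file.updated`, `file.created`) are hypothetical placeholders; consult the webhook documentation for the actual schema before relying on them.

```python
import json

# Hypothetical event names; replace with the ones your webhooks actually send.
REFRESH_EVENTS = {"file.updated", "file.created"}

def paths_to_refresh(raw_body: bytes) -> list[str]:
    """Return the file paths the agent should re-read for this webhook delivery."""
    event = json.loads(raw_body)
    if event.get("event") in REFRESH_EVENTS and "path" in event:
        return [event["path"]]
    return []

changed = paths_to_refresh(b'{"event": "file.updated", "path": "/legal/msa.pdf"}')
```

Keeping this decision in a pure function makes it easy to test without running a web server; your HTTP framework of choice only has to pass the request body in.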
Managing Concurrency with File Locks
If you have multiple agents working in the same workspace (perhaps one writing code and another reviewing it), use File Locks to prevent conflicts. An agent can "lock" a file while editing it, ensuring that no other agent (or human) overwrites their work during the process.
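The lock, edit, release pattern is worth wrapping so that a lock is always released, even when the edit fails. The sketch below models the pattern with an in-memory registry; `LockRegistry` is a stand-in for illustration, and in practice `acquire` and `release` would call whatever lock tools your agent has available.

```python
from contextlib import contextmanager

class LockRegistry:
    """In-memory stand-in for a file-lock service (illustration only)."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def acquire(self, path: str, agent: str) -> bool:
        """Lock `path` for `agent`; fail if another agent already holds it."""
        if self._owners.get(path, agent) != agent:
            return False
        self._owners[path] = agent
        return True

    def release(self, path: str, agent: str) -> None:
        """Release the lock if `agent` is the current holder."""
        if self._owners.get(path) == agent:
            del self._owners[path]

@contextmanager
def locked(registry: LockRegistry, path: str, agent: str):
    """Hold a lock for the duration of a `with` block, releasing it on exit."""
    if not registry.acquire(path, agent):
        raise RuntimeError(f"{path} is locked by another agent")
    try:
        yield
    finally:
        registry.release(path, agent)
```

A writer agent would then do its edit inside `with locked(registry, "/src/app.py", "writer"):`, and a reviewer attempting the same lock during that block gets an immediate error instead of silently overwriting the file.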
Security and Permissions
Fastio allows for permissions at the workspace and folder level. You can invite other agents or humans as "Viewers" who can search the knowledge base but not modify it. This is key for keeping a "single source of truth" that is widely accessible but securely managed.
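A small role table makes it easy for an agent to check, before acting, whether its role is likely to permit an operation. The mapping below is a sketch based on the roles named in this guide (Viewer for search-only access, Editor or Admin for uploads and edits); the action names are illustrative, and any additional Admin-only capabilities are not modeled.

```python
# Assumed mapping of roles to actions; action names are illustrative.
ROLE_ACTIONS = {
    "Viewer": {"search", "read"},
    "Editor": {"search", "read", "upload", "edit"},
    "Admin": {"search", "read", "upload", "edit"},
}

def can(role: str, action: str) -> bool:
    """Return True if the role is expected to permit the action."""
    return action in ROLE_ACTIONS.get(role, set())
```

For example, `can("Viewer", "upload")` is `False`, so an agent invited as a Viewer should report that it needs a higher role rather than attempt the upload and fail.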
Monitoring Usage
Keep an eye on your credit usage. The free tier's 5,000 monthly credits go a long way, but heavy RAG usage consumes credits for both storage and AI tokens. You can view your current usage in the dashboard to ensure you don't hit limits during critical operations.
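A simple linear projection can warn you early if the current pace would exhaust the monthly allowance. This sketch assumes you read the credits-used figure from the dashboard yourself and that usage continues at the same daily rate; it does not model how individual operations are priced.

```python
import calendar
from datetime import date

MONTHLY_CREDITS = 5000  # free agent tier allowance quoted in this guide

def projected_usage(used: float, today: date) -> float:
    """Extrapolate month-end credit usage from the usage so far this month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return used / today.day * days_in_month

def on_track(used: float, today: date) -> bool:
    """Return True if the projected usage stays within the monthly allowance."""
    return projected_usage(used, today) <= MONTHLY_CREDITS
```

For instance, 2,000 credits used by June 10 projects to 6,000 by month end, which is over the allowance and a cue to reduce query volume or upgrade.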
Troubleshooting Common Issues
If your agent is having trouble finding information, check these common problems.
"I can't find that file."
First, check if Intelligence Mode is actually enabled for the specific workspace. If it was turned on after files were uploaded, it might take a moment to index the backlog. Check the file details in the UI; if you see a "Summarize" button, it has been indexed.
"Permission Denied" Errors
Make sure your agent has the correct role. If the agent needs to upload or edit files, it must have "Editor" or "Admin" permissions on the workspace. "Viewer" access is enough for search-only tasks.
Poor Search Results
If the search results are irrelevant, look at your file formats. Scanned PDFs (images) without OCR layers cannot be read by the text extractor. Ensure all PDFs are text-searchable. Also, try rephrasing the query to be more specific, since semantic search works best with full questions rather than short keywords.
Frequently Asked Questions
Do I need a separate vector database like Pinecone?
No. Fastio's Intelligence Mode handles all embedding, vector storage, and retrieval automatically within the workspace.
What is the file size limit for the knowledge base?
The free agent tier supports individual files up to 1GB in size, which is much larger than most vector databases allow.
How fast is the indexing process?
Indexing usually takes seconds for standard documents. Larger files may take slightly longer, but data is available for RAG almost immediately.
Can I use this with local LLMs like LLaMA?
Yes. Because OpenClaw connects via the standard Model Context Protocol (MCP), you can use any model (local or hosted) to query your Fastio data.
Is the knowledge base secure?
Yes. All data is encrypted at rest and in transit. You can also manage access via granular permission settings for both agents and humans.
What happens if I exceed the 50GB limit?
If you exceed the 50GB limit, you will need to upgrade to a paid plan or delete older files. The system will notify you before you reach capacity.
Can multiple agents share the same knowledge base?
Yes. You can invite multiple agents to the same workspace, allowing them to share a common memory and collaborate on the same files.