
How to Integrate Dify File Storage for Agents

Dify defaults to local file storage, which limits scalability and agent capabilities.

Fastio Editorial Team
Production Dify apps require robust, persistent storage layers.

Why Dify Needs External Storage

By default, self-hosted Dify instances use local storage on the server. While this works for quick prototypes, it creates three critical problems for production applications that require high availability and reliability:

  • Data Persistence: If your container or server crashes without a volume mount, you lose uploaded user files and knowledge base assets. Local storage is ephemeral in most containerized environments, meaning a simple update to your Docker image could wipe out your entire knowledge base.
  • Scalability and Concurrency: Local disk space is finite and difficult to scale horizontally. In a multi-node setup, such as a Kubernetes cluster, local storage on one node is not accessible to another. If your API scales to multiple instances, they will lose track of each other's files, leading to "file not found" errors during workflow execution.
  • Performance Bottlenecks: As users upload thousands of documents for RAG (Retrieval-Augmented Generation), your server's I/O performance will degrade. Local disks are often the first bottleneck in heavy AI workloads, especially when processing large PDF batches.

Dify addresses these storage problems with native support for S3-compatible object storage. A further gap, delivering agent output to people, requires a dedicated file platform like Fastio.

Helpful references: Fastio Workspaces, Fastio Collaboration, and Fastio AI.

Step 1: Configuring S3 for Dify Backend

To move your Dify system storage to the cloud, you need to modify your environment configuration. This handles the system files, including knowledge base uploads, temporary workflow assets, and logs. Using cloud storage ensures that your data remains safe even if the application servers are destroyed and recreated.

1. Prepare Your Bucket

Create a bucket in AWS S3, Cloudflare R2, or MinIO. Ensure you have an Access Key and Secret Key with PutObject, GetObject, and DeleteObject permissions. For production environments, it is highly recommended to use IAM roles or scoped access keys to limit the potential impact of a credential leak.
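As a rough illustration of what a scoped key could look like, the sketch below builds an AWS-style IAM policy document that grants only the three object permissions named above on a single bucket. The bucket name is the placeholder used later in this guide, and the function name is hypothetical. R2 and MinIO have their own ways of scoping credentials, so treat this as an AWS-flavored example only.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy limited to object read/write/delete in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                # Object-level actions apply to keys inside the bucket,
                # hence the trailing /* on the resource ARN.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy so it can be pasted into the IAM console or a CLI call.
    print(json.dumps(build_bucket_policy("your-dify-assets"), indent=2))
```

Generating the policy from the bucket name keeps the resource ARN and the bucket configured in Dify from drifting apart.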

2. Configure CORS and Security

If your users will be uploading files directly from their browsers to the storage backend, you must configure Cross-Origin Resource Sharing (CORS). This tells the S3 bucket to allow requests from your Dify domain. A typical production policy should restrict AllowedOrigins to your specific URL and AllowedMethods to POST and PUT.
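A minimal sketch of such a policy, expressed as the CORSRules structure that S3-compatible services accept, might look like this. The origin URL is a placeholder, and the AllowedHeaders and MaxAgeSeconds values are assumptions rather than values from Dify's documentation. The function name is hypothetical.

```python
import json


def build_cors_config(dify_origin: str) -> dict:
    """Build a CORS configuration restricted to one origin and to POST/PUT."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": [dify_origin],    # only your Dify domain
                "AllowedMethods": ["POST", "PUT"],  # upload methods only
                "AllowedHeaders": ["*"],            # assumption: accept any request header
                "MaxAgeSeconds": 3600,              # assumption: cache preflight for 1 hour
            }
        ]
    }


if __name__ == "__main__":
    # https://dify.example.com stands in for your real Dify URL.
    print(json.dumps(build_cors_config("https://dify.example.com"), indent=2))
```

If you manage the bucket with boto3, a dictionary of this shape is what the S3 client's put_bucket_cors call takes as its CORSConfiguration argument.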

3. Update Environment Variables

Modify your .env file (or docker-compose.yaml) to switch the storage driver.

### General Storage Configuration
STORAGE_TYPE=s3

### AWS S3 / Compatible Service Settings
S3_ENDPOINT=https://s3.us-east-1.amazonaws.com
S3_BUCKET_NAME=your-dify-assets
S3_ACCESS_KEY=AKIA...
S3_SECRET_KEY=wJal...
S3_REGION=us-east-1
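Before restarting the containers, it can help to confirm that nothing was left out of the file. The sketch below parses .env-style text and reports which S3 variables are missing when STORAGE_TYPE is s3. It treats all five S3 variables shown above as required, which is this sketch's assumption and not a statement about Dify's own validation. The function names are hypothetical.

```python
REQUIRED_S3_KEYS = (
    "S3_ENDPOINT",
    "S3_BUCKET_NAME",
    "S3_ACCESS_KEY",
    "S3_SECRET_KEY",
    "S3_REGION",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def missing_s3_settings(settings: dict[str, str]) -> list[str]:
    """Return the S3 keys that are absent or empty when STORAGE_TYPE is s3."""
    if settings.get("STORAGE_TYPE") != "s3":
        return []
    return [key for key in REQUIRED_S3_KEYS if not settings.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        missing = missing_s3_settings(parse_env(handle.read()))
    print("Missing settings:", ", ".join(missing) if missing else "none")
```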

4. Migrate Existing Data

If you have existing files in local storage, use Dify's built-in migration commands inside your Docker container:

docker exec -it dify-api flask upload-local-files-to-cloud-storage

The Missing Piece: Agent-to-Human Delivery

Configuring S3 solves the system's storage needs, but what about the agent's specific operational needs? This is a distinction that many developers overlook when first building Dify applications. System storage is for the "brains" of the AI, while operational storage is for its "hands."

When a Dify agent generates a report, creates an image, or processes a dataset, it often needs to deliver that result to a human user in a professional and accessible manner. S3 buckets are not designed for this type of interaction; they are secure backend vaults meant for machine-to-machine communication. You cannot simply ask a Dify agent to "email the S3 link" to a client: the object is usually private, there is no user-friendly preview interface, and sharing it requires generating a pre-signed URL that expires after a limited time.

Fastio bridges this gap. It gives your Dify agents a user-facing filesystem that looks and feels like a professional product. Agents can write files to a workspace, and Fastio instantly generates a branded, secure download portal for the end user. This allows your AI agents to act as "delivery drivers," moving data from your private backend storage to a public or semi-private interface that humans can actually interact with safely.

Fastio features

Give Your AI Agents Persistent Storage

Stop trapping agent outputs in S3. Use Fastio to let your agents build portals, share files, and collaborate with humans.

Step 2: Integrating Fastio with Dify

You can give your Dify agents access to Fastio using Dify's Custom Tool feature. This allows the agent to create workspaces, upload files, and retrieve share links via the Fastio API. It effectively gives your LLM "hands" to manage files on behalf of the user.

1. Get Your API Key

Sign up for a free Fastio developer account. You get 50GB of storage and 5,000 credits per month, which is plenty for building, testing, and even deploying several production-grade multi-agent workflows.

2. Create the Custom Tool Specification

In Dify, navigate to Tools > Custom > Create Custom Tool. Import the Fastio OpenAPI schema (or a simplified subset for the operations you need). This schema defines the parameters the agent needs to provide, such as file_path or workspace_id.

Example Schema Definition (YAML):

openapi: 3.0.0
info:
  title: Fastio Agent Storage
  version: 1.0.0
paths:
  /v1/files/upload:
    post:
      operationId: uploadFile
      summary: Upload a file and get a shareable link
      ...
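Before importing a schema, it can be useful to check which operations it actually exposes, since the operationId is the name an agent-facing tool is typically keyed on. The sketch below walks an OpenAPI document that has already been loaded into a Python dictionary and lists each operation. The SPEC dictionary mirrors only the fields visible in the example above, and the function name is hypothetical. It does not reproduce Dify's importer.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_operations(spec: dict) -> list[tuple[str, str, str]]:
    """Return (operationId, METHOD, path) for every operation in the spec."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if method.lower() not in HTTP_METHODS:
                continue
            operations.append((operation.get("operationId", ""), method.upper(), path))
    return operations


# Mirrors the fields shown in the example schema; the elided parts are left out.
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Fastio Agent Storage", "version": "1.0.0"},
    "paths": {
        "/v1/files/upload": {
            "post": {
                "operationId": "uploadFile",
                "summary": "Upload a file and get a shareable link",
            }
        }
    },
}

if __name__ == "__main__":
    for operation_id, method, path in list_operations(SPEC):
        print(f"{operation_id}: {method} {path}")
```

An operation with an empty operationId in this listing is worth fixing before import, because the agent needs a stable name to call.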

3. Add to Workflow

Once the tool is added, you can drag the Fastio tool into any workflow.

  • Input: The file variable from a previous node (e.g., an image generated by DALL-E 3 or a large PDF report created by a Python tool).
  • Output: A clean, public (or password-protected) URL that remains active until your agent or a user decides to delete it.

Your LLM can then output: "I've generated the sales report you requested. You can download the full version here: [Fastio Link]. This link will remain active for 30 days."

Advanced Workflow: The 'Drop-Off' Pattern

A powerful pattern for Dify agents is the Secure Drop-Off. Instead of emailing large attachments that might be blocked by mail servers or flagged as spam, the agent creates a temporary shared workspace for the user.

1. User Request: "Analyze these CSVs and give me the clean versions."
2. Agent Processing: The Dify workflow runs a Python script to clean the data and performs complex statistical analysis.
3. Storage: The agent saves the cleaned CSVs and a detailed PDF summary report to a new Fastio folder named project-{date}.
4. Delivery: The agent adds the user's email to the folder's access list, ensuring that only the authorized recipient can view the sensitive data.
5. Notification: The user receives a branded email invite to the portal.
6. Audit Trail: You can track exactly when the user accesses the files, providing a full audit trail of the agent's delivery success and user engagement.

This approach keeps data secure, avoids technical email size limits, and provides a professional experience that is all automated by your Dify agent without manual human intervention.
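The storage and delivery steps of this pattern can be sketched as a small orchestration function. Everything here is hypothetical: the FileStore interface, its method names, and the in-memory stand-in are illustrations and not the Fastio API. In a real workflow these calls would map to whatever operations your custom tool schema exposes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Protocol


class FileStore(Protocol):
    """Hypothetical interface for the storage operations the pattern needs."""

    def create_folder(self, name: str) -> str: ...
    def upload(self, folder_id: str, filename: str, data: bytes) -> None: ...
    def grant_access(self, folder_id: str, email: str) -> None: ...


@dataclass
class InMemoryStore:
    """Stand-in store so the flow can be exercised without any network calls."""

    folders: dict[str, dict[str, bytes]] = field(default_factory=dict)
    access: dict[str, list[str]] = field(default_factory=dict)

    def create_folder(self, name: str) -> str:
        self.folders[name] = {}
        self.access[name] = []
        return name  # the folder name doubles as its ID in this stand-in

    def upload(self, folder_id: str, filename: str, data: bytes) -> None:
        self.folders[folder_id][filename] = data

    def grant_access(self, folder_id: str, email: str) -> None:
        self.access[folder_id].append(email)


def secure_drop_off(
    store: FileStore, files: dict[str, bytes], recipient: str, today: date
) -> str:
    """Create a dated folder, upload the results, and share it with one recipient."""
    folder_id = store.create_folder(f"project-{today.isoformat()}")
    for filename, data in files.items():
        store.upload(folder_id, filename, data)
    # Access is granted only after every file is in place.
    store.grant_access(folder_id, recipient)
    return folder_id


if __name__ == "__main__":
    store = InMemoryStore()
    outputs = {"clean.csv": b"a,b\n1,2\n", "summary.pdf": b"%PDF"}
    folder = secure_drop_off(store, outputs, "user@example.com", date.today())
    print("Shared folder:", folder)
```

Granting access as the last step means the recipient is not invited to a folder that is still being filled.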


Frequently Asked Questions

Can Dify use Google Drive for backend storage?

Not directly. Dify supports Google Cloud Storage (GCS) as a native backend, but Google Drive accounts are not supported for system storage. For agent-facing file access, you can use Fastio's URL Import feature to pull files from Google Drive into your agent's workspace.

How does Fastio differ from S3 for Dify?

S3 is 'object storage' meant for applications to store raw data. Fastio is 'agent storage' meant for collaboration. S3 holds Dify's system files; Fastio handles the delivery. Fastio provides branded portals, share links, and file previews that S3 does not offer out of the box.

Is there a free tier for Dify storage?

Dify's open-source version is free to self-host, but you pay for your own underlying infrastructure (S3 costs). Fastio offers a generous free tier for agents: 50GB of storage and 5,000 monthly operation credits with no credit card required.
