AI & Agents

How to Migrate S3 Buckets to Fast.io API

Migrating from S3 to Fast.io lets development teams upgrade from raw object storage to an intelligent, agent-ready workspace. This guide walks through the complete migration process, including the S3 audit, workspace mapping, parallel sync scripts, and checksum verification to ensure data integrity. You'll also learn strategies for avoiding downtime during the switch and how to use Fast.io's MCP tools for automated workflows.

Fast.io Editorial Team 7 min read
Migrating S3 buckets to Fast.io API

Why Migrate from S3 to Fast.io?

Amazon S3 provides solid object storage, but it lacks the collaborative and intelligent features that modern development teams need. Fast.io adds workspace intelligence on top of storage, giving you built-in RAG, semantic search, and 251 MCP tools that let AI agents work directly in your file system.

The main advantages of moving to Fast.io include automatic file indexing (no setup required), native AI chat with citations, and agent-first architecture where every UI capability has a corresponding API endpoint. You also get branded sharing, data rooms, and real-time collaboration features that S3 doesn't offer.

For teams running AI agents, Fast.io provides a significant advantage: agents can create workspaces, upload outputs, query documents with built-in RAG, and transfer ownership to humans when tasks complete. This workflow isn't possible with raw S3 storage.

Helpful references: Fast.io Workspaces, Fast.io Collaboration, and Fast.io AI.

Fast.io workspace dashboard

How to Audit Your S3 Bucket

Before migrating, you need a complete inventory of what exists in your S3 bucket. This audit forms the foundation for your migration script and helps you plan workspace mapping.

Create a script that lists all objects and captures key metadata: the full object key (which maps to folder structure), file size, last-modified date, and any custom metadata tags. Run it during off-peak hours to avoid performance impact on your production S3 bucket.

The audit should also identify large files that will need chunked uploads, detect naming conflicts (S3 keys are case-sensitive, so keys that differ only in casing are distinct objects and can collide in case-insensitive destinations), and catalog file types to plan appropriate workspace organization.

S3 bucket audit process

S3 Audit Script Example

import boto3
import json

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

objects = []
for page in paginator.paginate(Bucket='your-bucket'):
    if 'Contents' in page:
        for obj in page['Contents']:
            objects.append({
                'key': obj['Key'],
                'size': obj['Size'],
                'last_modified': obj['LastModified'].isoformat(),
                'etag': obj['ETag']
            })

with open('s3-audit.json', 'w') as f:
    json.dump(objects, f, indent=2)

This script exports your entire bucket to a JSON file that you can analyze to plan workspace mapping.
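A short pass over that JSON can surface the items flagged above: files that will need chunked uploads and keys that collide when casing is ignored. This is a minimal sketch; the 100 MB cutoff is an assumed threshold, not a documented Fast.io limit.

```python
from collections import defaultdict

# Assumed cutoff for "needs a chunked upload" -- adjust to your actual limit
LARGE_FILE_THRESHOLD = 100 * 1024 * 1024

def analyze_audit(objects):
    """Flag large files and case-insensitive key collisions in the audit list."""
    large_files = [o['key'] for o in objects if o['size'] > LARGE_FILE_THRESHOLD]

    # Group keys by their lowercased form to find casing conflicts
    by_lower = defaultdict(list)
    for o in objects:
        by_lower[o['key'].lower()].append(o['key'])
    collisions = [keys for keys in by_lower.values() if len(keys) > 1]

    return large_files, collisions
```

Load `s3-audit.json` and pass the object list to `analyze_audit` to get both reports at once.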

How to Map S3 Structure to Fast.io Workspaces

Fast.io uses workspaces as the primary organizational unit, which map naturally from S3 bucket prefixes. Analyze your S3 audit data and group objects by their key prefixes. Each prefix becomes a candidate workspace in Fast.io.

Consider these mapping strategies: project-based grouping (all files for a client or product in one workspace), team-based organization (files organized by department or function), or hybrid approaches that combine both. Fast.io workspaces support unlimited members and nested folder structures, giving you flexibility.
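As a sketch of project-based grouping, you might bucket the audit output by top-level key prefix, with each group becoming a candidate workspace. The `depth` parameter and the `(root)` catch-all label are illustrative choices, not part of any Fast.io convention.

```python
from collections import defaultdict

def group_by_prefix(objects, depth=1):
    """Group audit objects by their first `depth` key segments.

    Each resulting prefix is a candidate Fast.io workspace."""
    groups = defaultdict(list)
    for obj in objects:
        parts = obj['key'].split('/')
        # Keys with no deeper path land in a catch-all root group
        prefix = '/'.join(parts[:depth]) if len(parts) > depth else '(root)'
        groups[prefix].append(obj['key'])
    return dict(groups)
```

Running this over the audit data gives you the workspace list to create in the next step.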

Create your workspace hierarchy before uploading files. Use the Fast.io API to create workspaces programmatically:


Fast.io workspace organization

Workspace Creation API Call

import requests

# Create workspace
response = requests.post(
    'https://api.fast.io/current/workspaces/',
    headers={'Authorization': 'Bearer YOUR_JWT'},
    data={
        'name': 'Project Alpha Files',
        'folder_name': 'project-alpha',
        'description': 'Migration from S3 bucket project-alpha'
    }
)
workspace = response.json()
workspace_id = workspace['id']

Execute the Migration Sync

With workspaces created, you can now sync files from S3 to Fast.io. Small files can go up as simple single-request uploads. For larger files, Fast.io supports chunked uploads, which are essential for media files and datasets.

Parallel uploads dramatically speed up migration. Run multiple upload workers concurrently, each handling different files. The Fast.io API handles high-throughput transfers well, and you can tune parallelism based on your network capacity and credit usage.

Here's a sync script that handles the migration:

Run a small pilot first, then expand in phases while tracking data integrity and performance baselines. This keeps migration risk low and gives teams time to adjust safely.

File sync process

Migration Script with Parallel Uploads

import asyncio
import json

import aiohttp
import boto3

S3_BUCKET = 'your-bucket'
FASTIO_WORKSPACE_ID = 'workspace-id'
MAX_WORKERS = 10  # Parallel upload count

s3 = boto3.client('s3')
semaphore = asyncio.Semaphore(MAX_WORKERS)  # cap concurrent uploads

async def upload_file(session, s3_key):
    async with semaphore:
        # boto3 is blocking, so run the S3 download in a worker thread
        s3_obj = await asyncio.to_thread(s3.get_object, Bucket=S3_BUCKET, Key=s3_key)
        content = s3_obj['Body'].read()

        # Upload to Fast.io as multipart form data
        url = f'https://api.fast.io/current/storage/{FASTIO_WORKSPACE_ID}/upload'
        headers = {'Authorization': 'Bearer YOUR_JWT'}

        form = aiohttp.FormData()
        form.add_field('file', content, filename=s3_key.split('/')[-1])
        async with session.post(url, headers=headers, data=form) as resp:
            return await resp.json()

async def migrate_all():
    # Reuse the object list exported during the audit step
    with open('s3-audit.json') as f:
        objects = json.load(f)

    async with aiohttp.ClientSession() as session:
        tasks = [upload_file(session, obj['key']) for obj in objects]
        return await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(migrate_all())

Verify Data Integrity

After migration completes, verify that all files arrived correctly. The most reliable method is checksum comparison. Store MD5 or SHA256 checksums during your S3 audit, then calculate checksums on the Fast.io side after upload.

Use the Fast.io API to list uploaded files and compare against your audit data. Flag any missing files, size mismatches, or checksum failures. Most sync issues stem from network timeouts or rate limiting, and a second pass usually resolves them.

Here's how to verify integrity:


Data integrity verification

Checksum Verification Script

import hashlib
import requests
import json

def calculate_md5(data):
    return hashlib.md5(data).hexdigest()

# Verify a single file by re-downloading it from Fast.io
def verify_file(workspace_id, s3_key, expected_md5):
    url = f'https://api.fast.io/current/storage/{workspace_id}/download/{s3_key}'
    headers = {'Authorization': 'Bearer YOUR_JWT'}

    response = requests.get(url, headers=headers)
    actual_md5 = calculate_md5(response.content)

    # Note: an S3 ETag equals the object's MD5 only for single-part uploads;
    # for multipart uploads, store your own SHA-256 during the audit instead.
    return actual_md5 == expected_md5.strip('"')

# Run verification on all migrated files
def verify_migration(audit_file, workspace_id):
    with open(audit_file) as f:
        objects = json.load(f)
    
    failed = []
    for obj in objects:
        if not verify_file(workspace_id, obj['key'], obj['etag']):
            failed.append(obj['key'])
    
    return failed
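Since most failures come from timeouts or rate limiting, a second pass with backoff usually clears them. Here's a hedged sketch of that retry loop; `upload_fn` stands in for whatever single-file upload helper your sync script already uses.

```python
import time

def retry_failed(failed_keys, upload_fn, max_attempts=3):
    """Re-run uploads for failed keys with exponential backoff.

    Returns the keys that still fail after all attempts."""
    still_failing = list(failed_keys)
    for attempt in range(max_attempts):
        if not still_failing:
            break
        next_round = []
        for key in still_failing:
            try:
                upload_fn(key)
            except Exception:
                next_round.append(key)  # keep for the next pass
        still_failing = next_round
        if still_failing:
            time.sleep(2 ** attempt)  # back off before retrying
    return still_failing
```

Feed it the list returned by `verify_migration`, then re-verify; anything still failing after three passes deserves manual inspection.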

Strategies for Zero-Downtime Migration

For production systems, you need a strategy that keeps services running during the migration. The dual-write approach works well: configure your application to write to both S3 and Fast.io simultaneously during the transition period.

Start by enabling dual-write for new files only. Existing files migrate in the background using your sync script. Once the historical migration completes and verification passes, switch read operations to Fast.io while maintaining S3 as backup. After a stability period, decommission S3 writes.
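The dual-write idea can be sketched as a thin wrapper around two storage clients. The `put`/`get` interface here is hypothetical; adapt it to whatever storage layer your application already has.

```python
class DualWriter:
    """Transition-period wrapper: write to both backends, read from S3."""

    def __init__(self, s3_store, fastio_store):
        self.s3 = s3_store
        self.fastio = fastio_store
        self.fastio_errors = []  # log Fast.io failures rather than failing the write

    def put(self, key, data):
        self.s3.put(key, data)  # S3 stays the source of truth until cutover
        try:
            self.fastio.put(key, data)
        except Exception as exc:
            self.fastio_errors.append((key, str(exc)))

    def get(self, key):
        return self.s3.get(key)
```

At cutover time, `get` flips to the Fast.io store and `fastio_errors` tells you which keys need a catch-up sync first.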

Another approach is the read-through cache: serve reads from Fast.io but proxy to S3 for any cache misses. This lets you migrate gradually while users transparently access files from both systems. Fast.io's CDN-backed delivery makes this particularly effective for file-heavy applications.
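The read-through pattern is equally small. Again the store interface is assumed, with `get` returning `None` on a miss so unmigrated files fall back to S3 and are backfilled on first access.

```python
class ReadThroughCache:
    """Serve reads from Fast.io; on a miss, fetch from S3 and backfill."""

    def __init__(self, fastio_store, s3_store):
        self.fastio = fastio_store
        self.s3 = s3_store

    def get(self, key):
        data = self.fastio.get(key)
        if data is None:  # cache miss: file not migrated yet
            data = self.s3.get(key)
            self.fastio.put(key, data)  # backfill so the next read is a hit
        return data
```

Over time the backfill drains S3 naturally, and hot files migrate themselves first.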

Zero-downtime migration strategy

Post-Migration: Using Fast.io Features

Once files are in Fast.io, enable Intelligence Mode to activate built-in RAG. When enabled, workspace files are automatically indexed and become searchable by meaning, not just filename. You can ask questions like "show me the contract from Q3" and Fast.io will find relevant files.

Your migrated workspaces now have access to all 251 MCP tools. Agents can create shares, set up webhooks for file change notifications, and build automated workflows. The free agent tier includes 50GB of storage and 5,000 credits monthly, enough for significant automation work.

Consider setting up webhooks to trigger downstream actions when files arrive. For example, when an agent uploads model outputs, a webhook can notify your evaluation pipeline or trigger processing workflows. This turns passive storage into an active part of your agent infrastructure.
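A webhook consumer for this can stay very small. The payload fields below (`event`, `file.path`) are assumptions about the shape of a Fast.io file-change notification, not documented names; check the webhook payload your workspace actually delivers.

```python
def handle_file_event(payload):
    """Route a file-change webhook payload to a downstream action."""
    event = payload.get('event')
    path = payload.get('file', {}).get('path', '')
    # Example rule: model outputs landing as JSON kick off the eval pipeline
    if event == 'file.uploaded' and path.endswith('.json'):
        return ('trigger-eval-pipeline', path)
    return ('ignore', path)
```

Wire this into whatever HTTP handler receives the webhook POST, and dispatch on the returned action.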

Fast.io intelligence features

Frequently Asked Questions

How do I migrate data from S3 to Fast.io?

The migration process involves four steps: audit your S3 bucket to inventory all objects, map the bucket structure to Fast.io workspaces, sync files using parallel uploads (script provided above), and verify checksums to ensure data integrity. For zero-downtime migration, use dual-write or read-through caching during the transition.

Can I import an S3 bucket into Fast.io?

Fast.io doesn't directly import S3 buckets, but you can migrate data programmatically using the API. Use the audit-sync-verify workflow described in this guide. For large buckets, use chunked uploads and parallel workers to speed up the process. The free agent tier provides 50GB of storage and 5,000 monthly credits for migration work.

What's the fastest way to migrate large files?

Use Fast.io's chunked upload API for large files. Run multiple upload workers in parallel to maximize throughput. The API handles high-throughput transfers well. For large datasets, consider using the URL Import feature to pull files directly from S3 without downloading to local storage first.

Will my folder structure be preserved?

Yes, folder structure maps directly from S3 key prefixes to Fast.io workspace folders. Map your S3 prefixes to workspaces during the planning phase, and the sync script preserves nested folder hierarchies within each workspace.

How do I avoid downtime during migration?

Use dual-write during migration: write new files to both S3 and Fast.io, migrate historical files in the background, then switch reads to Fast.io. Alternatively, use Fast.io as a read-through cache that proxies to S3 for cache misses. Both approaches keep services running throughout the transition.

Related Resources

Fast.io features

Run S3-to-Fast.io migration workflows on Fast.io

Get 50GB free storage, 5,000 monthly credits, and 251 MCP tools for your agents. No credit card required.