AI & Agents

How to Validate MCP Tool Inputs: Best Practices for Reliable Agents

Input validation prevents AI agents from crashing or performing unsafe actions. This guide covers how to implement strict schema validation for MCP tools using Zod and JSON Schema, ensuring your agents operate securely and reliably.

Fast.io Editorial Team · 9 min read
Validation acts as a firewall between agent logic and external tool execution.

Why Validation is Critical for MCP Agents

Input validation is the first line of defense for Model Context Protocol (MCP) servers. It ensures that the data provided by an AI agent matches the strict format your code expects before any execution occurs. In traditional software engineering, inputs often come from trusted or deterministic sources like frontend forms or internal services. In the agentic world, the caller is a Large Language Model (LLM), a probabilistic engine that guesses the next token.

This requires a different approach to validation. An LLM might hallucinate a parameter that doesn't exist, provide a string description of a number instead of the integer itself (e.g., "five" instead of 5), or attempt to traverse directories with paths like ../../etc/passwd. Without a strong validation layer, these inputs can lead to runtime crashes, security breaches, or silent data corruption.

The Cost of Failure

According to recent reliability analyses of autonomous agents, hallucinated arguments are a leading cause of task failure. When an MCP tool crashes due to an unhandled exception (like trying to read a file at undefined), the entire conversation context is often lost or the agent gets stuck in a loop. By moving validation to the system boundary, you convert these fatal crashes into structured, informative error messages. This allows the agent to "read" the error, understand its mistake, and self-correct in the next turn, turning a potential failure into a success.

Helpful references: Fast.io Workspaces, Fast.io Collaboration, and Fast.io AI.

JSON Schema: The Contract Between Agent and Tool

The Model Context Protocol uses JSON Schema as the universal language for defining tool interfaces. When your server receives a ListTools request, it must respond with a JSON object that describes every available tool, including the exact shape of its arguments.

This schema has two purposes:

  1. Instruction: It tells the LLM what tools are available and how to use them. The more descriptive your schema, the better the LLM understands the tool's purpose.
  2. Enforcement: It acts as a contract. Your server code must verify that the incoming CallTool request strictly adheres to this contract.

Key Schema Components for Agents

  • Type Constraints: Explicitly defining string, number, boolean, or array. This prevents the "stringly typed" problem where numbers arrive as text.
  • Descriptions: Detailed descriptions are the most important part of "soft validation." A description like "The ID of the user" is weak; "The UUIDv4 string representing the user, found in the profile URL" is strong. It guides the model to the correct value before validation even runs.
  • Enums: Restricting string inputs to a known set of values (e.g., ["json", "csv", "xml"]) is safer than accepting any string and validating it later. Enums are token-efficient and easy to enforce.
Diagram showing the flow of JSON Schema definitions in MCP
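
Putting these components together, a tool definition returned from a ListTools response might look like the following sketch. The tool name and field names here are illustrative, not part of the MCP specification.

const exportReportTool = {
  name: "export_report",
  description: "Export an analytics report in a specific format.",
  inputSchema: {
    type: "object",
    properties: {
      report_id: {
        type: "string",
        description: "The UUIDv4 string identifying the report, found in the report URL."
      },
      format: {
        type: "string",
        enum: ["json", "csv", "xml"],
        description: "The output format for the exported report."
      },
      max_rows: {
        type: "number",
        description: "Maximum number of rows to include. Omit to export all rows."
      }
    },
    required: ["report_id", "format"]
  }
};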

Implementing Strict Validation with Zod (Node.js)

For Node.js and TypeScript-based MCP servers, Zod is a standard library for schema definition. It offers a clean API that is easy to read and handles complex validation logic. Libraries like zod-to-json-schema allow you to define your schema once in code and automatically generate the JSON Schema required by the MCP protocol.

Here is a pattern for creating a validated tool using Zod.

1. Define the Schema

Start by creating a Zod object that defines the shape of your tool's input. Note the use of .describe(), which goes directly to the LLM.

import { z } from "zod";

const ReadFileSchema = z.object({
  path: z.string()
    .min(1, "Path cannot be empty")
    .startsWith("/", "Path must be absolute")
    .describe("The absolute path to the file to read. Must start with /."),
  encoding: z.enum(["utf-8", "ascii", "base64"])
    .default("utf-8")
    .describe("The encoding to use for the file content."),
  line_range: z.array(z.number())
    .length(2, "Must specify start and end lines")
    .optional()
    .describe("Optional: An array of [start, end] line numbers to read (multiple-based).")
});

// Infer the TypeScript type from the schema
type ReadFileArgs = z.infer<typeof ReadFileSchema>;
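
The same schema can then be advertised over the protocol. A minimal sketch using the zod-to-json-schema package, assuming the tool is registered as read_file and the SDK's ListToolsRequestSchema handler mirrors the CallTool handler shown below:

import { zodToJsonSchema } from "zod-to-json-schema";

// Advertise the tool in the ListTools response, converting the Zod schema
// into the JSON Schema shape that MCP clients expect.
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "read_file",
      description: "Read the contents of a file from the workspace.",
      inputSchema: zodToJsonSchema(ReadFileSchema),
    },
  ],
}));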

2. Validate in the Handler

When the tool is called, use Zod's safeParse method. Unlike parse, which throws an error, safeParse returns a result object that lets you handle errors gracefully without crashing the server.

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "read_file") {
    // 1. Validate Input
    const result = ReadFileSchema.safeParse(request.params.arguments);

    if (!result.success) {
      // 2. Format Error for the Agent
      const errorMessages = result.error.issues
        .map(i => `${i.path.join(".")}: ${i.message}`)
        .join("; ");

      return {
        content: [{
          type: "text",
          text: `Validation Error: ${errorMessages}. Please correct your arguments and try again.`
        }],
        isError: true, // Signals to the LLM that the tool call failed
      };
    }

    // 3. Execute with Safe Data
    // result.data is now typed as ReadFileArgs and guaranteed correct
    const { path, encoding, line_range } = result.data;

    try {
      const content = await safeReadFile(path, encoding, line_range);
      return {
        content: [{ type: "text", text: content }],
        isError: false,
      };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text", text: `System Error: ${message}` }],
        isError: true,
      };
    }
  }

  // Unknown tool: surface a protocol-level error instead of returning undefined
  throw new Error(`Unknown tool: ${request.params.name}`);
});

Advanced Validation Patterns

Basic type checking is often not enough for production agents. You may need to validate relationships between fields or transform inputs into usable formats. Zod handles these patterns well.

Discriminated Unions for Multi-Mode Tools

Sometimes a tool behaves differently based on a "mode" parameter. Use discriminated unions to enforce different required fields for each mode. This is clearer for the LLM than a single object with many optional fields.

const SearchToolSchema = z.discriminatedUnion("type", [
  z.object({
    type: z.literal("keyword"),
    query: z.string(),
    fuzzy: z.boolean().optional()
  }),
  z.object({
    type: z.literal("vector"),
    embedding: z.array(z.number()).length(1536),
    threshold: z.number().min(0).max(1)
  })
]);

Refinements for Logic Checks

Use .refine() to enforce logic that cannot be expressed in standard JSON Schema types, such as ensuring a start date is before an end date.

const DateRangeSchema = z.object({
  start: z.string().date(),
  end: z.string().date()
}).refine(data => new Date(data.start) < new Date(data.end), {
  message: "End date must be after start date",
  path: ["end"] // Attaches error to the 'end' field
});

Transformations

Use .transform() to sanitize inputs automatically. For example, trimming whitespace or normalizing URL formats before your handler logic ever sees the data.

const UrlSchema = z.string()
  .url()
  .transform(url => new URL(url).hostname) // Extract hostname immediately
  .describe("The URL to analyze (will be normalized to hostname)");

Security: Sanitization Beyond Schemas

Validation ensures the data is the right shape, but sanitization ensures it is safe to execute. This is particularly important when your MCP tool interacts with the file system, shell, or external databases.

Path Traversal Prevention

A common vulnerability is when an agent attempts to access sensitive files using relative paths like ../../. Even if you validate that the input is a string, you must also validate the resolved path.

import path from "path";

function validateSafePath(userInput: string, allowedRoot: string) {
  // 1. Resolve both paths to absolute, normalized form
  const root = path.resolve(allowedRoot);
  const resolved = path.resolve(root, userInput);

  // 2. Check that the resolved path is the root itself or inside it.
  // Comparing against root + path.sep prevents a sibling like "/data-evil"
  // from passing a check for the root "/data".
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    throw new Error(`Access denied: ${userInput} is outside the allowed directory.`);
  }

  return resolved;
}
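
Inside the read_file handler above, this check would run after schema validation and before any file I/O, so the two layers stay independent. A brief usage sketch, assuming a hypothetical /workspace directory as the allowed root:

import fs from "node:fs/promises";

// Hypothetical allowed root; point this at wherever your server keeps files.
const safePath = validateSafePath(requestedPath, "/workspace");
const content = await fs.readFile(safePath, { encoding: "utf-8" });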

Command Injection

If your tool executes system commands, avoid exec (which spawns a shell) whenever possible. Instead, use execFile or spawn which pass arguments directly to the process, bypassing the shell's argument parsing. This prevents attacks where an agent injects commands via separators like ; or &&.
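
As a minimal sketch, assuming a hypothetical tool that runs git log on an agent-supplied path, execFile keeps the arguments out of the shell entirely:

import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Arguments are passed as an array, so a value like "; rm -rf /" is treated
// as a literal string argument rather than interpreted by a shell.
async function gitLogForPath(userPath: string): Promise<string> {
  const { stdout } = await execFileAsync("git", ["log", "--oneline", "--", userPath]);
  return stdout;
}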

Database Query Injection

Agents can inject malicious query fragments, just like SQL injection in web apps. Always use parameterized queries or ORMs (like Prisma or TypeORM) rather than constructing query strings from agent inputs.
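
For example, with the node-postgres (pg) client, an agent-supplied value travels as a bound parameter rather than as part of the SQL text (the table and column names here are illustrative):

import { Pool } from "pg";

const pool = new Pool();

// The $1 placeholder is filled in by the driver; the input is never
// concatenated into the SQL string, so "'; DROP TABLE users; --" stays inert.
async function findUserById(userId: string) {
  const result = await pool.query("SELECT * FROM users WHERE id = $1", [userId]);
  return result.rows[0];
}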

The Self-Correction Loop

Strong validation does more than block bad inputs; it teaches. When an agent receives a helpful error message, it learns from the mistake.

Consider a scenario where an agent calls a tool get_weather(city: "Paris").

Scenario A (Weak Validation):

  • Server Code: if (!args.coords) throw new Error("Internal Error");
  • Agent receives: Error: Internal Error
  • Agent reaction: Confusion. It tries get_weather(location: "Paris") or gives up.

Scenario B (Strong Validation):

  • Server Code: Zod schema requires latitude and longitude.
  • Agent receives: Validation Error: Missing required properties: 'latitude', 'longitude'.
  • Agent reaction: "Ah, I need coordinates, not a city name." The agent then calls a geocode_city tool first, gets the coordinates, and successfully calls get_weather.

This loop (Try → Fail (with info) → Correct → Succeed) is central to autonomous systems. By using descriptive schemas and specific error messages, you build agents that are resilient, adaptable, and capable of handling complex workflows.

Flowchart showing an agent correcting its input after receiving a validation error

Frequently Asked Questions

How do I validate JSON arguments in MCP?

The most effective way is to use a schema validation library like Zod (for Node.js) or Pydantic (for Python). These libraries allow you to define strict types and constraints, parse the incoming JSON arguments against the schema, and automatically generate descriptive error messages if the input is invalid.

What is the best library for MCP schema validation?

For TypeScript and Node.js servers, Zod is widely considered the best choice due to its developer-friendly API, strong type inference, and ability to export to JSON Schema via the `zod-to-json-schema` package. For Python-based MCP servers, Pydantic is the industry standard.

How do I prevent command injection in MCP tools?

To prevent command injection, avoid using shell execution functions that accept a raw command string (like `exec` in Node.js) with concatenated input. Instead, use functions like `execFile` or `spawn` that accept arguments as a separate array. Also, validate all string inputs against a strict allowlist of characters whenever possible.

What happens if an agent sends invalid data?

If validation is implemented correctly, the server should catch the invalid data before execution and return a structured error message to the agent. The agent can then read this error message, understand what went wrong (e.g., 'missing required field'), and attempt the tool call again with corrected parameters.

Should I validate inputs on the client or server side?

In the MCP architecture, the 'client' is the LLM host (like Claude Desktop) and the 'server' is your MCP tool. You must always validate on the server side. You cannot trust the client to enforce the schema perfectly, as LLMs are probabilistic and may occasionally deviate from the instructions.

Related Resources

Fast.io features

Secure Storage for Your Agent's Outputs

Fast.io provides an intelligent workspace where your agents can safely store, index, and retrieve files using 251+ pre-built MCP tools. Built for MCP tool-input validation workflows.