Deploying MCP Servers for AI Tool Integration: Protocol Design and Security Considerations

The Model Context Protocol (MCP) is an open standard that defines how AI models communicate with external tools and data sources. Rather than each AI application implementing its own bespoke integration for every tool it needs, MCP provides a uniform protocol: a server exposes a set of tools with typed schemas, and any MCP-compatible client (Claude, Cursor, custom agents) can discover and invoke those tools. This standardization has significant implications for how organizations build AI-powered workflows and the security posture required to protect them.

Protocol Architecture

MCP defines three core primitives:

  • Tools: Functions the AI can invoke with structured arguments. Analogous to API endpoints.
  • Resources: Static or dynamic data sources the AI can read. Files, database query results, API responses.
  • Prompts: Pre-defined prompt templates that clients can request from the server to standardize common interactions.

The protocol itself is JSON-RPC 2.0 over one of two transports:

  • stdio: The server runs as a subprocess of the client, communicating over stdin/stdout. Simple, zero network configuration, suitable for local development and trusted local tools.
  • HTTP with SSE: The server runs as an HTTP service. The client sends requests as HTTP POST, and the server may push notifications via Server-Sent Events. Suitable for shared, remote, or multi-tenant deployments.
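On the wire, either transport carries the same JSON-RPC 2.0 framing. A sketch of a tool-discovery exchange (the example tool name and description are illustrative; over stdio, each message is one newline-delimited JSON object):

```python
import json

# A client's tools/list request, framed as JSON-RPC 2.0
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A typical server response: the payload sits under "result",
# and the id echoes the request so the client can correlate replies
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {"name": "search_documents", "description": "Search indexed documents by keyword."}
        ]
    },
}

# Over stdio, the message is serialized as a single line
wire = json.dumps(request) + "\n"
```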

Tool Manifest Design

The tool manifest is the contract between your MCP server and the AI client. Well-designed tool schemas enable the AI to use tools correctly; poorly designed schemas lead to incorrect arguments, failed calls, and confusing error handling.

Naming Conventions

Tool names should be action-oriented verb phrases describing what the tool does, not what it is. Use search_documents, not document_search or DocumentSearch. Use underscores consistently across your toolset; avoid camelCase and hyphens.
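The snake_case convention is easy to enforce mechanically when the manifest is built (a small sketch; the function name is illustrative):

```python
import re

# Lowercase words separated by single underscores: search_documents, execute_query
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def check_tool_name(name: str) -> bool:
    """Accept lowercase snake_case tool names; reject camelCase and hyphens."""
    return bool(SNAKE_CASE.match(name))
```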

Descriptions Matter

The AI uses the tool description — not the name — to decide when to invoke a tool. Write descriptions that explain the use case, not just what the tool does mechanically:

{
  "name": "execute_query",
  "description": "Run a read-only SQL SELECT query against the analytics database. Use this when the user asks about data, metrics, or historical records. Do NOT use for INSERT, UPDATE, DELETE, or DDL operations — those are not supported and will return an error.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "A valid SQL SELECT statement. Must not contain semicolons after the main query. Maximum 10,000 characters."
      },
      "timeout_seconds": {
        "type": "integer",
        "description": "Query timeout in seconds. Default 30. Maximum 120.",
        "default": 30,
        "minimum": 1,
        "maximum": 120
      }
    },
    "required": ["query"]
  }
}
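Declaring the schema is only half the contract: the server should validate AI-supplied arguments against it before executing anything. A minimal hand-rolled check for a schema like the one above (a sketch; in practice a full JSON Schema validator library would cover more keywords):

```python
TYPE_MAP = {"string": str, "integer": int}

def validate_args(schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the call is valid."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            errors.append(f"unexpected argument: {name}")
            continue
        expected = TYPE_MAP.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
            continue  # skip range checks on mistyped values
        if spec.get("type") == "integer":
            if "minimum" in spec and value < spec["minimum"]:
                errors.append(f"{name}: below minimum {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors
```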

Consistent Error Schemas

Define a consistent error structure across all tools so the AI can reason about failures uniformly:

from typing import TypedDict

class MCPToolError(TypedDict):
    error_code: str          # Machine-readable: PERMISSION_DENIED, TIMEOUT, etc.
    message: str             # Human-readable description
    suggestion: str | None   # Optional: what to try instead
    retry_after: int | None  # Optional: seconds to wait before retry

Transport Selection

stdio Transport

stdio is the simplest deployment model and appropriate for tools that:

  • Run locally on the same machine as the AI client
  • Access local resources (filesystem, local databases, local APIs)
  • Are trusted with the same permissions as the AI client process
  • Do not need to be shared across multiple clients simultaneously

# stdio server skeleton (Python)
import sys, json

def handle_request(request: dict) -> dict:
    method = request.get("method")
    # JSON-RPC 2.0 replies carry their payload under "result" or "error"
    if method == "tools/list":
        return {"result": {"tools": TOOL_MANIFEST}}
    if method == "tools/call":
        return {"result": execute_tool(request["params"])}
    return {"error": {"code": -32601, "message": "Method not found"}}

for line in sys.stdin:
    request = json.loads(line.strip())
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    response.update(handle_request(request))
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()

HTTP Transport

HTTP transport is required when:

  • Multiple AI clients share the same MCP server
  • The server hosts privileged capabilities that should not be co-located with client processes
  • You need centralized audit logging of all tool invocations
  • The server needs to push real-time updates (streaming tool results via SSE)

Authentication Patterns

stdio servers inherit the OS user’s permissions — authentication is the OS itself. HTTP servers require explicit authentication:

API Key Authentication

The simplest approach: clients include a bearer token in every request header. Suitable for server-to-server integrations where the client is a controlled system:

Authorization: Bearer mcp_prod_k8s_a3f9d2e1b7c4...

# Server-side validation (FastAPI-style Request and HTTPException assumed)
from fastapi import HTTPException, Request

def validate_request(request: Request) -> str:
    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise HTTPException(401, "Missing bearer token")
    token = auth.removeprefix("Bearer ")
    client_id = token_store.validate(token)
    if not client_id:
        raise HTTPException(401, "Invalid token")
    return client_id

mTLS for Service-to-Service

For high-assurance environments, mutual TLS provides strong authentication with no shared secrets:

# Client certificate requirement in nginx upstream of MCP server
ssl_client_certificate /etc/ssl/certs/mcp-client-ca.crt;
ssl_verify_client on;
ssl_verify_depth 2;

Sandboxing Tool Execution

MCP tools that execute arbitrary code, run shell commands, or interact with external systems need sandboxing. An AI model that has been manipulated via prompt injection can attempt to call tools with malicious arguments.

Input Validation

Validate all tool arguments against the declared schema before execution. Never pass raw AI-provided strings to shell commands:

import re, subprocess

def execute_shell_tool(command: str, args: list[str]) -> str:
    # WRONG: subprocess.run(f"{command} {' '.join(args)}", shell=True)
    
    # RIGHT: allowlist commands, validate args, no shell=True
    ALLOWED_COMMANDS = {"ls", "cat", "grep", "find"}
    if command not in ALLOWED_COMMANDS:
        raise ToolError("COMMAND_NOT_ALLOWED", f"{command} is not permitted")
    
    # Validate each arg: no shell metacharacters
    for arg in args:
        if re.search(r'[;&|`$\(\)\{\}\[\]<>]', arg):
            raise ToolError("INVALID_ARGUMENT", "Shell metacharacters are not permitted in arguments")
    
    return subprocess.run(
        [command, *args],
        capture_output=True, text=True, timeout=30,
        cwd="/sandbox"  # Restrict working directory
    ).stdout

Container Isolation

For tools that execute untrusted code or perform potentially destructive operations, run the tool logic inside a short-lived container:

def execute_in_sandbox(code: str) -> str:
    # With remove=True and the default detach=False, containers.run()
    # returns the container's stdout as bytes once it exits
    output = docker_client.containers.run(
        image="sandbox:latest",
        # containers.run() has no timeout parameter; enforce the wall-clock
        # limit inside the container with the `timeout` utility
        command=["timeout", "10", "python3", "-c", code],
        remove=True,
        network_disabled=True,
        read_only=True,
        mem_limit="128m",
        cpu_quota=50000,   # 50% of one CPU
        volumes={"/tmp/sandbox_input": {"bind": "/input", "mode": "ro"}}
    )
    return output.decode("utf-8")

Rate Limiting

MCP servers can be called in tight loops by agents. Without rate limiting, a confused or manipulated agent can exhaust external API quotas, overload databases, or generate unexpected costs:

# Per-client rate limiting using Redis token bucket
async def check_rate_limit(client_id: str, tool_name: str) -> None:
    key = f"ratelimit:{client_id}:{tool_name}"
    limit = TOOL_RATE_LIMITS.get(tool_name, DEFAULT_LIMIT)
    
    current = await redis.incr(key)
    if current == 1:
        await redis.expire(key, 60)  # Reset window every 60 seconds
    
    if current > limit:
        retry_after = await redis.ttl(key)
        raise RateLimitError(
            f"Rate limit exceeded for {tool_name}",
            retry_after=retry_after
        )
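For a single-process server without Redis, the same fixed-window scheme can live in memory (a sketch mirroring the Redis version; class and parameter names are illustrative):

```python
import time

class FixedWindowLimiter:
    """Per-(client, tool) fixed 60-second window, kept in process memory."""

    def __init__(self, default_limit: int = 10, window: float = 60.0):
        self.default_limit = default_limit
        self.window = window
        self._buckets: dict[tuple[str, str], tuple[float, int]] = {}

    def allow(self, client_id: str, tool_name: str) -> bool:
        now = time.monotonic()
        key = (client_id, tool_name)
        start, count = self._buckets.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window elapsed: reset the bucket
        count += 1
        self._buckets[key] = (start, count)
        return count <= self.default_limit
```

Unlike the Redis version, this state is lost on restart and is not shared across replicas, which is exactly why the centralized variant is preferable for multi-instance deployments.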

Audit Logging

Every tool invocation in a production MCP server should be logged with enough detail for security review and debugging:

{
  "timestamp": "2026-04-05T08:23:41.123Z",
  "client_id": "claude-production-agent",
  "session_id": "sess_a3f9d2e1",
  "tool_name": "execute_query",
  "arguments": {"query": "SELECT count(*) FROM users WHERE created_at > '2026-01-01'"},
  "duration_ms": 142,
  "result_tokens": 24,
  "status": "success",
  "trace_id": "abc123"
}

Ship these logs to your SIEM. Anomaly detection rules that flag unusual tool invocation patterns — unexpectedly large result sets, queries to tables not normally accessed, high-frequency calls to destructive tools — provide an additional layer of defense against prompt injection attacks that attempt to abuse MCP capabilities.
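As a starting point, a simple frequency rule over these log entries might look like the following (the destructive-tool list and threshold are illustrative; real rules would live in the SIEM):

```python
from collections import Counter

DESTRUCTIVE_TOOLS = {"delete_records", "execute_shell"}  # illustrative
MAX_DESTRUCTIVE_PER_WINDOW = 5

def flag_anomalies(entries: list[dict]) -> list[str]:
    """Flag clients making high-frequency calls to destructive tools
    within a single log window."""
    counts = Counter(
        (entry["client_id"], entry["tool_name"])
        for entry in entries
        if entry["tool_name"] in DESTRUCTIVE_TOOLS
    )
    return [
        f"{client}: {n} calls to {tool}"
        for (client, tool), n in counts.items()
        if n > MAX_DESTRUCTIVE_PER_WINDOW
    ]
```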

Conclusion

MCP is a powerful protocol for connecting AI models to organizational capabilities, but it introduces a new trust boundary that requires careful engineering. Treat MCP servers as privileged services: authenticate clients, validate all inputs, sandbox dangerous operations, rate-limit tool calls, and log everything. The AI model is not always in control of the arguments it receives — prompt injection attacks can manipulate tool calls in ways that bypass application-level controls. Defense must be at the MCP server layer, not assumed from the client side.
