
Code Execution with MCP: Building More Efficient Agents

Published: Nov 04, 2025
Authors: Adam Jones and Conor Kelly. Thanks to Jeremy Fox, Jerome Swannack, Stuart Ritchie, Molly Vorwerck, Matt Samuels, and Maggie Vo for feedback.


Introduction

The Model Context Protocol (MCP) is an open standard for connecting AI agents to external systems. Since its November 2024 launch, the community has built thousands of MCP servers, SDKs are available for all major programming languages, and the industry has adopted MCP as the standard for agent-to-tool connections.

As developers connect agents to hundreds or thousands of tools, loading all tool definitions upfront and passing intermediate results through the context window slows agents and increases costs.

Excessive Token Consumption from Tools

Two common patterns increase agent cost and latency as MCP usage scales:

1. Tool Definitions Overload the Context Window

Most MCP clients load all tool definitions directly into context. Tool descriptions consume context window space, increasing response time and costs. With thousands of tools, agents must process hundreds of thousands of tokens before reading a request.

2. Intermediate Tool Results Consume Additional Tokens

When models directly call MCP tools, every intermediate result passes through the model. For example, transferring a meeting transcript from Google Drive to Salesforce requires the full transcript to flow through context twice. For a 2-hour sales meeting, that could mean an additional 50,000 tokens. Larger documents may exceed context window limits entirely, and models are more likely to make copying errors with large data.

The MCP client loads tool definitions into the model's context window and orchestrates a message loop where each tool call and result passes through the model between operations.

Code Execution with MCP Improves Context Efficiency

A solution is to present MCP servers as code APIs rather than direct tool calls. The agent writes code to interact with MCP servers, loading only needed tools and processing data in the execution environment before returning results.

One approach generates a file tree of all available tools from connected MCP servers, where each tool corresponds to a TypeScript file with a typed interface. The agent discovers tools by exploring the filesystem: it lists directories to find available servers, then reads only the specific tool files it needs to understand their interfaces.
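The file-per-tool layout can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the directory names, the `callMCPTool` helper, the `google_drive__get_document` tool name, and the stub transport are all hypothetical and chosen only for this example.

```typescript
// Hypothetical layout generated from the connected MCP servers:
//   servers/google-drive/getDocument.ts
//   servers/salesforce/updateRecord.ts

// Assumed transport signature; a real one would forward to an MCP client.
type Transport = (toolName: string, input: unknown) => Promise<unknown>;

// Stub transport so the sketch runs without a real MCP server.
const stubTransport: Transport = async (toolName, input) => {
  const { documentId } = input as { documentId: string };
  return { content: `contents of ${documentId} via ${toolName}` };
};

// Hypothetical helper shared by every generated tool file.
async function callMCPTool<T>(
  toolName: string,
  input: unknown,
  transport: Transport = stubTransport,
): Promise<T> {
  return (await transport(toolName, input)) as T;
}

// What servers/google-drive/getDocument.ts might contain.
interface GetDocumentInput {
  documentId: string;
}

interface GetDocumentResponse {
  content: string;
}

async function getDocument(input: GetDocumentInput): Promise<GetDocumentResponse> {
  return callMCPTool<GetDocumentResponse>("google_drive__get_document", input);
}
```

Because each tool lives in its own small file, the agent only needs to read `getDocument.ts` when a task involves fetching a document, rather than loading every definition from every server.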

Cloudflare published similar findings, referring to this as "Code Mode." The core insight: "LLMs are adept at writing code and developers should take advantage of this strength."

In the example given, loading tools on demand through code reduces token usage from 150,000 tokens to 2,000 tokens, "a time and cost saving of 98.7%."
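The quoted percentage follows directly from the two token counts. A quick check, where the function name is only illustrative:

```typescript
// Percentage of tokens saved when usage drops from `before` to `after`.
function tokenSavingsPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const savings = tokenSavingsPercent(150000, 2000);

// Rounded to one decimal place, this is 98.7, matching the quoted figure.
const roundedSavings = Math.round(savings * 10) / 10;
```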

Benefits of Code Execution with MCP

Progressive Disclosure

Models navigate filesystems well. Presenting tools as code on a filesystem lets models read definitions on-demand. A search_tools tool can also find relevant definitions, with detail level parameters (name only, name and description, or full schemas).
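A `search_tools` helper with a detail-level parameter might look like the sketch below. The registry contents, the tool names, and the exact detail-level values are assumptions made for illustration; the source only says that the levels cover name only, name and description, or full schemas.

```typescript
type DetailLevel = "name" | "name_and_description" | "full";

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

// Hypothetical in-memory registry standing in for the connected MCP servers.
const registry: ToolDefinition[] = [
  {
    name: "salesforce__update_record",
    description: "Update a record in Salesforce",
    inputSchema: { type: "object", required: ["objectType", "recordId", "data"] },
  },
  {
    name: "google_drive__get_document",
    description: "Retrieve a document from Google Drive",
    inputSchema: { type: "object", required: ["documentId"] },
  },
  {
    name: "slack__post_message",
    description: "Post a message to a Slack channel",
    inputSchema: { type: "object", required: ["channel", "text"] },
  },
];

// Case-insensitive substring search over names and descriptions.
// The detail level controls how much of each definition is returned.
function searchTools(
  query: string,
  detail: DetailLevel = "name",
): Array<Partial<ToolDefinition>> {
  const q = query.toLowerCase();
  const matches = registry.filter(
    (tool) =>
      tool.name.toLowerCase().includes(q) ||
      tool.description.toLowerCase().includes(q),
  );
  return matches.map((tool) => {
    if (detail === "name") return { name: tool.name };
    if (detail === "name_and_description") {
      return { name: tool.name, description: tool.description };
    }
    return tool;
  });
}
```

Defaulting to the cheapest detail level means the agent pays for a full schema only after it has narrowed the search to the tool it intends to call.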

Context-Efficient Tool Results

With large datasets, agents can filter and transform results in code before returning them. For example, fetching a 10,000-row spreadsheet and filtering in the execution environment means the agent sees five rows instead of 10,000. Similar patterns work for aggregations, joins across data sources, or field extraction.
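The spreadsheet example can be sketched like this. The `getSheet` function is a local stub standing in for a hypothetical MCP spreadsheet tool, and the row shape is invented for the illustration.

```typescript
interface Row {
  orderId: number;
  status: "pending" | "shipped";
  amount: number;
}

// Stub standing in for an MCP spreadsheet tool; it generates 10,000 rows
// locally so the sketch runs without a real server.
function getSheet(): Row[] {
  const rows: Row[] = [];
  for (let i = 0; i < 10000; i++) {
    rows.push({
      orderId: i,
      status: i % 2000 === 0 ? "pending" : "shipped",
      amount: 100 + (i % 50),
    });
  }
  return rows;
}

// All 10,000 rows stay inside the execution environment.
const allRows = getSheet();
const pendingOrders = allRows.filter((row) => row.status === "pending");

// Only this small summary would be logged or returned to the model.
const summary = {
  totalRows: allRows.length,
  pendingCount: pendingOrders.length,
  sample: pendingOrders.slice(0, 5),
};
```

The model sees the counts and a handful of rows, while the full dataset never enters its context window.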

More Powerful and Context-Efficient Control Flow

Loops, conditionals, and error handling use familiar code patterns rather than chains of individual tool calls. This also reduces "time to first token" latency: the code execution environment evaluates each conditional itself, so the agent does not wait on a model round trip for every branch.
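A polling loop is one case where this matters. In the sketch below, `getChannelHistory` is a stub for a hypothetical chat-history tool, arranged so the target message appears on the third call; every name here is an assumption for illustration.

```typescript
interface Message {
  text: string;
}

// Stub standing in for a hypothetical chat-history MCP tool.
// The target message only appears from the third call onward.
let historyCalls = 0;
async function getChannelHistory(): Promise<Message[]> {
  historyCalls += 1;
  const messages: Message[] = [{ text: "still building" }];
  if (historyCalls >= 3) {
    messages.push({ text: "deployment complete" });
  }
  return messages;
}

// Polls until a message containing `needle` appears or attempts run out.
// Returns the attempt number that succeeded, or null on failure.
async function waitForMessage(
  needle: string,
  maxAttempts: number,
): Promise<number | null> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const messages = await getChannelHistory();
    if (messages.some((message) => message.text.includes(needle))) {
      return attempt;
    }
    // A real loop would pause here between polls.
  }
  return null;
}
```

The whole loop runs in the execution environment, so the model is consulted once to write the code rather than once per poll.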

Privacy-Preserving Operations

Intermediate results stay in the execution environment by default. The agent only sees what is explicitly logged or returned. For sensitive workloads, the MCP client can tokenize sensitive data automatically (PII like emails, phone numbers, and names), so real data flows between systems but never passes through the model. Deterministic security rules can define where data flows.
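The tokenization idea can be sketched as follows. This is an assumption-laden illustration rather than how any particular MCP client does it: the `PiiTokenizer` class, the placeholder format, and the simplified email pattern are all invented for the example, and only emails are handled.

```typescript
// Maps placeholder tokens to real values. The mapping lives only in the
// execution environment and is never shown to the model.
class PiiTokenizer {
  private readonly valueByToken = new Map<string, string>();
  private readonly tokenByValue = new Map<string, string>();
  private counter = 0;

  // Returns a stable placeholder for a value, reusing it on repeats.
  tokenize(kind: string, value: string): string {
    const existing = this.tokenByValue.get(value);
    if (existing !== undefined) return existing;
    this.counter += 1;
    const token = `[${kind}_${this.counter}]`;
    this.valueByToken.set(token, value);
    this.tokenByValue.set(value, token);
    return token;
  }

  // Restores real values just before data is sent to another system.
  detokenize(text: string): string {
    let restored = text;
    this.valueByToken.forEach((value, token) => {
      restored = restored.split(token).join(value);
    });
    return restored;
  }
}

// Deliberately simplified email pattern, for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replaces each email address with its placeholder token.
function redactEmails(text: string, tokenizer: PiiTokenizer): string {
  return text.replace(EMAIL_PATTERN, (match) => tokenizer.tokenize("EMAIL", match));
}
```

The model would only ever see the output of `redactEmails`, while `detokenize` runs inside the execution environment when the data is written to its destination.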

State Persistence and Skills

Code execution with filesystem access lets agents maintain state across operations: writing intermediate results to files allows an agent to resume interrupted work and track progress. Agents can also persist their own code as reusable functions, saving a working implementation for future use. Adding a SKILL.md file alongside saved functions creates structured skills that models can reference, building a toolbox of higher-level capabilities over time.
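Resumable progress tracking can be sketched as below. To keep the example runnable anywhere, a `Map` stands in for the sandbox filesystem; a real agent would read and write actual files. The checkpoint shape, the path, and the function names are assumptions for illustration.

```typescript
// Stand-in for the sandbox filesystem: path -> file contents.
const files = new Map<string, string>();

interface Checkpoint {
  nextIndex: number;
  results: string[];
}

// Loads saved progress, or starts fresh if no checkpoint exists.
function loadCheckpoint(path: string): Checkpoint {
  const raw = files.get(path);
  if (raw === undefined) return { nextIndex: 0, results: [] };
  return JSON.parse(raw) as Checkpoint;
}

// Processes at most `budget` items, saving progress after each one so an
// interrupted run can pick up where it stopped.
function processLeads(leads: string[], path: string, budget: number): Checkpoint {
  const state = loadCheckpoint(path);
  const batch = leads.slice(state.nextIndex, state.nextIndex + budget);
  for (const lead of batch) {
    state.results.push(lead.toUpperCase()); // placeholder for real work
    state.nextIndex += 1;
    files.set(path, JSON.stringify(state));
  }
  return state;
}
```

Calling `processLeads` a second time with the same path continues from the saved index instead of starting over, which is the resumption behavior the paragraph above describes.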

Caveat

"Running agent-generated code requires a secure execution environment with appropriate sandboxing, resource limits, and monitoring." These infrastructure requirements add operational overhead. Benefits should be weighed against implementation costs.

Summary

MCP provides a foundational protocol for agents to connect to tools and systems. When too many servers are connected, tool definitions and results consume excessive tokens. While the problems feel novel, they have known solutions from software engineering. Code execution applies established patterns to agents, using familiar programming constructs for more efficient MCP interaction.

Anthropic encourages implementers to share findings with the MCP community at modelcontextprotocol.io.
