Code Mode is a technique that lets an AI model write short JavaScript snippets against a typed SDK instead of listing every API operation as a separate tool, dramatically cutting the number of tokens needed in the model context.
Core Tools: search() and execute()
The server exposes only two functions. search() lets the model explore the OpenAPI specification, while execute() runs the code that calls the real API. This design keeps the prompt size constant even as the API grows.
- Both tools accept a single code field containing an async arrow function.
- search() receives a fully resolved OpenAPI spec object named spec.
- execute() provides a cloudflare.request() client for authenticated calls.
- The functions run inside a lightweight V8 sandbox with no file system access.
- Outbound network requests must be declared through explicit fetch handlers.
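As a minimal sketch of the two-tool shape described above, the snippet below shows what model-authored search() and execute() bodies might look like. The spec and cloudflare objects are tiny local stand-ins so the example runs on its own; the paths, the request options, and the echoed response are assumptions for illustration, not the server's real data or API.

```typescript
// Hypothetical, heavily trimmed stand-in for the resolved OpenAPI object.
type Spec = { paths: Record<string, Record<string, { summary: string }>> };

const spec: Spec = {
  paths: {
    "/zones": { get: { summary: "List zones" } },
    "/zones/{zone_id}/dns_records": {
      get: { summary: "List DNS records" },
      post: { summary: "Create DNS record" },
    },
  },
};

// Hypothetical stand-in for the authenticated client. It echoes the request
// instead of calling a real API; the option names are assumptions.
type RequestOptions = { method: string; path: string; body?: unknown };
const cloudflare = {
  request: async (opts: RequestOptions) => ({ success: true, echoed: opts }),
};

// What a model might submit in the code field of search():
// an async arrow function that filters the spec for relevant operations.
const searchCode = async (): Promise<string[]> => {
  const hits: string[] = [];
  for (const [path, operations] of Object.entries(spec.paths)) {
    if (!path.includes("dns_records")) continue;
    for (const method of Object.keys(operations)) {
      hits.push(`${method.toUpperCase()} ${path}`);
    }
  }
  return hits;
};

// What a model might submit in the code field of execute().
const executeCode = async () =>
  cloudflare.request({ method: "GET", path: "/zones" });

searchCode().then((hits) => console.log(hits));
executeCode().then((result) => console.log(result.success));
```

Because both snippets are ordinary async functions, the tool surface stays the same no matter how many paths the spec contains; only the spec object itself grows.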
Typed SDK and Dynamic Worker Loader
The SDK supplies TypeScript types that mirror the Cloudflare OpenAPI definitions, so the model can autocomplete endpoint names and payload shapes. The Dynamic Worker Loader creates an isolated environment for each code snippet.
- Types are generated from the spec, guaranteeing correct parameter names.
- Code runs as an async function, returning plain JSON data.
- Worker isolates enforce time limits and memory caps.
- Errors are captured and returned as structured messages.
- Developers can extend the SDK by adding custom helper functions.
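To illustrate how spec-derived types and structured errors could fit together, here is a minimal sketch. The ApiPaths map, typedRequest, runSnippet, and fakeTransport are hypothetical names invented for this example. A real generated SDK would be far larger, and the Promise.race timeout only stands in for the limits a Worker isolate would enforce (it stops waiting but does not terminate the snippet).

```typescript
// Hypothetical slice of the types a generator might emit from the spec.
interface ApiPaths {
  "GET /zones": { params: { name?: string }; result: { id: string; name: string }[] };
  "GET /user": { params: Record<string, never>; result: { id: string; email: string } };
}

// Hypothetical typed client: the operation key selects params and result types.
async function typedRequest<K extends keyof ApiPaths>(
  op: K,
  params: ApiPaths[K]["params"],
  transport: (op: string, params: unknown) => Promise<unknown>
): Promise<ApiPaths[K]["result"]> {
  return (await transport(op, params)) as ApiPaths[K]["result"];
}

// Structured result, so failures reach the model as data rather than exceptions.
type SnippetResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { name: string; message: string } };

// Run an async snippet with a time limit and capture any error it throws.
async function runSnippet<T>(
  snippet: () => Promise<T>,
  timeoutMs: number
): Promise<SnippetResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs} ms`)), timeoutMs);
  });
  try {
    const data = await Promise.race([snippet(), timeout]);
    return { ok: true, data };
  } catch (err) {
    const e = err instanceof Error ? err : new Error(String(err));
    return { ok: false, error: { name: e.name, message: e.message } };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Fake transport so the sketch runs without network access.
const fakeTransport = async (op: string, _params: unknown): Promise<unknown> =>
  op === "GET /zones" ? [{ id: "z1", name: "example.com" }] : { id: "u1", email: "a@example.com" };

runSnippet(() => typedRequest("GET /zones", { name: "example.com" }, fakeTransport), 1000)
  .then((result) => console.log(JSON.stringify(result)));
```

With types like these, a misspelled operation key or parameter name would fail type checking before the snippet ever runs, and a runtime failure would still come back as a plain JSON object.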
Security Model
Running user-generated code poses risks, so the platform enforces strict sandboxing: the sandbox exposes no environment variables and no filesystem, and global fetch is disabled by default.
- All I/O is routed through the provided cloudflare.request() wrapper.
- Fetch handlers must be whitelisted per endpoint.
- Sandboxed code cannot import external modules.
- Execution time is limited to a few seconds.
- Result data is sanitized before returning to the model.
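The per-endpoint whitelisting above can be pictured as a guard that every outbound request must pass before it leaves the sandbox. The sketch below is one way such a check might look; the AllowRule type, the isAllowed function, and the specific rules, host, and path patterns are illustrative assumptions, not the platform's actual configuration.

```typescript
// Hypothetical allowlist entry: an HTTP method plus a path pattern.
type AllowRule = { method: string; pathPattern: RegExp };

// Illustrative rules only; a real deployment would define its own.
const allowlist: AllowRule[] = [
  { method: "GET", pathPattern: /^\/client\/v4\/zones(\/[a-f0-9]{32})?$/ },
  { method: "POST", pathPattern: /^\/client\/v4\/zones\/[a-f0-9]{32}\/dns_records$/ },
];

// Return true only for HTTPS requests to the expected host whose method
// and path match an allowlist rule. Everything else is denied by default.
function isAllowed(method: string, rawUrl: string, allowedHost: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  if (url.protocol !== "https:" || url.hostname !== allowedHost) return false;
  return allowlist.some(
    (rule) => rule.method === method.toUpperCase() && rule.pathPattern.test(url.pathname)
  );
}

console.log(isAllowed("GET", "https://api.cloudflare.com/client/v4/zones", "api.cloudflare.com"));
console.log(isAllowed("DELETE", "https://api.cloudflare.com/client/v4/zones", "api.cloudflare.com"));
```

The deny-by-default shape is the important design choice here: a request that fails to parse, targets another host, or matches no rule is rejected without any special-case handling.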
Performance Benefits
By replacing thousands of tool definitions with two generic functions, token usage drops from millions to roughly one thousand per request. This frees up space for the user’s actual query and improves response speed.
- The token cost of exposing the full Cloudflare API to the model drops by about 99.9%.
- Model can focus on task logic rather than tool selection.
- Reduced prompt size speeds up inference on large models.
- Consistent tool footprint simplifies server scaling.
- Lower token usage reduces cost on usage‑based pricing.
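The 99.9% figure follows from simple arithmetic on the before and after footprints. The sketch below works it through with illustrative inputs: 1,000,000 is only a stand-in for the "millions" of tokens mentioned above, not a measured value, and reductionPercent is a helper name chosen for this example.

```typescript
// Percentage reduction when a prompt footprint shrinks from `before`
// to `after` tokens.
function reductionPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) * 100) / before;
}

// Illustrative inputs only: 1,000,000 tokens of per-endpoint tool
// definitions versus roughly 1,000 tokens for the two-tool footprint.
const pct = reductionPercent(1_000_000, 1_000);
console.log(pct.toFixed(1)); // prints "99.9"
```

Any starting footprint above one million tokens would push the reduction slightly higher than 99.9% for the same one-thousand-token result.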
Getting Started
To try Code Mode, import the open‑source SDK from the Cloudflare Agents repository and point your MCP client at the new server endpoint. Use the search() tool to locate needed endpoints, then call execute() to perform the actions.
- Install the SDK with npm i @cloudflare/agents-sdk.
- Initialize the MCP client with your API token.
- Run a search() snippet to list OpenAPI paths.
- Write an execute() snippet that calls cloudflare.request() for the desired operation.
- For JavaScript performance tips see accelerate JavaScript with Bun and for web-compatibility guidance read Web Interoperability 2024.
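The search-then-execute steps above can be sketched at the wire level. MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call, so the example below builds those payloads directly rather than assuming any particular client library. The buildToolCall helper, the snippet bodies, and the options passed to cloudflare.request() are assumptions made for illustration.

```typescript
// Shape of a JSON-RPC 2.0 tools/call request, narrowed to the two tools
// this server exposes.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "search" | "execute"; arguments: { code: string } };
}

// Hypothetical helper that wraps a snippet string in a tools/call payload.
function buildToolCall(
  id: number,
  name: "search" | "execute",
  code: string
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: { code } } };
}

// Step 1: discover endpoints. The snippet is sent as a string in the code field.
const searchCall = buildToolCall(
  1,
  "search",
  `async () => Object.keys(spec.paths).filter((p) => p.startsWith("/zones"))`
);

// Step 2: act on what was found. The method and path here are assumptions.
const executeCall = buildToolCall(
  2,
  "execute",
  `async () => cloudflare.request({ method: "GET", path: "/zones" })`
);

console.log(JSON.stringify(searchCall, null, 2));
console.log(JSON.stringify(executeCall, null, 2));
```

An MCP client library would normally construct and send these messages for you; seeing the raw payload simply makes it clear that the only variable part of each call is the code string.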