
Code Mode (MCP)

Give AI agents a single code-execution tool instead of individual MCP tools. The LLM writes code that chains binding calls, executed safely in Secure Exec.

Example on GitHub

Full working example: the LLM chains real host binding calls in one sandboxed run and returns a structured result.

Instead of calling MCP tools one at a time, the LLM writes JavaScript that orchestrates every call in a single run, and Secure Exec executes that code safely in a V8 sandbox.

MCP Toolkit provides a premade Code Mode library powered by Secure Exec; enable it by setting experimental_codeMode: true. We recommend trying it first. The rest of this page covers how to implement Code Mode yourself.

Why Code Mode

  • Less token overhead: With 50 tools, replacing the individual tool descriptions with a single code-execution tool cuts tool-description tokens by 81%
  • Fewer round-trips: Chain multiple tool calls, conditionals, and data transformations in a single execution
  • Real control flow: Loops, branching, and Promise.all, not a chain of isolated tool calls
  • One structured result: The generated code returns a single JSON value via globalThis.__return(), decoded on the host as result.value

How it works

  1. Register your bindings on the host with NodeRuntime.create({ bindings }). Each becomes a named command inside the sandbox.
  2. Give the LLM one tool (“execute code”) and feed its generated JavaScript to rt.run().
  3. The generated code invokes your bindings by name. Each call round-trips out of the sandbox, runs the binding’s host handler, and the handler’s return value comes back to the guest.
  4. The guest hands a single structured result back to the host with globalThis.__return(value), which rt.run() decodes as result.value.
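
Step 2 leaves open what the single tool's description should say. One option, sketched below, is to generate it from the same binding descriptions and schemas you register, so the LLM knows which names it can call. buildExecuteCodeTool and the execute_code tool name are hypothetical and not part of Secure Exec; the { name, description, inputSchema } shape follows the usual MCP tool definition.

```typescript
// Hypothetical helper: not a Secure Exec or MCP Toolkit API.
interface BindingSpec {
  description: string;
  inputSchema: object;
}

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
}

// Folds every binding's description and schema into one tool description,
// so the LLM sees a single tool but still learns which bindings exist.
function buildExecuteCodeTool(bindings: Record<string, BindingSpec>): ToolDefinition {
  const bindingDocs = Object.entries(bindings)
    .map(
      ([name, spec]) =>
        `- ${name}: ${spec.description}\n  input: ${JSON.stringify(spec.inputSchema)}`,
    )
    .join("\n");

  return {
    name: "execute_code", // assumed tool name
    description: [
      "Run JavaScript in a sandbox.",
      "Call host bindings with await callBinding(name, input).",
      "Finish with globalThis.__return(value).",
      "Available bindings:",
      bindingDocs,
    ].join("\n"),
    inputSchema: {
      type: "object",
      properties: { code: { type: "string" } },
      required: ["code"],
    },
  };
}

const executeCodeTool = buildExecuteCodeTool({
  "get-weather": {
    description: "Look up the current temperature for a city",
    inputSchema: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
});

console.log(executeCodeTool.description);
```

The value of the tool's code argument is what you would then pass to rt.run().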

Host bindings are the heart of Code Mode. The handlers run on the host, never in the sandbox, so the guest gets controlled, named capabilities (the kind an AI agent calls as tools) without being granted the underlying access. Registering bindings auto-grants the binding permission scope; pass your own permissions.binding policy to gate individual bindings.
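
To make the round-trip and the gating idea concrete, here is a plain in-process model. It is not the Secure Exec API: BindingRegistry, invoke, and the allow list are hypothetical names, and the real permissions.binding policy may look quite different. The point is only that the guest sends a name plus JSON, the host runs the handler, and a host-side policy decides which names are reachable.

```typescript
// In-process model only. BindingRegistry is a hypothetical name, not Secure Exec.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface HostBinding {
  description: string;
  handler: (input: Json) => Json;
}

class BindingRegistry {
  private readonly bindings: Map<string, HostBinding>;
  private readonly allowed: Set<string>;

  // By default every registered binding is allowed; pass a list to narrow it.
  constructor(bindings: Record<string, HostBinding>, allowed: string[] = Object.keys(bindings)) {
    this.bindings = new Map(Object.entries(bindings));
    this.allowed = new Set(allowed);
  }

  // Models one round-trip: a name and JSON text come in from the guest,
  // the handler runs on the host, and only JSON text travels back.
  invoke(name: string, inputJson: string): string {
    const binding = this.bindings.get(name);
    if (!binding) throw new Error(`unknown binding: ${name}`);
    if (!this.allowed.has(name)) throw new Error(`binding not permitted: ${name}`);
    const output = binding.handler(JSON.parse(inputJson) as Json);
    return JSON.stringify(output);
  }
}

const registry = new BindingRegistry(
  {
    "get-weather": {
      description: "Look up the current temperature for a city",
      handler: () => ({ temp_f: 61 }),
    },
    "delete-account": {
      description: "Hypothetical sensitive binding",
      handler: () => ({ deleted: true }),
    },
  },
  ["get-weather"], // only this binding is reachable from the guest
);

const reply = registry.invoke("get-weather", JSON.stringify({ city: "San Francisco" }));
console.log(reply); // prints {"temp_f":61}
```

Calling registry.invoke("delete-account", "{}") throws in this model, because the binding is registered but not on the allow list.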

Register the host bindings

Each binding has a description, a JSON Schema inputSchema, and a handler. The handler receives the parsed input and returns a JSON-serializable result.

import { NodeRuntime } from "secure-exec";

function readStringField(input: unknown, field: string): string {
  if (!input || typeof input !== "object" || Array.isArray(input)) {
    throw new TypeError("binding input must be an object");
  }

  const value = (input as Record<string, unknown>)[field];
  if (typeof value !== "string") {
    throw new TypeError(`${field} must be a string`);
  }
  return value;
}

// Code Mode: instead of giving the LLM many individual tools, you give it one
// "execute code" tool. The LLM writes JavaScript that chains binding calls,
// branches, and transforms data - then runs it in a single sandboxed pass.
//
// The heart of Code Mode is real bindings. You register them on the host with
// create({ bindings }); each becomes a named command inside the sandbox. When the
// guest invokes a binding by name with JSON input, the call round-trips back to the
// host, runs the binding's handler, and the handler's return value is delivered
// back to the guest. The guest never sees the host filesystem, network, or any
// capability beyond the named bindings you grant it.

// Register the bindings. These handlers run on the HOST, not in the sandbox.
// In a real app each handler would hit a database, an API, or a service; here we
// keep them small and deterministic so the example is easy to follow.
const rt = await NodeRuntime.create({
  bindings: {
    "get-weather": {
      description: "Look up the current temperature for a city",
      inputSchema: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
      handler: (input) => {
        const city = readStringField(input, "city");
        const table: Record<string, { temp_f: number }> = {
          "San Francisco": { temp_f: 61 },
          Tokyo: { temp_f: 75 },
        };
        return table[city] ?? { temp_f: null };
      },
    },
    calculate: {
      description: "Evaluate a simple arithmetic expression",
      inputSchema: {
        type: "object",
        properties: { expression: { type: "string" } },
        required: ["expression"],
      },
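      handler: (input) => {
        const expression = readStringField(input, "expression");
        // This handler runs on the host and the expression comes from LLM-written
        // code, so accept only digits, whitespace, and arithmetic symbols before eval.
        if (!/^[\d\s+\-*\/().]+$/.test(expression)) {
          throw new TypeError("expression may contain only numbers and + - * / ( )");
        }
        return { result: Number(eval(expression)) };
      },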
    },
  },
});
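
The handlers above check their input by hand with readStringField, one field at a time. If you have many bindings, a schema-driven check can reuse the inputSchema you already wrote. The sketch below is a hypothetical helper that covers only the small JSON Schema subset used on this page (an object with typed properties and a required list). It makes no assumption about whether Secure Exec validates input against inputSchema before calling the handler.

```typescript
// Hypothetical helper covering a small JSON Schema subset: object type,
// primitive property types, and the "required" list.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: "string" | "number" | "boolean" }>;
  required?: string[];
}

function validateInput(schema: ObjectSchema, input: unknown): Record<string, unknown> {
  if (!input || typeof input !== "object" || Array.isArray(input)) {
    throw new TypeError("binding input must be an object");
  }
  const record = input as Record<string, unknown>;

  // Every required field must be present.
  for (const field of schema.required ?? []) {
    if (!(field in record)) throw new TypeError(`${field} is required`);
  }

  // Every declared field that is present must have the declared primitive type.
  for (const [field, rule] of Object.entries(schema.properties)) {
    if (field in record && typeof record[field] !== rule.type) {
      throw new TypeError(`${field} must be a ${rule.type}`);
    }
  }
  return record;
}

const weatherSchema: ObjectSchema = {
  type: "object",
  properties: { city: { type: "string" } },
  required: ["city"],
};

const validated = validateInput(weatherSchema, { city: "Tokyo" });
console.log(validated["city"]); // prints Tokyo
```

A handler could call validateInput(schema, input) as its first line and then read fields from the returned record.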

The agent’s generated code

The agent then generates code like this (call it llmGeneratedCode). The guest calls each binding with the callBinding(name, input) global, which resolves with the host handler’s return value. It chains three binding calls with real control flow (Promise.all, arithmetic, branching) in one execution, then returns a single structured result:

// Imagine this string was written by the LLM. It chains three host binding calls
// with real control flow (Promise.all, arithmetic, branching) in one execution,
// then hands a single structured result back to the host. callBinding resolves
// with the host handler's return value.
const llmGeneratedCode = `
const [sf, tokyo] = await Promise.all([
  callBinding("get-weather", { city: "San Francisco" }),
  callBinding("get-weather", { city: "Tokyo" }),
]);

const diffF = Math.abs(sf.temp_f - tokyo.temp_f);
const diffC = await callBinding("calculate", { expression: \`\${diffF} * 5 / 9\` });

console.log("chained 3 binding calls in one sandbox execution");

globalThis.__return({
  san_francisco: sf,
  tokyo: tokyo,
  difference: { fahrenheit: diffF, celsius: diffC.result },
  warmer: sf.temp_f > tokyo.temp_f ? "San Francisco" : "Tokyo",
});
`;
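
The guest code above builds the calculate expression as a string, and the host handler decides how to evaluate it. Because that handler runs outside the sandbox, you may prefer not to pass LLM-written strings to eval at all. A minimal sketch of one alternative is a small recursive-descent evaluator for + - * / and parentheses; evaluateArithmetic is a hypothetical name, not something Secure Exec provides.

```typescript
// Hypothetical eval-free evaluator for a calculate-style handler.
// Grammar: expr := term (("+" | "-") term)*
//          term := factor (("*" | "/") factor)*
//          factor := number | "-" factor | "(" expr ")"
function evaluateArithmetic(expression: string): number {
  // Numbers, operators, and parentheses become tokens; any other
  // non-space character becomes a one-character token that parsing rejects.
  const tokens: string[] = expression.match(/\d+(?:\.\d+)?|[-+*\/()]|\S/g) ?? [];
  let pos = 0;

  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  function parseExpr(): number {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  function parseTerm(): number {
    let value = parseFactor();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const rhs = parseFactor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function parseFactor(): number {
    const token = next();
    if (token === undefined) throw new SyntaxError("unexpected end of expression");
    if (token === "-") return -parseFactor();
    if (token === "(") {
      const value = parseExpr();
      if (next() !== ")") throw new SyntaxError("expected ')'");
      return value;
    }
    if (/^\d/.test(token)) return Number(token);
    throw new SyntaxError(`unexpected token: ${token}`);
  }

  const result = parseExpr();
  if (pos < tokens.length) throw new SyntaxError(`unexpected token: ${tokens[pos]}`);
  return result;
}

console.log(evaluateArithmetic("14 * 5 / 9")); // prints 7.777777777777778
```

Anything outside the grammar, such as an identifier or a function call, throws a SyntaxError instead of running on the host.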

Run it and read the result

Run the LLM’s code in one sandboxed pass and read back the structured result:

interface CodeModeResult {
  san_francisco: { temp_f: number };
  tokyo: { temp_f: number };
  difference: { fahrenheit: number; celsius: number };
  warmer: string;
}

try {
  // rt.run() executes the guest code and decodes whatever it passes to
  // globalThis.__return(), while still capturing stdout/stderr/exitCode.
  const result = await rt.run<CodeModeResult>(llmGeneratedCode, {
    timeout: 5000,
  });

  console.log("exitCode:", result.exitCode);
  console.log("stdout:", result.stdout.trim());
  console.log("structured result:", JSON.stringify(result.value, null, 2));
} finally {
  await rt.dispose();
}

Three tool calls, one sandbox execution, zero extra LLM round-trips. Running it prints:

exitCode: 0
stdout: chained 3 binding calls in one sandbox execution
structured result: {
  "san_francisco": {
    "temp_f": 61
  },
  "tokyo": {
    "temp_f": 75
  },
  "difference": {
    "fahrenheit": 14,
    "celsius": 7.777777777777778
  },
  "warmer": "Tokyo"
}
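
The CodeModeResult type describes the shape you expect, but the value itself is produced by LLM-written code, so a runtime check before using it may be worthwhile. For example, get-weather returns temp_f: null for an unknown city, which the interface does not allow. The sketch below is a hypothetical type guard; the decoded constant stands in for result.value so the example runs without a sandbox.

```typescript
interface CodeModeResult {
  san_francisco: { temp_f: number };
  tokyo: { temp_f: number };
  difference: { fahrenheit: number; celsius: number };
  warmer: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function hasTemp(value: unknown): value is { temp_f: number } {
  return isRecord(value) && typeof value["temp_f"] === "number";
}

// Hypothetical guard: checks every field of CodeModeResult at runtime.
function isCodeModeResult(value: unknown): value is CodeModeResult {
  if (!isRecord(value)) return false;
  const difference = value["difference"];
  return (
    hasTemp(value["san_francisco"]) &&
    hasTemp(value["tokyo"]) &&
    isRecord(difference) &&
    typeof difference["fahrenheit"] === "number" &&
    typeof difference["celsius"] === "number" &&
    typeof value["warmer"] === "string"
  );
}

// Stand-in for result.value from the run above.
const decoded: unknown = JSON.parse(
  '{"san_francisco":{"temp_f":61},"tokyo":{"temp_f":75},' +
    '"difference":{"fahrenheit":14,"celsius":7.777777777777778},"warmer":"Tokyo"}',
);

if (isCodeModeResult(decoded)) {
  console.log(`${decoded.warmer} is warmer by ${decoded.difference.fahrenheit}F`);
  // prints Tokyo is warmer by 14F
} else {
  throw new Error("guest returned an unexpected shape");
}
```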

Further reading