What Is Claude Code?
Claude Code is Anthropic's command-line tool for agentic coding: it reads, writes, and runs code in your project from the terminal. It talks to a model through the Anthropic API, and because the endpoint is configurable, developers who want to keep data and execution under their control can point it at any Anthropic-compatible backend.
What Is Docker Model Runner?
Docker Model Runner (DMR) is a Docker‑based service that hosts LLMs locally and exposes an Anthropic‑compatible API. By running models in containers, DMR offers isolation, reproducibility, and predictable cost without relying on external cloud endpoints.
Why Combine Claude Code with Docker Model Runner?
- Data sovereignty: All prompts and responses stay on your hardware.
- Cost predictability: No per‑token fees from third‑party APIs.
- Performance control: Adjust CPU, GPU, and memory allocations per model.
- Security: Containers isolate the model runtime from the host system.
- Flexibility: Swap models or adjust context size without changing Claude Code.
How to Install Claude Code
Claude Code provides platform‑specific installers.
- macOS / Linux:
curl -fsSL | bash
- Windows PowerShell:
irm | iex
How to Set Up Docker Model Runner
1. Install Docker (Docker Desktop or Docker Engine).
2. Pull a model you wish to run, e.g., docker model pull gpt-oss.
3. (Optional) Repackage the model with a larger context size (see next section).
4. Start the Model Runner service:
- For Docker Desktop users, enable TCP access:
docker desktop enable model-runner --tcp
- The service will listen at
by default.
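Before pointing Claude Code at the runner, it can help to confirm that the service is actually up. The following is a minimal sketch, assuming the docker model status subcommand is available in your Docker installation; the function name dmr_preflight is hypothetical and not part of any Docker tooling.

```shell
#!/usr/bin/env bash
# Preflight check for Docker Model Runner (illustrative sketch, not an official tool).

dmr_preflight() {
  # The docker CLI must be on PATH before anything else can work.
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found" >&2
    return 1
  fi
  # Assumption: `docker model status` exits non-zero when the runner is not running.
  if ! docker model status >/dev/null 2>&1; then
    echo "Docker Model Runner is not running" >&2
    return 2
  fi
  echo "Docker Model Runner is available"
}
```

Calling dmr_preflight at the top of a setup script lets the script stop early with a readable message instead of failing later inside Claude Code.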
How to Increase Model Context Size
Some coding models benefit from extended context windows. Docker Model Runner can repackage any model with a custom token limit.
- Pull the base model:
docker model pull gpt-oss
- Repackage with a larger context (e.g., 32k tokens):
docker model package --from ai/gpt-oss --context-size 32000 gpt-oss:32k
- Run Claude Code against the new image by pointing to the same API endpoint.
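The pull and repackage steps above can be wrapped in a small helper so the context size becomes a parameter. This is a sketch under stated assumptions: the function name repackage_model and the DRY_RUN switch are hypothetical, and the ai/ prefix simply mirrors the command shown above.

```shell
#!/usr/bin/env bash
# Pull a base model and repackage it with a custom context size (illustrative sketch).
# Usage: repackage_model <base-model> <context-size> <new-tag>
# Set DRY_RUN=1 to print the docker commands instead of running them.

repackage_model() {
  local base="$1" context="$2" tag="$3"

  # Reject anything that is not a positive integer.
  case "$context" in
    ''|*[!0-9]*) echo "context size must be a positive integer" >&2; return 1 ;;
  esac
  if [ "$context" -le 0 ]; then
    echo "context size must be a positive integer" >&2
    return 1
  fi

  # In dry-run mode every docker command is prefixed with echo.
  local run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi

  $run docker model pull "$base" &&
    $run docker model package --from "ai/$base" --context-size "$context" "$tag"
}
```

For example, DRY_RUN=1 repackage_model gpt-oss 32000 gpt-oss:32k prints the two commands from this section without touching Docker, which is a cheap way to review them first.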
How to Connect Claude Code to Docker Model Runner
Claude Code reads the ANTHROPIC_BASE_URL environment variable to determine the API endpoint.
- Set the variable in your shell (temporary):
export ANTHROPIC_BASE_URL=
- For a permanent setup, add the export line to ~/.bashrc, ~/.zshrc, or the Windows equivalent.
After setting the variable, launch Claude Code as usual; it will route all requests to the local Docker Model Runner.
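Because the variable decides where every request goes, a quick sanity check before launching can be worthwhile. The sketch below tests whether a URL points at the local machine; the function name base_url_is_local is hypothetical, and it only recognizes localhost and 127.0.0.1 (not IPv6 loopback or other local hostnames).

```shell
#!/usr/bin/env bash
# Check that a base URL targets the local machine (illustrative sketch).
# Usage: base_url_is_local [url]   (defaults to $ANTHROPIC_BASE_URL)

base_url_is_local() {
  local url="${1:-${ANTHROPIC_BASE_URL:-}}"

  # Strip an optional http:// or https:// scheme.
  local rest="${url#http://}"
  rest="${rest#https://}"

  # Keep only the host: everything before the first ':' or '/'.
  local host="${rest%%[:/]*}"

  case "$host" in
    localhost|127.0.0.1) return 0 ;;
    *) return 1 ;;
  esac
}
```

A wrapper script could run base_url_is_local || exit 1 before starting Claude Code, so a missing or remote endpoint is caught before any prompt is sent.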
How to Monitor Requests Sent by Claude Code
Docker Model Runner includes a built‑in request logger.
- Run:
docker model requests --model gpt-oss:32k | jq .
- The command streams raw JSON payloads, useful for debugging API compatibility or inspecting token usage.
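When debugging over a longer session, it may be more convenient to keep a copy of the payloads on disk as well as on screen. The following is a minimal sketch; the function name capture_requests and the log file path are hypothetical, and it makes no assumption about the shape of the JSON itself.

```shell
#!/usr/bin/env bash
# Show Model Runner request payloads and append them to a log file (illustrative sketch).
# Usage: capture_requests <model> <logfile>

capture_requests() {
  local model="$1" logfile="$2"
  # tee -a prints each payload to the terminal and appends it to the log file.
  docker model requests --model "$model" | tee -a "$logfile"
}
```

For example, capture_requests gpt-oss:32k requests.log runs until the underlying command exits or you press Ctrl+C; the saved file can then be inspected with jq afterwards.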
Why Persist the Configuration?
Persisting ANTHROPIC_BASE_URL ensures every new terminal session automatically targets your local runner, eliminating the risk of accidentally sending data to a remote service.
- Add the export line to your shell profile.
- Verify with echo $ANTHROPIC_BASE_URL after opening a new terminal.
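Adding the export line by hand is fine, but a script that runs more than once should avoid duplicating it. The sketch below appends the line only if it is not already present; the function name persist_base_url is hypothetical, the profile defaults to ~/.bashrc as an assumption, and the URL is whatever address your Model Runner listens on.

```shell
#!/usr/bin/env bash
# Append the ANTHROPIC_BASE_URL export to a shell profile exactly once (illustrative sketch).
# Usage: persist_base_url <url> [profile-file]

persist_base_url() {
  local url="$1" profile="${2:-$HOME/.bashrc}"
  local line="export ANTHROPIC_BASE_URL=\"$url\""

  # -x matches whole lines only, -F treats the line as a literal string.
  if grep -qxF -- "$line" "$profile" 2>/dev/null; then
    echo "already present in $profile"
  else
    printf '%s\n' "$line" >> "$profile"
    echo "added to $profile"
  fi
}
```

Running the function twice with the same arguments leaves a single export line in the profile, so it is safe to call from a repeatable setup script.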
Best Practices and Additional Resources
- Regularly update Docker Model Runner and model images to receive security patches.
- Use Docker’s resource limits (CPU, memory, GPU) to prevent a runaway model from exhausting host resources.
- Consider enabling TLS on the Model Runner API for added network security, especially if accessed from other machines.
- Explore Docker’s MCP Catalog for containerized MCP servers that can give Claude Code additional tools.
- Read the companion posts: “OpenCode with Docker Model Runner for Private AI Coding” and “Docker Model Runner GA Announcement”.