OpenClaw Ollama configuration: 5 settings the wizard skips

OpenClaw Ollama configuration in five settings: port 11434 without /v1, OLLAMA_API_KEY, 64k context, the ollama/ prefix, and gateway 18789.

15 min read

OpenClaw mascot and Ollama logo flanking a glowing panel of URL KEY CONTEXT MODEL GATEWAY labels

TL;DR: OpenClaw Ollama configuration comes down to five values. Point OpenClaw at port 11434 without the OpenAI-compatibility path, set OLLAMA_API_KEY to any non-empty string, configure context length to at least 64,000 tokens, use the ollama/<model-tag> identifier format, and verify the gateway on port 18789. Get those five right and the rest is defaults.

OpenClaw Ollama configuration looks straightforward at first glance. One-line wizard, ollama launch openclaw, and you are chatting with a local model in two minutes. Then production hits and the wizard's blind spots start to bite.

Five values matter, and the wizard handles maybe three of them well. The shortcut runs on Ollama 0.17 or newer and auto-configures the provider, but it skips the things that quietly break later. Set these by hand:

  1. Base URL http://localhost:11434, never with /v1. The OpenAI-compatible path returns tool calls as plain JSON text instead of structured calls, and your agent silently loses tools.
  2. OLLAMA_API_KEY="ollama-local". Ollama does not check the key, but OpenClaw's provider layer refuses to load if the env var is missing.
  3. Context window at 64,000 tokens or more. OpenClaw rejects any model whose manifest advertises under 8,192, and 64k is the agentic minimum.
  4. Model identifier ollama/<tag>, for example ollama/qwen2.5-coder:14b. Bare names throw "Unknown model".
  5. Gateway on port 18789 with the token from ~/.openclaw/openclaw.json appended to the URL path.

Hardware floor: 16 GB RAM and 4 GB VRAM cover Qwen 2.5 or DeepSeek R1 locally. Below that, route through Ollama Cloud's free tier or drop to a smaller model.

I trusted my config for an hour before realizing the model was reloading every turn. OpenClaw took 155 seconds to answer a prompt that raw Ollama handled in three. Everything looked right except the context window, which the model silently truncated to 3,000 tokens.

Issue #10974 has the receipts. You only hit the bug once you trust the config.

OpenClaw Ollama configuration: the five values that matter

This is the recommended-path setup, end to end. Get these five right and OpenClaw plus Ollama stays stable. Skip one and you will lose an evening on GitHub issues.

If you are also wiring skills and MCP servers into the same install, these are the foundation.

SettingValueWhere you set itBreaks if
Base URLhttp://localhost:11434Provider config in ~/.openclaw/openclaw.jsonYou add /v1. Tool calls become plain JSON text
API keyOLLAMA_API_KEY="ollama-local"Env var in .env or shellUnset. OpenClaw refuses to load the provider
Context length65536 (or higher)OLLAMA_CONTEXT_LENGTH env var, or PARAMETER num_ctx 65536 in a ModelfileManifest under 8192. OpenClaw rejects the model
Model identifierollama/<tag>Provider config model fieldBare name. Returns "Unknown model: llama3.2"
Gateway URLhttp://127.0.0.1:18789/<token>Token in ~/.openclaw/openclaw.jsonWrong token. 404 in the browser
The five OpenClaw + Ollama configuration values, at a glance

1. Base URL: 11434 without /v1

Ollama's service listens on http://localhost:11434 by default. OpenClaw's provider config wants exactly that, no path suffix. The trap is that Ollama also exposes an OpenAI-compatible API at /v1/chat/completions, and if you have been writing against the OpenAI SDK recently, your fingers will type /v1 on autopilot.

Do not. Tool calling on /v1 returns tool calls as plain JSON in the chat stream instead of structured calls OpenClaw can dispatch. Your agent looks like it works, replies come back, but skills and MCP servers sit idle while the model types out {"tool": "shell", "args": {...}} as message text.

Set the URL explicitly:

{
  "provider": "ollama",
  "baseURL": "http://localhost:11434",
  "model": "ollama/qwen2.5-coder:14b"
}

If you have been chasing missing tool dispatch on a config that otherwise looks correct, this is the first thing to check.

OpenClaw Ollama base URL comparison showing port 11434 working and /v1 path breaking tool calls

2. API key: a placeholder Ollama ignores

Ollama does not validate API keys on a local install, but OpenClaw still demands the env var exists. Issue #68344 confirmed it: the provider layer refused to load Ollama on macOS until OLLAMA_API_KEY was set, even though Ollama never checks the value. Pick any non-empty string:

export OLLAMA_API_KEY="ollama-local"

Drop the same line into your .env file if OpenClaw's CLI reads one on startup. The string never reaches Ollama. This is the cheapest config trap in the whole setup because the symptom, "OpenClaw will not load my provider", points nowhere near a key the runtime ignores.

3. OpenClaw context window: 64k or the model gets rejected

Ollama's default context window is 4,096 tokens, and it scales by VRAM: under 24 GiB you stay at 4k, between 24 and 48 GiB you get 32k, at 48 GiB and up you get 256k. None of those defaults are agentic-workflow-ready except the top tier.

OpenClaw documents 64,000 tokens as the minimum for agents, coding, and web search. Issue #68344 confirmed it: OpenClaw reads the model's manifest, not the runtime override, and rejects anything under 8,192. You will see Model context window too small (8192 tokens) even with OLLAMA_CONTEXT_LENGTH=65536 in your shell.

Two ways to fix it. Set the global env var:

export OLLAMA_CONTEXT_LENGTH=65536

Or build a per-model Modelfile, which writes the value into the manifest so OpenClaw sees it on models list:

FROM qwen2.5-coder:14b
PARAMETER num_ctx 65536
ollama create qwen2.5-coder-64k -f Modelfile

4. Model identifier: ollama/ prefix is mandatory

OpenClaw's model names are namespaced by provider. Ollama models live at ollama/<model-name>, not bare <model-name>. Bare names throw Error: Unknown model: llama3.2, and chat hangs on a permanent "thinking" indicator with no error in the UI. In openclaw.json:

"model": "ollama/qwen2.5-coder:14b"

The tag (:14b, :7b, :32b) matches what ollama list shows you locally. If you pulled qwen2.5-coder without a tag, Ollama defaults to latest. I burned twenty minutes on this on my first install, thinking OpenClaw did not see the model at all. The name was wrong by one prefix.

5. Gateway: port 18789 plus the token

OpenClaw's gateway is the web-chat and channel-routing service. It binds to http://127.0.0.1:18789 and protects the dashboard with a token appended to the URL path. The token lives in ~/.openclaw/openclaw.json:

cat ~/.openclaw/openclaw.json | grep token

Compose the dashboard URL by appending the token after a slash (http://127.0.0.1:18789/<your-token>). A curl against it should return the chat UI's HTML.

A 404 means your token does not match the one in the config file, usually because the gateway restarted and rotated the token. Re-read the config, paste the new value.

Hardware requirements for OpenClaw with Ollama

Before any of the values above matter, the host has to clear the floor: a recent Ollama, a recent Node, and a GPU with at least 4 GB of VRAM. For hardware comparisons, the Pi, Mac Mini, or VPS guide walks the tradeoffs; on a low-power Pi specifically, see the Raspberry Pi guide.

Verify the versions:

ollama --version    # must be 0.17.x or later
node --version      # must be v22.12.0 or later (20+ technically works)

Earlier Ollama versions do not ship the ollama launch openclaw flow, so you skip the wizard and configure by hand.

For RAM and VRAM, the floor is 16 GB system RAM plus 4 GB VRAM to run Qwen 2.5 or DeepSeek R1 with any real responsiveness. Below that, route to Ollama Cloud's free tier instead. The model-to-VRAM ladder:

ModelVRAMBest for
Phi-4 Mini~3 GBTiny GPU laptops, quick checks
Qwen 2.5 / DeepSeek R14 GB / ~10 GB testedDay-to-day agentic work on a 16 GB / 4 GB box
GPT-OSS 20B~12 GBMid-tier production agents
Llama 3.3 70B~40 GBSerious production on a workstation GPU
GPT-OSS 120B~48 GBFrontier-quality local work, RTX A6000 or similar
Local model VRAM requirements

Below the 4 GB VRAM floor, take the escape hatch: pick cloud + local at the provider step during onboarding and sign into Ollama Cloud in the browser. Kimi K2.5 runs from Ollama's hosted service for free and your wiring stays identical.

The shortcut path: ollama launch openclaw

On Ollama 0.17 and newer, a single command does most of the OpenClaw Ollama setup. It downloads OpenClaw, sets the Ollama provider in the config, picks a default model, and starts the gateway. For a greenfield setup or a demo, this is the right call: you are chatting with a local model in under two minutes.

ollama launch openclaw

The wizard handles the easy parts correctly: base URL, model registration, gateway port and token. It misses the OLLAMA_API_KEY env var (you will hit "provider failed to load" on the next restart), the context-length override (default is whatever Ollama set, usually 4k), and the /v1 confusion if you later swap providers.

Use the launch flow for greenfield or demos; for production, hand-roll the five values yourself.

The wizard handles the easy 80% and leaves you the parts that quietly break. OpenclawVPS provisions a managed VPS with OpenClaw, Ollama, and the five config values already set the way the production docs assume. Plans start at $19/month.

How do you verify OpenClaw Ollama is working?

Three checks verify the OpenClaw Ollama configuration is actually doing what you set it to do. A working setup is a context window that applies, tool calls that come back structured, and a gateway reachable on the right token. I run these after every Modelfile rebuild because a passing manifest does not always mean a passing runtime.

Check 1: Does the context window actually apply?

Hit Ollama's show endpoint and compare advertised vs. configured:

curl http://localhost:11434/api/show -d '{"model":"qwen2.5-coder:14b"}'

In the response, the parameters block should show num_ctx 65536. If it shows 8192, your Modelfile did not take.

If it shows 65536 but generation behavior suggests truncation, you are hitting ollama issue #10974: the runtime is ignoring the manifest under load. Restart Ollama and confirm with the API call again.

Check 2: Do tool calls come back as structured calls?

Send a prompt that requires a tool. Watch the chat. If a tool call executes and a result comes back, you are good.

If raw JSON appears in the chat, something like {"name": "shell", "arguments": {...}} typed out as message text, you are on /v1 and the path needs to come off.

Check 3: Is the gateway actually reachable?

curl http://127.0.0.1:18789/<token>

The response should be HTML for the chat UI. A 404 means the token does not match. Re-read ~/.openclaw/openclaw.json for the current value. Connection refused means the gateway is not running at all; restart it.

Terminal error reading Model context window too small 8192 tokens with arrow pointing to model manifest

Picking the right OpenClaw local model

The wizard's default may not fit your hardware or your use case, so pick deliberately. The right OpenClaw local model depends on three things: VRAM, response-time tolerance, and whether you are running personal agentic work or shipping to production.

I picked GPT-OSS 20B on my first run and watched it run cold on a 12 GB card before swapping back to Qwen 2.5 Coder 14B. The practical pick was right there in the table I had skipped.

For day-to-day agentic work on a mid-range GPU, Qwen 2.5 Coder 14B is the practitioner consensus. It ran at around 10 GB of VRAM on the box I tested, comfortably inside a 12 GB card.

A practitioner walkthrough on an RTX A6000 settled on GLM 4.7 flash for the same role. Both are competent at the multi-turn tool-using workload OpenClaw was built for.

For mid-tier production where you want frontier-adjacent quality, GPT-OSS 20B at ~12 GB VRAM is the practical pick. At the top, if you have 48 GB VRAM, GPT-OSS 120B is what people are actually shipping on.

On a laptop with no GPU, drop to Phi-4 Mini at ~3 GB VRAM or pick the cloud-plus-local hybrid: provider set to cloud + local, model set to Kimi K2.5 via Ollama Cloud's free tier. Wiring stays the same. The swap to a different model flow covers the runtime side without re-running the wizard.

The honest tradeoff: local models are slower and less nuanced than Claude or GPT on complex multi-step tasks. They struggle on long agentic chains. What they are better at is staying on your machine.

If privacy or cost is the reason you are here, the tradeoff is worth it. If you want frontier-quality reasoning and the privacy is incidental, DeepSeek setup via the API path is a different question, and add models walks the config for Claude, GPT, and the rest.

OpenClaw local model decision tree by GPU and RAM mapping Phi-4 Mini Qwen DeepSeek and GPT-OSS

OpenClaw Ollama configuration mistakes (and the fixes)

Six configuration mistakes account for nearly every failure I have seen on OpenClaw + Ollama. The table below pairs each symptom with its fix; the three most common get prose treatment below. The most common symptom by far is a 155-second response time that looks like a performance problem and is actually a context-window mistake.

SymptomLikely causeFix
Tool calls show as raw JSON in chatUsing /v1 URLDrop the /v1 suffix
"Provider failed to load" on startupMissing OLLAMA_API_KEYSet to any non-empty string
"Model context window too small (8192)"OpenClaw context window manifest under 8192Build Modelfile with num_ctx 65536, recreate model
"Unknown model: llama3.2"Bare model name in configPrefix with ollama/
155-second response timesSilent context truncation reloadFix context window, restart Ollama
Gateway dies mid-sessionKnown crash on long-running gatewayRestart, cron-job restart every 30 min
OpenClaw + Ollama configuration mistakes and fixes

Context-window failure is the most expensive to diagnose because it looks like a performance bug, not a config bug. A model that benchmarks at 3-second responses through ollama run takes 155 seconds through OpenClaw if the context window quietly truncates and reloads on every turn.

Issue #18249 documents this exact failure on qwen2.5-coder:14b. Fix is the Modelfile rebuild with PARAMETER num_ctx 65536 and a verification pass via /api/show.

Bare-model-name failure is the cheapest to fix and the most embarrassing. You will have qwen2.5-coder:14b working at the Ollama level (you can ollama run it, ollama list shows it) but OpenClaw cannot find it.

Three weeks in, you will glance at openclaw.json, see "model": "qwen2.5-coder:14b", and remember that OpenClaw namespaces models by provider. Prefix with ollama/, restart the gateway, done.

Gateway-crash failure is the operational reality of running this stack on a daily-driver machine. The gateway "crashes a lot." A cron job restarting it every 30 minutes is the field-tested patch. For a personal agent it is enough; past that stage, the gateway needs to live somewhere that is not your laptop.

Six failure modes and the one about the gateway dying every 30 minutes adds up fast. OpenclawVPS runs the same OpenClaw and Ollama setup on a VPS that does not sleep, with the gateway restart and config monitoring already handled. Sign-up to running gateway in 47 seconds.

When should you switch from local Ollama to a managed VPS?

Local stops making sense when you are restarting the gateway every half hour, your laptop has under 16 GB of RAM, or you need the agent reachable from your phone.

At that point the question stops being "how do I configure this correctly" and starts being "what do I want to manage."

Local does still make sense in the other direction: privacy is non-negotiable, you have the hardware, and you are comfortable with gateway restarts and the occasional Modelfile rebuild.

Reviewing OpenClaw security best practices is worth doing before you expose any of this to a phone. The gateway's token-on-URL auth pattern is fine on localhost but needs more care once it is reachable.

A managed VPS solves the management half. OpenclawVPS runs the same OpenClaw and Ollama integration on a host that does not sleep, with the gateway restart, config monitoring, and EU-region data residency already in place.

For the underlying mental model of what OpenClaw is and how it works, the parent guide covers the architecture before this article picks up at config.

The setup I run now

After working through the five values, the verification checks, and the six failure modes in 2026, my current setup runs on a managed VPS. Same OpenClaw, same Ollama, same provider config, but on a host that does not sleep.

Provisioning took 47 seconds, and the EU region keeps the data inside the GDPR perimeter.

Get started with OpenclawVPS →


Frequently asked questions

How do I set up OpenClaw with Ollama?
Two paths. The shortcut is ollama launch openclaw on Ollama 0.17 or newer, which auto-configures the provider and picks a default model. The manual path: install Ollama and OpenClaw separately, point OpenClaw at http://localhost:11434 (no /v1), set OLLAMA_API_KEY to any non-empty string, and set context length to 64,000 tokens. Shortcut for demos, manual for production.
What are the hardware requirements for running OpenClaw with Ollama?
Minimum 16 GB RAM and 4 GB VRAM to run a useful agentic model like Qwen 2.5 or DeepSeek R1 locally. Phi-4 Mini works on ~3 GB VRAM if you are below that. GPT-OSS 20B needs ~12 GB; GPT-OSS 120B needs ~48 GB. CPU-only setups run tiny models slowly enough that most people route through Ollama Cloud's free tier instead.
Why should I avoid the /v1 endpoint when configuring Ollama in OpenClaw?
OpenClaw Ollama tool calling depends on the native API path, not the /v1 OpenAI-compat layer. The /v1 path works for basic chat but returns tool calls as plain JSON text rather than structured calls OpenClaw can dispatch to skills or MCP servers. Use http://localhost:11434 with no path suffix.
Why does OpenClaw reject my Ollama model with "context window too small"?
OpenClaw reads the context window from the model's manifest, not the runtime config. If the manifest advertises under 8,192 tokens, OpenClaw refuses to load the model even when you have set OLLAMA_CONTEXT_LENGTH=65536. Fix: build a custom Modelfile with PARAMETER num_ctx 65536 and recreate the model so the manifest reflects the new value. Verify with curl http://localhost:11434/api/show and check the parameters block.

Keep reading