OpenClaw's biggest ongoing cost is AI API usage. Heavy users can spend $100-300/month on tokens alone. Here's how to cut that by up to 80% without significantly impacting quality.
Where Your Money Goes
Every message you send to OpenClaw triggers an API call to your AI provider. The cost depends on:
- Model choice — Opus costs ~10x more than Haiku per token
- Conversation length — longer context = more tokens per message
- Response length — verbose replies cost more
- Frequency — more messages = more API calls
- Skills — some skills make additional API calls
Strategy 1: Choose the Right Model
This is the single biggest cost lever:
| Model | Relative Cost | Best For |
|---|---|---|
| Claude Opus 4.6 | $$$$$ | Complex reasoning, important tasks |
| Claude Sonnet 4.5 | $$$ | Daily driver, great quality/cost balance |
| Claude Haiku 4.5 | $ | Quick responses, simple tasks |
| GPT-4.1 | $$$$ | Alternative to Opus |
| GPT-4.1 Mini | $$ | Budget-friendly alternative |
| Gemini 2.5 Flash | $ | Cheapest for high volume |
Recommendation: Use Sonnet 4.5 as your default. It's roughly 5x cheaper than Opus with 80-90% of the quality for most tasks.
To change your model:
openclaw config set ai.model "anthropic/claude-sonnet-4-5"
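To see how the cost tiers above translate into dollars, here is a back-of-envelope sketch. The per-token prices and usage figures are placeholder assumptions for illustration, not current published rates; check your provider's pricing page before relying on them.

```python
# Illustrative monthly cost by model tier. Prices are placeholder
# assumptions (USD per million input/output tokens), not quoted rates.
PRICE_PER_MTOK = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (1.00, 5.00),
}

def monthly_cost(model, msgs_per_day, in_tok=2000, out_tok=500, days=30):
    """Rough monthly spend for a given model and usage pattern."""
    p_in, p_out = PRICE_PER_MTOK[model]
    per_msg = (in_tok * p_in + out_tok * p_out) / 1_000_000
    return per_msg * msgs_per_day * days

for m in PRICE_PER_MTOK:
    print(f"{m}: ${monthly_cost(m, msgs_per_day=40):.2f}/month")
```

Under these assumed rates, 40 messages a day on Opus lands in the ~$80/month range, while the same usage on Sonnet costs about a fifth of that.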
Strategy 2: Set Token Limits
By default, OpenClaw doesn't limit response length. A single verbose reply can cost 10x a concise one.
Set a maximum output token limit:
openclaw config set ai.maxTokens 1024
For most conversations, 1024 tokens (roughly 750 words) is plenty. You can always ask "elaborate" if you need more detail.
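The savings from a cap are easy to quantify. A sketch, assuming an output price of $15 per million tokens (a placeholder figure) and that uncapped replies average around 4,000 tokens:

```python
# How much an output-token cap saves on the output side of the bill.
# The $15/M output rate and reply sizes are illustrative assumptions.
OUT_PRICE_PER_TOK = 15.00 / 1_000_000

def output_cost(tokens_per_reply, replies):
    """Total output cost for a month's worth of replies."""
    return tokens_per_reply * replies * OUT_PRICE_PER_TOK

verbose = output_cost(4000, replies=1200)  # no limit, rambling replies
capped = output_cost(1024, replies=1200)   # with ai.maxTokens 1024
print(f"verbose: ${verbose:.2f}, capped: ${capped:.2f}")
```

Same number of replies, roughly a quarter of the output spend.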
Strategy 3: Smart Model Routing
The most effective strategy: use cheap models for simple tasks and expensive models only when needed.
Some OpenClaw configurations support model routing rules:
- Simple questions → Haiku (cheapest)
- Standard conversations → Sonnet (balanced)
- Complex reasoning → Opus (best quality)
Check your OpenClaw version for smart routing support, or use a proxy like OpenRouter that offers automatic model selection.
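If your setup doesn't support routing natively, the heuristic is simple enough to sketch yourself. The thresholds, keyword list, and Opus/Haiku model IDs below are illustrative assumptions, not OpenClaw's built-in rules:

```python
# Minimal routing heuristic: cheap model for short questions, the
# expensive one only when the prompt signals heavy reasoning.
# Tier thresholds and keywords are assumptions for illustration.
def pick_model(prompt: str) -> str:
    reasoning_markers = ("prove", "debug", "architect", "step by step")
    if any(m in prompt.lower() for m in reasoning_markers):
        return "anthropic/claude-opus-4-6"    # complex reasoning
    if len(prompt.split()) < 15:
        return "anthropic/claude-haiku-4-5"   # short, simple question
    return "anthropic/claude-sonnet-4-5"      # balanced default

print(pick_model("What's the capital of France?"))
```

Even a crude classifier like this captures most of the savings, because the bulk of everyday traffic is short questions that a cheap model handles fine.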
Strategy 4: Manage Conversation Length
Every message in a conversation sends the entire chat history to the AI. A 50-message conversation resends all 50 messages' worth of tokens with each new request.
Tips:
- Start new conversations for new topics
- Use /clear or /reset to start fresh
- Avoid open-ended back-and-forth when a single detailed prompt would work
- Be specific in your first message to avoid follow-up clarification
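The math behind the tips above: because the full history is resent each time, total input tokens grow roughly quadratically with thread length. A sketch, assuming an average message size of 300 tokens (an illustrative figure):

```python
# Total input tokens sent over the life of a conversation, assuming
# each request resends the full history (~300 tokens per message).
def total_input_tokens(n_messages, tokens_per_message=300):
    """Sum of history sizes sent across an n-message conversation."""
    return sum(i * tokens_per_message for i in range(1, n_messages + 1))

one_long_thread = total_input_tokens(50)
five_short_threads = 5 * total_input_tokens(10)
print(one_long_thread, five_short_threads)
```

Fifty messages in one thread cost over four times as many input tokens as the same fifty messages split across five fresh threads.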
Strategy 5: Reduce Proactive Messages
If OpenClaw is configured to proactively check things (news, prices, emails), each check costs tokens. Reduce frequency:
- Daily briefing: once per day, not every hour
- Price monitoring: check every 6 hours, not every 30 minutes
- Email triage: batch process twice daily instead of real-time
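Counting the scheduled checks makes the impact concrete. Taking just the briefing and price-monitoring changes above:

```python
# API calls per day for a recurring check at a given interval.
def daily_calls(interval_minutes):
    return 24 * 60 // interval_minutes

before = daily_calls(60) + daily_calls(30)  # hourly briefing + 30-min price checks
after = 1 + daily_calls(6 * 60)             # one briefing + 6-hour price checks
print(before, after)  # 72 calls/day -> 5
```

That's a 14x reduction in background API calls before you change anything about how you actually chat.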
Strategy 6: Use Caching
If you're on OpenRouter or a proxy that supports prompt caching, enable it. Repeated context (system prompts, memory) gets cached and costs less on subsequent calls.
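The effect is largest when a big shared prefix (system prompt plus memory) dwarfs each message's new content. A sketch, assuming a 90% discount on cached input tokens at $3 per million; both figures are assumptions, since actual cache pricing varies by provider:

```python
# Input cost with and without prompt caching of a shared prefix.
# The $3/M rate and 90% cache discount are illustrative assumptions.
PRICE = 3.00 / 1_000_000  # USD per input token

def input_cost(prefix_tok, new_tok, calls, cached=False, discount=0.9):
    """Cost of `calls` requests sharing a common prefix (system prompt, memory)."""
    if not cached:
        return (prefix_tok + new_tok) * calls * PRICE
    first = (prefix_tok + new_tok) * PRICE  # first call writes the cache
    rest = (prefix_tok * (1 - discount) + new_tok) * (calls - 1) * PRICE
    return first + rest

print(f"${input_cost(8000, 500, 1000):.2f} vs "
      f"${input_cost(8000, 500, 1000, cached=True):.2f}")
```

With an 8,000-token prefix and 500 tokens of new content per call, caching cuts input spend by more than 6x in this scenario.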
Strategy 7: Monitor Your Spending
Check your AI provider's dashboard regularly:
- Anthropic: console.anthropic.com → Usage
- OpenAI: platform.openai.com → Usage
- Google: console.cloud.google.com → Billing
Set spending alerts to avoid surprises.
Real Impact
A typical user switching from Opus to Sonnet and adding token limits:
| Before | After |
|---|---|
| Opus, no limits | Sonnet, 1024 token limit |
| ~$80/month | ~$15/month |
| Savings | 81% |
Quality difference for daily tasks? Barely noticeable.
The Cheapest Setup
For absolute minimum cost:
- Use Gemini 2.5 Flash (~$0.15 per million tokens)
- Set token limit to 512
- Reduce proactive features
- Start new conversations frequently
Estimated monthly cost: $1-3 for moderate use.
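A quick sanity check on that estimate. The message volume and per-message token count below are rough assumptions for "moderate use" (the per-message figure covers resent context plus a capped reply):

```python
# Back-of-envelope check of the $1-3/month estimate at ~$0.15/M tokens.
# Usage figures are rough assumptions for "moderate use".
PRICE_PER_TOK = 0.15 / 1_000_000

def flash_monthly_cost(msgs_per_day, tokens_per_msg=4000, days=30):
    """Monthly spend: tokens_per_msg covers context resend plus the reply."""
    return msgs_per_day * tokens_per_msg * days * PRICE_PER_TOK

print(f"${flash_monthly_cost(100):.2f}/month")
```

Even at 100 messages a day, the total stays inside the $1-3 range under these assumptions.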
Get Started
Deploy OpenClaw on ClawTank and bring your own API key. You control your model choice and spending — ClawTank handles the infrastructure.
