Claude Sonnet 4.6
Anthropic's most capable Sonnet yet. Full upgrade across coding, long-context reasoning, agent planning, and design. 1M token context window in beta. Same pricing as Sonnet 4.5.
About Claude Sonnet 4.6
Sonnet 4.6 sits at the sweet spot where coding and agentic work get done without paying Opus prices. On SWE-bench Verified it scores 79.6%, 1.2 points behind Opus 4.6 (80.8%), at roughly a third of the cost, which is why developers running automated pipelines tend to reach for it first. The headline improvement is self-correction training: when a tool call fails, the model recognizes the failure and recovers rather than cycling through the same error. Users also praise the 1M-token context window for taking in entire codebases or large document sets in a single pass. The caveat is that the window has limits: retrieval quality degrades on adversarial tests beyond about 700K tokens, so vector-based RAG is still the safer bet for critical long-context searches. Speed is the other trade-off: at 44 tokens per second, it runs slower than the median for its tier, which can be noticeable in real-time applications. Still, for teams that need high-quality code generation, browser automation, and multi-step agentic workflows without Opus-level spend, Sonnet 4.6 is the practical default.
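To make the speed figure concrete, here is a minimal Python sketch that converts the quoted 44 tokens per second into rough generation times. The function name and the sample response sizes are illustrative assumptions, and the estimate ignores time to first token and network overhead, so it is best read as a rough floor rather than a measurement.

```python
# Decode rate quoted in the text above; real throughput varies with load and provider.
TOKENS_PER_SECOND = 44.0


def estimated_generation_seconds(output_tokens: int,
                                 tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Rough streaming time for a response, ignoring time to first token."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Sample sizes are hypothetical; 64,000 is the max output from the spec list.
    samples = [("short chat reply", 300), ("long code file", 4_000), ("max output", 64_000)]
    for label, tokens in samples:
        seconds = estimated_generation_seconds(tokens)
        print(f"{label}: {tokens} tokens ~ {seconds:.1f} s")
```

Under these assumptions a maximum-length 64,000-token response would stream for about 24 minutes, while a 300-token reply finishes in roughly 7 seconds.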
Best for
- Agentic coding pipelines — orchestrating file operations and shell commands with built-in error recovery from failed tool calls
- Web and computer-use automation — multi-step browser tasks, form filling, and spreadsheet navigation
- Long-context document work — processing large codebases, legal contracts, or research paper collections up to ~600K tokens reliably
- Real-time customer support and live coding assistants, provided below-median speed (about 44 tokens per second) is acceptable
- Multilingual document extraction — 96.4% field-level accuracy across 14 languages with structured JSON output
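The long-context guidance on this page can be turned into a simple routing rule. The sketch below is one possible reading, not an official recommendation: the function name, the strategy labels, and the idea of a middle "spot check" band are assumptions, while the thresholds (about 600K reliable, degradation past about 700K, 1M window) come from the figures quoted on this page.

```python
RELIABLE_SINGLE_PASS_TOKENS = 600_000   # "reliably" figure from the list above
DEGRADATION_THRESHOLD_TOKENS = 700_000  # adversarial retrieval degrades beyond this
CONTEXT_WINDOW_TOKENS = 1_000_000       # beta context window


def choose_context_strategy(corpus_tokens: int, retrieval_critical: bool = False) -> str:
    """Pick a hypothetical strategy label for a corpus of the given token size."""
    if corpus_tokens < 0:
        raise ValueError("corpus_tokens must be non-negative")
    if corpus_tokens > CONTEXT_WINDOW_TOKENS:
        return "rag"  # does not fit in the window at all
    if corpus_tokens <= RELIABLE_SINGLE_PASS_TOKENS:
        return "single_pass"
    if retrieval_critical or corpus_tokens > DEGRADATION_THRESHOLD_TOKENS:
        return "rag"  # prefer vector retrieval where misses are costly
    # Between the two thresholds and not critical: load everything, verify answers.
    return "single_pass_with_spot_checks"


if __name__ == "__main__":
    for size in (200_000, 650_000, 800_000):
        print(size, choose_context_strategy(size))
```

A real pipeline would also need to leave room for the prompt and the response inside the window, which this sketch does not model.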
Specs & capabilities
How Claude Sonnet 4.6 stacks up — intelligence, speed, context, and modalities.
Intelligence: Medium
Speed: Slow
Context window: 1,000,000 tokens
Max output: 64,000 tokens
Knowledge cutoff: April 2025
Frequently asked questions
What does it cost?
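Input is $3.00 per million tokens and output is $15.00 per million tokens. Prompt cache hits drop to $0.30 per million tokens, a 90% discount on the input price, with a 1-hour TTL option. Blended real-world cost runs around $2.31 per million tokens; because that is below the list input price, it assumes discounted usage such as heavy prompt caching.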
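As a worked example of the pricing above, the following Python sketch estimates the cost of a single request. The function names and the sample token counts are illustrative assumptions, and the calculation deliberately ignores cache-write charges and any batch or volume discounts, which this page does not quantify.

```python
# List prices quoted above, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00
CACHE_HIT_USD_PER_MTOK = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int, cache_hit_tokens: int = 0) -> float:
    """Cost of one request; input_tokens counts only uncached input."""
    if min(input_tokens, output_tokens, cache_hit_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK
             + cache_hit_tokens * CACHE_HIT_USD_PER_MTOK)
    return total / 1_000_000


def blended_usd_per_mtok(input_tokens: int, output_tokens: int, cache_hit_tokens: int = 0) -> float:
    """Average price per million tokens across everything processed."""
    tokens = input_tokens + output_tokens + cache_hit_tokens
    if tokens == 0:
        raise ValueError("at least one token is required")
    return estimate_cost_usd(input_tokens, output_tokens, cache_hit_tokens) / tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 100K tokens of context, 2K tokens of output.
    uncached = estimate_cost_usd(input_tokens=100_000, output_tokens=2_000)
    cached = estimate_cost_usd(input_tokens=10_000, output_tokens=2_000, cache_hit_tokens=90_000)
    print(f"no cache:  ${uncached:.3f}")   # $0.330
    print(f"90% cache: ${cached:.3f}")     # $0.087
```

In this hypothetical request, serving 90% of the context from cache cuts the bill from about $0.33 to about $0.087, which shows how a blended rate can fall below the $3.00 input price.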
How large is the context window?
1 million tokens in beta on the Claude API and in Claude Code on Pro, Max, Team, or Enterprise plans. Claude.ai paid plans have a 500K token cap. Max output per response is 64,000 tokens.
How does it compare to Claude Opus 4.6?
On coding (SWE-bench Verified: 79.6% vs 80.8%) and computer use (OSWorld: 72.5% vs 72.7%) the two models are nearly identical. The gap opens on deep scientific reasoning: GPQA Diamond is 74.1% for Sonnet versus 91.3% for Opus. Sonnet is roughly 1.7 to 5 times cheaper depending on I/O mix.
What is it genuinely weak at?
Long-chain reasoning that requires sustained depth is inconsistent compared to Opus 4.6. Context retrieval also degrades beyond ~700K tokens on adversarial tests. Extended thinking can be cost-inefficient, with diminishing returns past about 12–16K thinking tokens for most tasks.
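One way to act on the diminishing-returns point is to cap the thinking budget before sending a request. The Python sketch below is an assumption-laden illustration: the helper names are hypothetical, the 16,000-token cap is simply the upper end of the range quoted above, and the 1,024-token floor and the dictionary shape reflect the extended-thinking parameter of the Anthropic Messages API as commonly documented, which should be checked against current documentation for this model.

```python
SOFT_CAP_TOKENS = 16_000   # upper end of the 12-16K range quoted above
MIN_BUDGET_TOKENS = 1_024  # assumed API minimum; verify against current docs


def capped_thinking_budget(requested_tokens: int, soft_cap: int = SOFT_CAP_TOKENS) -> int:
    """Clamp a requested thinking budget into [MIN_BUDGET_TOKENS, soft_cap]."""
    if soft_cap < MIN_BUDGET_TOKENS:
        raise ValueError("soft_cap must be at least MIN_BUDGET_TOKENS")
    return max(MIN_BUDGET_TOKENS, min(requested_tokens, soft_cap))


def thinking_config(requested_tokens: int) -> dict:
    """Build a thinking parameter dict (assumed request shape, not verified here)."""
    return {"type": "enabled", "budget_tokens": capped_thinking_budget(requested_tokens)}


if __name__ == "__main__":
    print(thinking_config(50_000))  # budget clamped to the soft cap
    print(thinking_config(8_000))   # already under the cap, passed through
```

Tasks that demonstrably benefit from longer reasoning can pass a higher soft_cap; the default only encodes the general guidance above.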
Who should choose Sonnet 4.6 over a cheaper or pricier model?
Teams running agentic workflows, automated code pipelines, or high-volume document processing where Opus quality isn't needed but Haiku-level capability isn't enough. It is the default model for Free and Pro users on Claude.ai, which signals Anthropic's view of its general-purpose value.
When was it released and what is the knowledge cutoff?
Released February 17, 2026. Training data has a knowledge cutoff of April 2025.