Question 1

What does Claude Sonnet 4.6 cost?

Accepted Answer

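Input costs $3.00 per million tokens and output costs $15.00 per million tokens. Prompt cache hits are billed at $0.30 per million tokens, a 90% discount on the input rate, and a 1-hour cache TTL option is available. Blended real-world cost runs around $2.31 per million tokens, though the exact figure depends on the input/output mix and the cache hit rate.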
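To make the arithmetic concrete, here is a minimal sketch of a per-request cost estimate using only the three rates quoted above. The function name and its parameters are illustrative, not part of any SDK, and cache-write charges are left out because the answer does not quote them.

```python
# Per-million-token rates quoted in the answer above (USD).
INPUT_PER_MTOK = 3.00
OUTPUT_PER_MTOK = 15.00
CACHE_HIT_PER_MTOK = 0.30


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the portion of input_tokens served from the prompt cache.
    Cache-write surcharges are NOT modeled here.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_MTOK
        + cached_tokens * CACHE_HIT_PER_MTOK
        + output_tokens * OUTPUT_PER_MTOK
    ) / 1_000_000
    # Round to micro-dollars to hide floating-point noise.
    return round(total, 6)
```

For example, a request with 100,000 input tokens (80,000 of them cache hits) and 10,000 output tokens comes to about $0.234 under these rates.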

Question 2

How large is the context window?

Accepted Answer

The context window is 1 million tokens (in beta) on the Claude API and in Claude Code on Pro, Max, Team, or Enterprise plans. Claude.ai paid plans are capped at 500K tokens. The maximum output per response is 64,000 tokens.
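A rough pre-flight check against these limits might look like the sketch below. The surface names are made up for illustration, 500K is read as 500,000, and the check assumes that requested output tokens count against the same window as the prompt; none of these details are confirmed by the answer.

```python
# Limits quoted in the answer above; the dictionary keys are hypothetical labels.
CONTEXT_LIMITS = {
    "api_beta": 1_000_000,       # Claude API, 1M-token beta
    "claude_code": 1_000_000,    # Claude Code on Pro, Max, Team, or Enterprise
    "claude_ai_paid": 500_000,   # Claude.ai paid plans (500K read as 500,000)
}
MAX_OUTPUT_TOKENS = 64_000


def fits_context(prompt_tokens: int, max_output_tokens: int, surface: str = "api_beta") -> bool:
    """Return True if the request stays within the quoted limits.

    Assumption: prompt and output tokens share one context window.
    """
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_LIMITS[surface]
```

Under these assumptions, a 900,000-token prompt with a full 64,000-token output allowance fits the API beta window, while a 450,000-token prompt with the same allowance does not fit the Claude.ai cap.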

Question 3

How does it compare to Claude Opus 4.6?

Accepted Answer

On coding (SWE-bench: 79.6% for Sonnet vs 80.8% for Opus) and computer use (OSWorld: 72.5% vs 72.7%) the two models are nearly identical. The gap opens on deep scientific reasoning: GPQA Diamond is 74.1% for Sonnet versus 91.3% for Opus. Sonnet is roughly 1.7 to 5 times cheaper, depending on the input/output mix.
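The size of each gap is easier to see in percentage points. The short sketch below computes them from the scores quoted above, assuming the first number in every pair is Sonnet's, as the GPQA Diamond sentence implies.

```python
# (Sonnet 4.6, Opus 4.6) scores in percent, as quoted in the answer above.
SCORES = {
    "SWE-bench": (79.6, 80.8),
    "OSWorld": (72.5, 72.7),
    "GPQA Diamond": (74.1, 91.3),
}


def opus_lead(scores: dict) -> dict:
    """Return Opus minus Sonnet, in percentage points, for each benchmark."""
    return {name: round(opus - sonnet, 1) for name, (sonnet, opus) in scores.items()}


print(opus_lead(SCORES))
# {'SWE-bench': 1.2, 'OSWorld': 0.2, 'GPQA Diamond': 17.2}
```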

Question 4

What is it genuinely weak at?

Accepted Answer

Long-chain reasoning that requires sustained depth is less consistent than with Opus 4.6. Context retrieval also degrades beyond roughly 700K tokens on adversarial tests. Extended thinking can be cost-inefficient, with diminishing returns past about 12–16K thinking tokens for most tasks.
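One way to act on the diminishing-returns point is to cap the thinking budget before building a request. The sketch below only assembles a parameter dictionary and makes no network call. It assumes the extended-thinking parameter shape used by the Anthropic Messages API (`thinking` with `type` and `budget_tokens`, and `max_tokens` larger than the budget), a 1,024-token minimum budget, and a 16,000-token ceiling taken from the upper end of the range above. The model string is a placeholder, not a confirmed identifier.

```python
SOFT_CEILING = 16_000  # upper end of the 12-16K range quoted above (assumed exact value)
MIN_BUDGET = 1_024     # assumed minimum thinking budget accepted by the API


def thinking_params(requested_budget: int, answer_tokens: int = 4_000,
                    model: str = "MODEL_ID_PLACEHOLDER") -> dict:
    """Build request parameters with the thinking budget clamped (hypothetical helper)."""
    budget = max(MIN_BUDGET, min(requested_budget, SOFT_CEILING))
    return {
        "model": model,
        # Leave room for the visible answer on top of the thinking budget.
        "max_tokens": budget + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget},
    }
```

A caller asking for 50,000 thinking tokens would get a 16,000-token budget here. Tasks that measurably benefit from more thinking can raise the ceiling explicitly.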

Question 5

Who should choose Sonnet 4.6 over a cheaper or pricier model?

Accepted Answer

Teams running agentic workflows, automated code pipelines, or high-volume document processing where Opus quality isn't needed but Haiku throughput isn't enough. It is the default model for Free and Pro users on Claude.ai, which signals Anthropic's view of its general-purpose value.

Question 6

When was it released and what is the knowledge cutoff?

Accepted Answer

Released February 17, 2026. Training data has a knowledge cutoff of April 2025.
