
Claude Opus 4.6

Anthropic's most advanced Claude model. Exceptional emotional intelligence and warmth paired with adaptive thinking that scales to the complexity of your request.

About Claude Opus 4.6

Opus 4.6 is the model researchers and engineers reach for when a problem genuinely cannot be chunked: an entire codebase, a year's worth of literature, or a complex multi-part investigation loaded into a single session of up to roughly 750,000 words (1 million tokens). It tops Terminal-Bench 2.0 among frontier models for agentic coding tasks and leads BrowseComp for hard-to-locate information retrieval, reflecting a design philosophy built around sustained, autonomous work rather than quick exchanges. Scientists have noted roughly double the accuracy on computational biology and structural chemistry tasks versus its predecessor, Opus 4.5. The tradeoff is speed: at 38.8 tokens per second, against a frontier median of 62.3, it feels noticeably slow during interactive back-and-forth. The 1M-token window is also still in beta, and users report noticeable quality degradation at around 20-40% of the window, well before its ceiling. Best suited to high-stakes tasks where depth matters more than pace.

Best for

  • Large codebase analysis and refactoring — load entire repositories for holistic understanding across a single session
  • Scientific research synthesis — ~2x improvement over Opus 4.5 on biology, chemistry, and phylogenetics evaluations
  • Agentic and autonomous workflows — leads Terminal-Bench 2.0 for agentic coding with support for parallel agent teams
  • Hard-to-find information retrieval — best-in-class BrowseComp score for tracking down obscure or scattered information
  • Complex multi-step reasoning — adaptive thinking with configurable effort levels for nuanced problems that require sustained logic

Specs & capabilities

How Claude Opus 4.6 stacks up — intelligence, speed, context, and modalities.

  • Intelligence: High
  • Speed: Slow
  • Context window: 1,000,000 tokens
  • Max output: 128,000 tokens
  • Knowledge cutoff: August 2025

Frequently asked questions

What does it cost?

Standard context (up to 200K tokens) costs $5.00 per million input tokens and $25.00 per million output tokens. Using the extended 1M-token window doubles the input price to $10.00 per million, with output at $37.50 per million. Prompt caching brings reused-context costs down significantly.
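
As a rough illustration of the arithmetic, the rates above can be turned into a per-request estimate. This is a minimal sketch, not an official billing formula: the function name is hypothetical, it assumes the whole request is billed at the extended-window rates once the input exceeds 200K tokens, and it ignores prompt-caching discounts because their rates are not stated here.

```python
# Rates in USD per million tokens, taken from the FAQ answer above.
STANDARD_LIMIT = 200_000          # input tokens covered by standard pricing
STANDARD_RATES = (5.00, 25.00)    # (input, output)
EXTENDED_RATES = (10.00, 37.50)   # (input, output) for the 1M-token window


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the source): once the input exceeds the
    standard limit, every token in the request is billed at extended rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > STANDARD_LIMIT:
        in_rate, out_rate = EXTENDED_RATES
    else:
        in_rate, out_rate = STANDARD_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    print(estimate_cost_usd(100_000, 10_000))  # standard tier: 0.75
    print(estimate_cost_usd(500_000, 10_000))  # extended tier: 5.375
```

Under these assumptions, a 500K-token prompt costs about seven times as much as a 100K-token prompt with the same output length, so it is worth checking whether the extra context is actually needed.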

How large is the context window?

Up to 1 million tokens in beta — roughly 750,000 words per session. The standard, non-beta tier supports 200K tokens. Maximum output per request is 128K tokens.
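
The token-to-word figures above imply a ratio of about 0.75 words per token (750,000 words for 1 million tokens). The sketch below applies that ratio to the other limits mentioned on this page. The helper name is hypothetical, and the ratio is only a rule of thumb for English prose; real counts depend on the tokenizer and the content.

```python
# Ratio implied by "1 million tokens is roughly 750,000 words".
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int) -> int:
    """Convert a token count to an approximate word count."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return round(tokens * WORDS_PER_TOKEN)


if __name__ == "__main__":
    limits = [
        ("beta context", 1_000_000),
        ("standard context", 200_000),
        ("max output", 128_000),
    ]
    for label, tokens in limits:
        print(f"{label}: ~{approx_words(tokens):,} words")
```

By the same ratio, the 200K standard tier holds roughly 150,000 words and the 128K output limit roughly 96,000 words.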

What is it best at?

Deep research workflows, large-codebase reasoning, and agentic task execution. It holds the top score on Terminal-Bench 2.0 (agentic coding), BrowseComp (information retrieval), and GPQA Diamond (91.3% on PhD-level science questions).

Are there any notable weaknesses?

Output speed sits at 38.8 tokens per second, well below the frontier median of 62.3 t/s, which makes it feel slow for interactive conversations. Users have also reported that context quality degrades noticeably around 20-40% of the 1M-token window, not just at the ceiling.
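
To put those numbers in concrete terms, the sketch below converts the throughput figures into wait times and the 20-40% range into token counts. The function names are hypothetical, and the timing ignores time-to-first-token and network latency, so it is a lower bound on what a user would experience.

```python
OPUS_TPS = 38.8          # output tokens per second, from the answer above
MEDIAN_TPS = 62.3        # frontier median, from the answer above
BETA_WINDOW = 1_000_000  # tokens in the beta context window


def seconds_to_generate(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response at a steady output rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def degradation_range(window_tokens: int = BETA_WINDOW,
                      low_pct: int = 20, high_pct: int = 40):
    """Token counts at which users report quality starting to degrade."""
    return (window_tokens * low_pct // 100, window_tokens * high_pct // 100)


if __name__ == "__main__":
    # A 1,000-token reply: about 25.8 s here versus about 16.1 s at the median.
    print(f"{seconds_to_generate(1000, OPUS_TPS):.1f} s")
    print(f"{seconds_to_generate(1000, MEDIAN_TPS):.1f} s")
    print(degradation_range())  # (200000, 400000)
```

The reported degradation range therefore begins at about 200,000 tokens, which is also where the standard context tier ends.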

How does it compare to Claude Sonnet 4.6?

Opus 4.6 carries a higher intelligence index and leads on complex reasoning and agentic benchmarks, but costs significantly more and generates output more slowly. Sonnet 4.6 is the better fit for everyday tasks and higher-volume workflows where speed and cost matter.

What is the knowledge cutoff?

Training data runs through August 2025, but knowledge is most reliable up to May 2025. Events between May 2025 and the February 2026 release date fall outside what the model can speak to with confidence.

Related models