Model page

DeepSeek V4 Pro

DeepSeek-V4-Pro via Fireworks: flagship open-source 1.6T MoE model for frontier reasoning, advanced coding, and long-context agentic workflows. 1M context. Function calling supported. Uses 1 premium request per send before length multipliers. Does not support web search or image input.

About DeepSeek V4 Pro

DeepSeek V4 Pro makes a compelling case that frontier-class coding performance and a one-million-token context window do not have to cost frontier-class money. At roughly $0.18 per million tokens blended, it runs 10x cheaper on input and 30x cheaper on output than comparable models, while posting an 80.6% score on SWE-Bench Verified — the highest reported among open-weight models at launch. Users consistently praise its agentic coding ability, noting it competes with or beats larger closed models on multi-step coding tasks, and its hybrid attention architecture handles full-codebase analysis without collapsing under the token budget. The MIT license is a genuine differentiator: weights are freely available for self-hosting, fine-tuning, and commercial integration. The honest caveat: V4 Pro is verbose. It can generate four to five times more output tokens than comparable models on the same prompt, which erodes the per-token savings and makes cost estimation harder than it first appears. Still in preview as of mid-2026, with all benchmark scores currently vendor-reported, it is best suited for teams comfortable with that tradeoff.

Best for

  • High-volume automated coding pipelines, code review, and refactoring where per-call cost matters
  • Full-codebase analysis using 1M-token context for migration planning or architectural review
  • Multi-turn agentic workflows where reasoning must persist across tool calls and conversation turns
  • Long-document synthesis, research corpora summarization, and extraction from large structured data
  • On-premises or self-hosted deployment via MIT-licensed weights with custom fine-tuning

Specs & capabilities

How DeepSeek V4 Pro stacks up — intelligence, speed, context, and modalities.

Capability

Intelligence

High

Capability

Speed

Medium

Capability

Context window

1,048,600 tokens

Capability

Max output

384,000 tokens

Capability

Knowledge cutoff

April 2026

Modalities

Input and output

Input: Text
Output: Text

Features

Availability notes

Cached input: $0.14 / 1M tokens · 1 premium request per send before length multipliers · Function calling supported · Fine-tuning not supported on Fireworks serverless · On-demand Fireworks deployments available separately

Frequently asked questions

How much does DeepSeek V4 Pro cost?

Input is $0.435 per million tokens and output is $0.87 per million tokens, with a 99% discount on cache hits ($0.003625/M). The blended effective price is roughly $0.18/M tokens. DeepSeek made a 75% discount permanent in May 2026.

What is the context window?

1,048,576 tokens (approximately 1 million tokens), with a maximum output of 384,000 tokens.

Are the benchmark scores independently verified?

Not yet as of mid-2026. All published scores — including 80.6% SWE-Bench Verified and 90.1% GPQA Diamond — come from DeepSeek's internal evaluations. Independent third-party leaderboard verification is pending.

What is V4 Pro worst at?

It produces significantly more output tokens than most models on the same prompts — sometimes 4 to 5 times more — making real costs harder to predict. It also has a reliability gap versus top closed models on the most complex edge-case reasoning tasks, and it censors politically sensitive topics related to Chinese governance.

Who should pick DeepSeek V4 Pro over a proprietary frontier model?

Teams running cost-sensitive, high-volume coding or document workloads who can self-host or accept the API's privacy trade-offs, and who want MIT-licensed weights for fine-tuning or commercial integration.

Is V4 Pro stable or still in preview?

It remains a preview release as of June 2026. DeepSeek has not announced a stable release date.

Related models