
o4 mini

Lean o-series model for high-volume creative projects.

About o4 mini

o4-mini stood out for a rare combination: the math and coding strength of a serious reasoning model at roughly one-tenth the cost of its sibling, o3. On the 2024 and 2025 AIME competitions it edged out o3 (92.7% vs. 88.9% on AIME 2025) while generating 186.5 tokens per second, which made it one of the faster options in its class for high-volume workloads. It also accepts image inputs, extending its reasoning to visual tasks like document analysis and design review. Users appreciated how far the budget stretched, especially for batch API automation, where the 50% Batch API discount made costs even easier to justify. The honest tradeoff: Artificial Analysis scored its general intelligence below the median for its tier, and a 22-second time-to-first-token makes it a poor fit for snappy interactive experiences. It has since been retired from ChatGPT (February 2026), and API access continues until October 23, 2026, so new projects should plan to migrate to a successor.
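To see what the latency figures above mean in practice, here is a rough back-of-the-envelope sketch. It assumes the 186.5 tokens-per-second figure is steady output throughput and that the 22-second time-to-first-token is a fixed overhead per request. Both are simplifications, and the function name is purely illustrative.

```python
# Rough latency estimate from the figures quoted on this page.
# Assumption: total time = time-to-first-token + output_tokens / throughput.
TTFT_SECONDS = 22.0          # time-to-first-token quoted above
TOKENS_PER_SECOND = 186.5    # output speed quoted above


def estimated_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock seconds for a single response (hypothetical helper)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (200, 2_000, 20_000):
        print(f"{n:>6} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under these assumptions the fixed 22-second wait dominates short answers, which is why the model suits batch jobs better than chat-style interfaces.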

Best for

  • High-volume batch processing and API automation where cost per token matters most
  • Math-intensive tasks — AIME-level problem solving, quantitative reasoning, and exam-style challenges
  • Coding assistance and code review, backed by strong SWE-bench (68.1%) and Codeforces Elo (2719) scores
  • Vision-augmented reasoning over documents, charts, or images, using its multimodal input support
  • Long-document analysis — 200k-token context handles entire codebases, research papers, or extended transcripts

Specs & capabilities

How o4 mini stacks up — intelligence, speed, context, and modalities.

  • Intelligence: Medium
  • Speed: Fast
  • Context window: 200,000 tokens
  • Max output: 100,000 tokens
  • Knowledge cutoff: June 1, 2024

Frequently asked questions

How much does o4-mini cost?

Input is $1.10 per million tokens and output is $4.40 per million tokens. A Batch API option cuts both prices by 50%.
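As a quick illustration of how these prices add up, the following minimal sketch estimates the cost of a job from its token counts. The function name and structure are illustrative only, and it assumes that `output_tokens` already includes any tokens billed as output.

```python
# Cost estimate from the per-million-token prices quoted above.
INPUT_USD_PER_MILLION = 1.10
OUTPUT_USD_PER_MILLION = 4.40
BATCH_DISCOUNT = 0.50  # Batch API cuts both prices by 50%


def estimate_cost_usd(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Return the estimated cost in USD (hypothetical helper, not an official calculator)."""
    cost = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    # Example: 10,000 requests, each with 2,000 input and 500 output tokens.
    total_in, total_out = 10_000 * 2_000, 10_000 * 500
    print(f"standard: ${estimate_cost_usd(total_in, total_out):.2f}")
    print(f"batch:    ${estimate_cost_usd(total_in, total_out, batch=True):.2f}")
```

For the example workload (20 million input tokens and 5 million output tokens), this works out to $44.00 at standard rates and $22.00 through the Batch API.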

What is the context window?

200,000 tokens in total, with a maximum of 100,000 output tokens per response.
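A small sketch of how these two limits interact when sizing a request. It assumes the 200,000-token window is shared between the prompt and the response, which is how context windows typically work but is not spelled out on this page. The helper name is hypothetical.

```python
# Output budget under the limits quoted above.
# Assumption: prompt and response tokens share one 200,000-token window.
CONTEXT_WINDOW = 200_000
MAX_OUTPUT = 100_000


def output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens remain available for a prompt of the given size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Capped by the per-response maximum; never negative.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (50_000, 150_000, 250_000):
        print(f"prompt of {prompt:,} tokens leaves {output_budget(prompt):,} for output")
```

Under this assumption, prompts up to 100,000 tokens leave the full output allowance available, and longer prompts reduce it token for token.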

How does it compare to o3?

o4-mini is about 10x cheaper than o3 and actually beats o3 on AIME math benchmarks, but scores below o3 on general intelligence indices and trails it on complex visual reasoning tasks like scientific figure interpretation.

Is o4-mini still available?

It was retired from ChatGPT on February 13, 2026. API access continues until October 23, 2026. For new projects, OpenAI recommends migrating to GPT-5.4 mini or o3-mini.

What does o4-mini struggle with?

General reasoning breadth, slow first responses (a time-to-first-token of over 22 seconds), and verbose outputs that can be hard to skim. It also cannot be fine-tuned.

Who should choose o4-mini?

Teams running math, coding, or document-processing workloads at scale who need strong domain performance without o3's price tag — provided they can tolerate the model's sunset timeline and don't need sub-second responsiveness.

Related models