Model page

GPT-5.4 nano

Cheapest GPT-5.4-class model for simple high-volume tasks such as extraction, ranking, and lightweight subagents.

About GPT-5.4 nano

At roughly $0.18 per million tokens blended, GPT-5.4 nano occupies a deliberate efficiency tier in OpenAI's GPT-5.4 family — not a stripped-down mini, but a purpose-built layer for systems that need speed and volume rather than depth. It runs at 146.5 tokens per second, well above the peer median, and its MATH-500 score of 88.6% makes it surprisingly capable for calculation-heavy pipelines. Developers building classification engines, extraction jobs, and routing layers consistently cite cost and throughput as the main reasons to reach for nano over its larger siblings. That said, it is not a generalist. Users have found real gaps: XML and LaTeX structured output degrades sharply compared to JSON, spatial reasoning tasks expose clear limits, and its 6-second time-to-first-token is notably higher than comparable models. For agentic workflows requiring complex image interpretation or multi-step reasoning, the larger GPT-5.4 mini will serve better. Nano earns its place as a high-volume component in layered systems, not a sole workhorse.

Best for

  • Data extraction and document parsing at scale, where per-token cost and throughput matter more than nuanced comprehension
  • Text classification and routing — email triage, ticket categorization, content moderation, and query labeling in high-frequency pipelines
  • JSON-structured output workflows, including form extraction and API-feeding tasks that depend on reliable structured generation
  • Lightweight coding subagents within larger orchestration systems where nano handles simple generation and refactoring passes cheaply
  • Math and calculation-heavy tasks where its 88.6% MATH-500 performance adds genuine value without the cost of a larger model

Specs & capabilities

How GPT-5.4 nano stacks up — intelligence, speed, context, and modalities.

Capability

Intelligence

Medium

Capability

Speed

Medium

Capability

Context window

400,000 tokens

Capability

Max output

128,000 tokens

Capability

Knowledge cutoff

August 31, 2025

API

Supported endpoints

v1/chat/completions · v1/responses · v1/realtime · v1/assistants · v1/batch

Modalities

Input and output

Input: Text, Image
Output: Text

Features

Availability notes

Cached input: $0.02 / 1M tokens · Web search, file search, image generation, code interpreter, hosted shell, apply patch, skills, and MCP supported · Computer use and tool search are not supported · Fine-tuning not supported; distillation supported

Frequently asked questions

What does GPT-5.4 nano cost?

Input is $0.20 per million tokens and output is $1.25 per million tokens, putting the blended effective cost at around $0.18 per million tokens — the cheapest option in the GPT-5.4 family.

What is the context window?

400,000 tokens input with a maximum of 128,000 tokens output.

What is nano best at?

High-volume classification, data extraction, JSON-structured output, ranking, and serving as a fast, cost-efficient subagent layer inside larger systems. It also holds up well on mathematical tasks.

Where does nano fall short?

Spatial and 3D reasoning, complex agentic computer-use tasks (39% on OSWorld-Verified), XML and LaTeX structured output, and competitive math olympiad problems like AIME where scores sit at 26–30%.

How does nano compare to GPT-5.4 mini?

Nano is cheaper and faster in tokens-per-second throughput, but mini handles complex generation, deeper image understanding, and harder reasoning tasks more reliably. Nano is designed as an efficiency tier within a multi-model system, not a replacement for mini.

Is GPT-5.4 nano available in ChatGPT?

No. It is API-only and is not accessible through ChatGPT consumer products. Using it requires direct developer integration.

Related models