GPT-5 nano
Ultra-lightweight assistant for simple, high-volume tasks.
About GPT-5 nano
When the goal is to process millions of tokens at the lowest possible cost, GPT-5 nano delivers. At $0.05 per million input tokens, one-fifth the price of GPT-5 mini, it is purpose-built for high-volume, latency-tolerant workloads: classifying support tickets, extracting structured fields from documents, routing messages, and running bulk summarization over long documents within its 400K-token context window. Developers running these pipelines get throughput of 163.8 tokens per second, above average for this tier, and an 80% prompt-caching discount that makes repeated-context jobs cheaper still. The tradeoff is latency: time-to-first-token sits around 84 seconds, far above the median of roughly one second, so interactive use is a poor fit. The model also tends toward verbosity, generating far more tokens than similarly sized peers, which adds to output cost. If your workload is batch-oriented and unit cost per operation matters more than conversational snappiness, nano earns its place. For tasks that require nuanced reasoning or real-time responses, step up to GPT-5 mini.
Best for
- High-volume document classification and content labeling at scale
- Multi-document summarization using its full 400K-token context window
- Structured data extraction from invoices, contracts, and unstructured text
- Automated content routing and support-ticket triage
- Bulk translation and text transformation pipelines where cost per token is the primary constraint
Specs & capabilities
How GPT-5 nano stacks up: intelligence, speed, context, and modalities.
Intelligence: Low
Speed: Fast
Context window: 400,000 tokens
Max output: 128,000 tokens
Knowledge cutoff: May 30, 2024
Frequently asked questions
How much does GPT-5 nano cost?
Input costs $0.05 per million tokens and output costs $0.40 per million tokens. Cached input receives an 80% discount, dropping to $0.01 per million tokens, a significant saving for workflows that reuse the same long context across calls.
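To make the pricing concrete, here is a minimal cost-estimation sketch in Python. It uses only the per-million-token prices quoted in this answer. The function name, the workload figures (10,000 tickets, 2,000 input tokens each, 1,500 of them cached, 100 output tokens each), and the assumption that cached tokens are billed as a subset of input tokens are illustrative, not taken from OpenAI documentation.

```python
# Prices in USD per million tokens, as quoted on this page.
INPUT_PER_M = 0.05
CACHED_INPUT_PER_M = 0.01  # input price after the 80% caching discount
OUTPUT_PER_M = 0.40


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate cost in USD.

    cached_input_tokens is the portion of input_tokens assumed to be
    billed at the cached rate (an assumption of this sketch).
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical workload: 10,000 tickets, each with 2,000 input tokens
# (1,500 of them a shared, cached prompt) and 100 output tokens.
tickets = 10_000
cost = estimate_cost(
    input_tokens=tickets * 2_000,
    output_tokens=tickets * 100,
    cached_input_tokens=tickets * 1_500,
)
print(f"${cost:.2f}")  # prints "$0.80"
```

In this hypothetical workload, output tokens account for half of the bill even though they are a small share of total tokens, which is why the model's verbosity is worth watching.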
What is the context window?
GPT-5 nano has a 400,000-token context window and can generate up to 128,000 output tokens per response, making it well suited to processing full documents or long conversation histories in a single call.
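A quick pre-flight check can catch oversized requests before they are sent. The sketch below is a conservative one: it assumes the 400,000-token window is shared between the prompt and the generated output, which this page does not state explicitly. The function names are hypothetical, and token counts are expected to come from whatever tokenizer you already use.

```python
CONTEXT_WINDOW = 400_000  # tokens, from the specs on this page
MAX_OUTPUT = 128_000      # tokens, from the specs on this page


def fits_in_context(prompt_tokens, max_output_tokens):
    """Return True if the request fits, assuming prompt and output
    share the same context window (an assumption of this sketch)."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW


def max_prompt_tokens(max_output_tokens):
    """Largest prompt that still leaves room for the requested output."""
    if not 0 <= max_output_tokens <= MAX_OUTPUT:
        raise ValueError("max_output_tokens must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - max_output_tokens


print(fits_in_context(350_000, 4_000))  # prints "True"
print(max_prompt_tokens(128_000))       # prints "272000"
```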
Is GPT-5 nano good for real-time chat applications?
No. Its time-to-first-token averages around 84 seconds, far above the median of about one second for comparable models. It is better suited to batch and asynchronous workloads than to interactive, low-latency use cases.
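For planning batch jobs, a rough first-order latency model is time-to-first-token plus output length divided by throughput. The sketch below uses the averages quoted on this page (84 seconds and 163.8 tokens per second). The function names, the fixed-concurrency "wave" model, and the example workload are assumptions for illustration; real latency will vary with load and rate limits.

```python
import math

TTFT_SECONDS = 84.0        # average time-to-first-token quoted on this page
TOKENS_PER_SECOND = 163.8  # output throughput quoted on this page


def estimate_request_seconds(output_tokens):
    """Rough end-to-end time for one request: wait for the first token,
    then stream the rest at a constant rate (a simplifying assumption)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


def estimate_batch_seconds(num_requests, output_tokens, concurrency):
    """Rough wall-clock time when requests run in fixed-size waves of
    `concurrency` parallel calls (a hypothetical scheduling model)."""
    if num_requests < 0 or concurrency < 1:
        raise ValueError("need num_requests >= 0 and concurrency >= 1")
    waves = math.ceil(num_requests / concurrency)
    return waves * estimate_request_seconds(output_tokens)


print(f"{estimate_request_seconds(500):.1f} s")  # prints "87.1 s"
hours = estimate_batch_seconds(10_000, 500, concurrency=50) / 3600
print(f"{hours:.1f} h")  # prints "4.8 h"
```

Under these numbers the fixed 84-second wait dominates the per-request time, so higher concurrency does far more for total batch time than shorter outputs do.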
How does it compare to GPT-5 mini?
GPT-5 nano is roughly 5x cheaper on blended pricing and faster in raw token throughput, but trails GPT-5 mini by around 10 percentage points on structured tasks and is meaningfully weaker at multi-step reasoning. Choose nano for bulk pipelines where cost dominates; choose mini when accuracy or reasoning depth matters more.
What is GPT-5 nano's knowledge cutoff?
Training data ends May 30, 2024. Events and developments after that date are outside its knowledge, which matters for tasks that depend on current information.
Can I use GPT-5 nano in the ChatGPT web interface?
No. Unlike GPT-5 mini, nano is only accessible via the OpenAI API. It is not available in the ChatGPT consumer interface.