GPT-5.1 latest
GPT-5.1 model used in ChatGPT. Continuously updated for the latest chat improvements.
About GPT-5.1 latest
Where GPT-5.1 earns its keep is efficiency: it dynamically allocates compute based on query complexity, running simple tasks 2-3x faster than its predecessor by skipping unnecessary reasoning steps. The result is a conversational model with a noticeably warmer personality than GPT-5 and markedly improved instruction-following — users consistently report it feels more natural to talk to, and it handles document-heavy work like extraction, comparison, and code review with real precision. Extended prompt caching (up to 24 hours at a 90% discount) makes it cost-effective for workflows with repeated context. Benchmarks are solid but not top-of-class: AIME 2025 at 94% and a strong SWE-bench position, though it trails GPT-5 slightly on raw reasoning scores. One genuine frustration worth knowing: reasoning is disabled by default and must be manually enabled, and the model takes instructions literally — vague prompts produce literal outputs without self-correction. At $1.25 per million input tokens, it sits at a reasonable price point for teams that need reliable, fast, multimodal chat at scale.
Best for
- Document analysis workflows — extraction, summarization, comparison, and risk flagging across large document sets
- Code review and software engineering tasks, where its SWE-bench performance and instruction-following translate to practical output
- High-volume conversational applications where the 2-3x speed boost on simple queries keeps latency low
- Cost-sensitive API integrations that benefit from 24-hour prompt caching at a 90% input discount
- Image-grounded tasks such as document processing, equipment inspection, or visual content analysis
Specs & capabilities
How GPT-5.1 latest stacks up — intelligence, speed, context, and modalities.
Intelligence
High
Speed
Medium
Context window
128,000 tokens
Max output
16,384 tokens
Knowledge cutoff
September 30, 2024
Frequently asked questions
What does 'gpt-5.1-chat-latest' refer to?
It is the chat-optimized alias for GPT-5.1, a family OpenAI released in two waves in November 2025. The alias always points to the current chat variant; a separate Codex variant (gpt-5.1-codex) exists with a larger 400,000-token context window.
What is the context window and output limit?
The chat variant supports a 128,000-token context window with a maximum output of 16,384 tokens per response.
How is it priced?
Input is $1.25 per million tokens; output is $10.00 per million tokens. Cached input drops to $0.125 per million tokens — a 90% discount — with cache retention up to 24 hours.
Is GPT-5.1 a major capability upgrade over GPT-5?
Not in terms of raw capability. OpenAI characterizes it as an efficiency improvement: faster adaptive reasoning, better instruction-following, and a warmer conversational tone rather than a benchmark leap. GPT-5's AIME score (94.6%) actually edges out GPT-5.1's 94%.
What is the knowledge cutoff?
September 30, 2024 — roughly ten months before the model's November 2025 release, which means recent events may not be reflected in its responses.
Who should choose GPT-5.1 over other models?
Teams building document-heavy or conversational products that prioritize speed, cost efficiency via caching, and reliable instruction-following. If raw reasoning depth is the priority, enabling the reasoning_effort parameter or considering the Thinking variant is advisable, since reasoning is off by default in the chat build.