GPT-5.3-Codex
The most capable agentic coding model to date. Optimized for agentic coding tasks in Codex or similar environments. 400K context, 128K max output. Reasoning off by default.
About GPT-5.3-Codex
Built to write, debug, and ship code autonomously, GPT-5.3-Codex is OpenAI's purpose-built agentic coding model, the one the Codex team itself used to manage deployment and diagnose test failures during the model's own training. Where it earns loyalty is in sustained, multi-step work: users who push it through repository searches, PR creation, debugging sessions, and DevOps tasks report that it "feels smarter" and say it has become their default coding tool. It runs 25% faster than its predecessor, scores 57% on SWE-Bench Pro, and at $14.00 per million output tokens costs roughly half as much on output as GPT-5.5. The tradeoff is real, though: first-token latency sits around 80 seconds, far above the norm for reasoning models, and the model can go off-script, making autonomous decisions that ignore architectural guidelines. It is a specialist, not a generalist, and as of June 2026 OpenAI has announced its sunset from Codex subscription offerings, so availability may be limited.
Best for
- Agentic coding workflows that span multiple steps — repository search, terminal commands, debugging, and PR creation in a single session
- Interactive development where you steer the model mid-execution and keep it on track without losing context
- Code review, refactoring, and optimization across languages where detailed, verbose reasoning is welcome
- DevOps and infrastructure tasks requiring system-level command execution and deployment automation
- Long-running research and development sessions involving complex tool-call sequences
Specs & capabilities
How GPT-5.3-Codex stacks up — intelligence, speed, context, and modalities.
Intelligence: High
Speed: Medium
Context window: 400,000 tokens
Max output: 128,000 tokens
Knowledge cutoff: August 31, 2025
Frequently asked questions
What does GPT-5.3-Codex cost?
Input is $1.75 per million tokens, output is $14.00 per million tokens, and cached input drops to $0.175 per million tokens. That puts its output price at roughly half of GPT-5.5's.
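To make the per-token prices concrete, here is a minimal Python sketch of a cost estimate for a single request. The three rates come from the answer above; the function name `estimate_cost`, its parameters, and the assumption that cached tokens are a subset of the input tokens are illustrative choices, not part of any official billing API.

```python
# Prices in USD per million tokens, as quoted in the answer above.
INPUT_PER_M = 1.75
CACHED_INPUT_PER_M = 0.175
OUTPUT_PER_M = 14.00


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (hypothetical billing model): cached_input_tokens is the part
    of input_tokens billed at the cached rate; the rest is billed at the
    regular input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")

    uncached = input_tokens - cached_input_tokens
    total = (uncached * INPUT_PER_M
             + cached_input_tokens * CACHED_INPUT_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # 200K input tokens (150K of them cached) and 20K output tokens.
    cost = estimate_cost(200_000, 20_000, cached_input_tokens=150_000)
    print(f"${cost:.2f}")  # prints "$0.39"
```

In this example the 20K output tokens account for $0.28 of the roughly $0.39 total, which is why verbose responses weigh more heavily on the bill than large prompts.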
How large is the context window?
400,000 tokens, with a maximum output of 128,000 tokens. OpenAI engineered a 'Perfect Recall' mechanism to reduce information degradation across that extended context.
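As a rough illustration of how the two limits interact, the Python sketch below computes how much output a request can ask for given a prompt size. It assumes that prompt and output tokens share the 400,000-token window, which the page does not state explicitly; the helper name `output_budget` is hypothetical.

```python
CONTEXT_WINDOW = 400_000  # tokens, from the answer above
MAX_OUTPUT = 128_000      # tokens, from the answer above


def output_budget(prompt_tokens: int) -> int:
    """Largest output request that fits alongside the prompt.

    Assumption: prompt and output tokens are counted against the same
    context window, so the budget is capped by both limits.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    print(output_budget(100_000))  # prints 128000: capped by the max output
    print(output_budget(300_000))  # prints 100000: capped by the window
    print(output_budget(450_000))  # prints 0: the prompt alone is too large
```

Under that assumption, any prompt longer than 272,000 tokens leaves less than the full 128,000-token output allowance.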
What is it best at?
Agentic coding: multi-step tasks that combine code generation, tool use, terminal access, and decision-making. It scored 57% on SWE-Bench Pro and 64.7% on OSWorld-Verified.
What are its main weaknesses?
Time-to-first-token averages around 80 seconds, about 29 times the roughly 2.75-second median for comparable reasoning models. It can also make autonomous decisions outside the scope of what you asked, and its verbose output raises costs because output tokens are the most expensive part of a request.
How does it compare to GPT-5.5?
GPT-5.3-Codex is purpose-built for coding and agentic tasks and costs roughly half as much on output. GPT-5.5 is the stronger choice for general-purpose reasoning and broader applications.
Is GPT-5.3-Codex still available?
As of June 2026, OpenAI has announced the sunset of GPT-5.3-Codex from Codex subscription offerings, so access may be limited or subject to change. Check OpenAI's current model availability for the latest status.