GPT-4.1
A refinement of the GPT-4 line, designed for coding and broad tool compatibility.
About GPT-4.1
GPT-4.1 was built with developers in mind. On partial code edits, where GPT-4o reached 18.3% accuracy, GPT-4.1 reaches 52.9%, and its SWE-Bench Verified score of 54.6% is roughly 21 percentage points above its predecessor's. Paired with a 1 million token context window, it can hold an entire codebase in a single conversation without losing the thread across files or modules. Developers report needing fewer re-prompts, less manual correction, and noticeably tighter instruction-following in agentic workflows.
It is also fast: output runs at roughly 130 tokens per second, more than double the field median, with latency under 0.8 seconds. At $2 input and $8 output per million tokens, it is noticeably cheaper than previous-generation equivalents.
The caveats: hallucinations remain a documented limitation, and like prior GPT models it can fabricate sources. The full 1M context is also API-only; the ChatGPT web interface caps at 32K tokens. For teams building code agents or processing large technical documents, GPT-4.1 is a meaningful step up in both capability and economics.
Best for
- Software engineering agents — resolving GitHub issues, refactoring, and generating optimized code across multiple languages
- Long-context document analysis — processing full codebases, lengthy specs, or multi-document research without chunking
- Rapid prototyping — cost-effective option for developers adding AI-powered features to their products
- Enterprise automation workflows — autonomous task execution in operations and engineering pipelines
- Multilingual coding — debugging and generating code across programming languages with strong instruction adherence
Specs & capabilities
How GPT-4.1 stacks up — intelligence, speed, context, and modalities.
Intelligence: Low
Speed: Medium
Context window: 1,000,000 tokens
Max output: 32,768 tokens
Knowledge cutoff: May 2024
Frequently asked questions
What does GPT-4.1 cost?
$2.00 per million input tokens and $8.00 per million output tokens via the OpenAI API.
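To make the per-token pricing concrete, here is a minimal sketch of a per-request cost estimate. The two prices are the ones quoted in this answer; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and real bills may differ (for example, if cached-input or batch discounts apply).

```python
# Prices quoted in this FAQ, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = 2.00
OUTPUT_PRICE_PER_MILLION = 8.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Illustrative request: a 200,000-token prompt with a 4,000-token reply.
example_cost = estimate_cost_usd(200_000, 4_000)
print(f"Estimated cost: ${example_cost:.3f}")  # Estimated cost: $0.432
```

Because output tokens cost four times as much as input tokens, long generated replies can dominate the bill even when the prompt is far larger.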
How large is the context window?
About 1 million tokens (1,047,576 exactly) via the API. The ChatGPT web interface is capped at 32,000 tokens, so the full window is available only programmatically.
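A quick budget check can show whether a prompt fits before it is sent. The sketch below uses the 1,047,576-token window from this answer and the 32,768-token max output from the specs. It assumes the prompt and the reserved output share the same window, and it takes token counts as inputs instead of calling a tokenizer. The helper names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_047_576  # API context window quoted in this FAQ
MAX_OUTPUT_TOKENS = 32_768         # max output quoted in the specs


def max_prompt_tokens(reserved_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Largest prompt that still leaves room for the reserved output.

    Assumption: prompt and output tokens count against one shared window.
    """
    if not 0 <= reserved_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved output must be between 0 and the max output")
    return CONTEXT_WINDOW_TOKENS - reserved_output_tokens


def fits_in_context(prompt_tokens: int,
                    reserved_output_tokens: int = MAX_OUTPUT_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return prompt_tokens <= max_prompt_tokens(reserved_output_tokens)


print(max_prompt_tokens())         # 1014808
print(fits_in_context(1_000_000))  # True
print(fits_in_context(1_020_000))  # False
```

Reserving the full output budget is the conservative choice; a smaller reservation leaves more room for the prompt when replies are known to be short.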
How does it compare to GPT-4o for coding?
It scores 54.6% on SWE-Bench Verified versus GPT-4o's 33.2%, a gain of about 21 percentage points, and reaches 52.9% accuracy on partial code edits compared to GPT-4o's 18.3%.
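Percentage-point gains and relative gains are easy to confuse, so the arithmetic behind these figures is sketched below. The four scores come from this answer; `compare_scores` is a hypothetical helper written for illustration.

```python
def compare_scores(new_pct: float, old_pct: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    point_gain = new_pct - old_pct
    relative_gain_pct = point_gain / old_pct * 100
    return round(point_gain, 1), round(relative_gain_pct, 1)


# Scores quoted in this FAQ: GPT-4.1 first, GPT-4o second.
swe_bench = compare_scores(54.6, 33.2)
partial_edits = compare_scores(52.9, 18.3)

print(swe_bench)      # (21.4, 64.5)
print(partial_edits)  # (34.6, 189.1)
```

On these numbers, the SWE-Bench Verified gain is 21.4 points (about 64.5% relative to GPT-4o), and the partial-edit gain is 34.6 points, which puts GPT-4.1 at nearly three times GPT-4o's accuracy on that task.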
What are its main limitations?
Hallucinations and source fabrication remain documented issues. Some users also report inconsistent latency under heavy workloads.
Who should choose GPT-4.1 over GPT-4.1 mini?
Teams that need maximum coding accuracy and long-context reasoning at scale. GPT-4.1 mini offers a balanced trade-off for lighter workloads at lower cost.
What is the knowledge cutoff?
Approximately May 2024, with some sources indicating June 2024 for certain variants in the family.