GPT-4o · May 2024
May 2024 checkpoint of gpt-4o for that special voice.
About GPT-4o · May 2024
Speed and native multimodality define the May 2024 launch of GPT-4o — the original "omni" checkpoint that set a new bar for how fast a frontier model could respond without sacrificing versatility. Running a single unified neural network across text, images, audio, and video (no separate processors bolted together), it delivers around 100 tokens per second, well above the field median of 61. Users consistently praise its natural, conversational warmth and its strong image understanding, making it a go-to for real-time chat, visual document analysis, and multilingual workflows. That said, raw reasoning is not its strength: an AIME score of just 12% reflects meaningful limits on complex math and logic, and its Intelligence Index sits below the median for non-reasoning models. This is a checkpoint built for throughput and multimodal coverage — for teams who need reliable, expressive, fast responses rather than deep step-by-step reasoning. Note: this is the original May 13, 2024 release; later dated variants (August and November 2024) include cumulative improvements.
Best for
- Real-time chat and conversational applications where low latency and a warm, natural tone matter
- Visual tasks including image analysis, OCR, and scientific or document image interpretation
- High-volume production chatbots where throughput and speed outweigh the need for deep reasoning
- Multilingual and translation-heavy workflows, benefiting from improved non-English performance
- Multimodal content pipelines that mix text, images, and audio in a single API call
Specs & capabilities
How GPT-4o · May 2024 stacks up — intelligence, speed, context, and modalities.
Intelligence
Low
Speed
Medium
Context window
128,000 tokens
Max output
4,100 tokens
Knowledge cutoff
October 2023
GPT‑4o retirement (ChatGPT)
OpenAI retired GPT‑4o inside ChatGPT on February 13, 2026. It remains available through the OpenAI API.
Frequently asked questions
What does the -2024-05-13 suffix mean?
It pins this request to the original May 13, 2024 release of GPT-4o. Later checkpoints like -2024-08-06 and -2024-11-20 include cumulative changes; using this ID guarantees the exact original version.
What is the context window?
128,000 tokens of input context, with a maximum output of 4,100 tokens per response.
What does it cost?
At launch it was $5.00 per million input tokens and $15.00 per million output tokens. OpenAI dropped those prices 50% in October 2024 to $2.50 input / $10.00 output. Check current provider pricing, as rates may have changed since.
Is this a reasoning model like o1 or o3?
No. Despite the shared 'o' in the name, GPT-4o is a speed- and multimodal-focused model, not a chain-of-thought reasoning model. For complex math or multi-step logic, OpenAI's o-series models are better suited.
How does it compare to GPT-4 Turbo?
GPT-4o is roughly 2x faster than GPT-4 Turbo and supports higher rate limits, while maintaining comparable intelligence. It also adds native image, audio, and video inputs through a unified architecture rather than separate processors.
What is its knowledge cutoff?
October 2023, which is earlier than GPT-4 Turbo's April 2024 cutoff. Queries about events after that date will be outside its training data.