GPT-3.5 Turbo
Legacy GPT model for cheaper chat and non-chat tasks.
About GPT-3.5 Turbo
Before GPT-4o mini existed, GPT-3.5 Turbo was the go-to choice for developers who needed fast, affordable chat at scale — and for many legacy systems, it still fills that role. At $0.50 per million input tokens, it remains one of the cheaper options for high-volume, simple conversational tasks where raw reasoning depth is not required. Developers who have put it to use most appreciate its speed for real-time interactions and, notably, its fine-tuning support: OpenAI has demonstrated that a well-tuned GPT-3.5 Turbo can match base GPT-4 on narrow, domain-specific tasks. That said, it carries real limitations that are hard to overlook. Its knowledge cuts off at September 2021, meaning anything recent is simply out of scope. It has no vision, audio, or multimodal capability. And on complex reasoning, it trails newer models by a significant margin. OpenAI now officially recommends GPT-4o mini as the preferred upgrade. GPT-3.5 Turbo is best understood as a proven workhorse for straightforward workloads and existing integrations — not the right tool for tasks that demand depth.
Best for
- High-volume production chat at low cost, where speed and affordability matter more than advanced reasoning
- Fine-tuning base for narrow, domain-specific tasks where a customized model can punch above its weight class
- Maintaining legacy applications and infrastructure already built around GPT-3.5 Turbo's consistent behavior
- Simple, templated conversation flows, short-form completions, and structured query handling
- Budget-constrained experiments or prototypes before committing to a more capable (and more expensive) model
Specs & capabilities
How GPT-3.5 Turbo stacks up — intelligence, speed, context, and modalities.
Intelligence
Low
Speed
Slow
Context window
16,385 tokens
Max output
4,096 tokens
Knowledge cutoff
September 1, 2021
Supported endpoints
v1/chat/completions · v1/responses · v1/assistants · v1/batch · v1/fine-tuning
Input and output
Input: Text
Output: Text
Availability notes
Fine-tuning supported
Frequently asked questions
How much does GPT-3.5 Turbo cost?
$0.50 per million input tokens and $1.50 per million output tokens — among the lower-priced options available for established OpenAI models.
What is the context window?
16,385 tokens, with a maximum output of 4,096 tokens per response.
What is the knowledge cutoff?
September 2021. The model has no awareness of events, releases, or developments after that date, which is a meaningful gap for anything time-sensitive.
Does it support vision or multimodal input?
No. GPT-3.5 Turbo handles text only — no image, audio, or document input is supported.
How does it compare to GPT-4o mini?
OpenAI officially recommends GPT-4o mini as the successor. GPT-4o mini offers better reasoning, multimodal support, a more recent knowledge cutoff, and lower pricing. GPT-3.5 Turbo's main advantage is compatibility with existing integrations.
Who should still use GPT-3.5 Turbo?
Teams with production systems already built around it, or those who have invested in fine-tuning a customized version for a specific task. New projects are generally better served by GPT-4o mini.