Model page

GPT-3.5 Turbo

Legacy GPT model for cheaper chat and non-chat tasks.

About GPT-3.5 Turbo

Before GPT-4o mini existed, GPT-3.5 Turbo was the go-to choice for developers who needed fast, affordable chat at scale — and for many legacy systems, it still fills that role. At $0.50 per million input tokens, it remains one of the cheaper options for high-volume, simple conversational tasks where raw reasoning depth is not required. Developers who have put it to use most appreciate its speed for real-time interactions and, notably, its fine-tuning support: OpenAI has demonstrated that a well-tuned GPT-3.5 Turbo can match base GPT-4 on narrow, domain-specific tasks. That said, it carries real limitations that are hard to overlook. Its knowledge cuts off at September 2021, meaning anything recent is simply out of scope. It has no vision, audio, or multimodal capability. And on complex reasoning, it trails newer models by a significant margin. OpenAI now officially recommends GPT-4o mini as the preferred upgrade. GPT-3.5 Turbo is best understood as a proven workhorse for straightforward workloads and existing integrations — not the right tool for tasks that demand depth.

Best for

  • High-volume production chat at low cost, where speed and affordability matter more than advanced reasoning
  • Fine-tuning base for narrow, domain-specific tasks where a customized model can punch above its weight class
  • Maintaining legacy applications and infrastructure already built around GPT-3.5 Turbo's consistent behavior
  • Simple, templated conversation flows, short-form completions, and structured query handling
  • Budget-constrained experiments or prototypes before committing to a more capable (and more expensive) model

Specs & capabilities

How GPT-3.5 Turbo stacks up — intelligence, speed, context, and modalities.

Capability

Intelligence

Low

Capability

Speed

Slow

Capability

Context window

16,385 tokens

Capability

Max output

4,096 tokens

Capability

Knowledge cutoff

September 1, 2021

API

Supported endpoints

v1/chat/completions · v1/responses · v1/assistants · v1/batch · v1/fine-tuning

Modalities

Input and output

Input: Text
Output: Text

Features

Availability notes

Fine-tuning supported

Frequently asked questions

How much does GPT-3.5 Turbo cost?

$0.50 per million input tokens and $1.50 per million output tokens — among the lower-priced options available for established OpenAI models.

What is the context window?

16,385 tokens, with a maximum output of 4,096 tokens per response.

What is the knowledge cutoff?

September 2021. The model has no awareness of events, releases, or developments after that date, which is a meaningful gap for anything time-sensitive.

Does it support vision or multimodal input?

No. GPT-3.5 Turbo handles text only — no image, audio, or document input is supported.

How does it compare to GPT-4o mini?

OpenAI officially recommends GPT-4o mini as the successor. GPT-4o mini offers better reasoning, multimodal support, a more recent knowledge cutoff, and lower pricing. GPT-3.5 Turbo's main advantage is compatibility with existing integrations.

Who should still use GPT-3.5 Turbo?

Teams with production systems already built around it, or those who have invested in fine-tuning a customized version for a specific task. New projects are generally better served by GPT-4o mini.

Related models