Model page

GPT-3.5 Turbo · 0125

Pinned January 2024 snapshot of GPT-3.5 Turbo.

About GPT-3.5 Turbo · 0125

When the priority is volume and budget over depth, GPT-3.5 Turbo 0125 earns its place. Priced at $0.50 per million input tokens — roughly 20x cheaper than GPT-4 Turbo — this final GPT-3.5 snapshot is built for applications that run at scale rather than wrestle with hard problems. The 0125 release cleaned up a meaningful UTF-8 encoding bug, making structured outputs like JSON and YAML consistently reliable across non-English languages, which is a genuine practical improvement developers have noticed. Users praise its streaming speed and dependable format compliance in chatbot pipelines; the frozen snapshot date also means behavior stays consistent across API calls, which matters for compliance-critical workflows. The honest trade-off: factual accuracy lags well behind GPT-4 variants, and with a September 2021 knowledge cutoff and a 16K context window, it struggles with recent information and longer documents. For straightforward classification, structured data generation, and high-throughput conversational tasks where cost discipline matters, it remains a focused and well-understood tool.

Best for

  • High-volume classification and tagging at scale where per-token cost is a primary constraint
  • Multilingual chatbots and customer service flows requiring fast streaming and reliable structured output
  • JSON, YAML, and XML generation for API integrations, especially in non-English language contexts
  • Simple summarization, text reformatting, and document condensing without complex reasoning
  • Compliance-sensitive pipelines that need frozen, predictable model behavior across repeated API calls

Specs & capabilities

How GPT-3.5 Turbo · 0125 stacks up — intelligence, speed, context, and modalities.

Capability

Intelligence

Low

Capability

Speed

Slow

Capability

Context window

16,385 tokens

Capability

Max output

4,096 tokens

Capability

Knowledge cutoff

September 1, 2021

API

Supported endpoints

v1/chat/completions · v1/responses · v1/assistants · v1/batch · v1/fine-tuning

Modalities

Input and output

Input: Text
Output: Text

Features

Availability notes

Fine-tuning supported

Frequently asked questions

How much does GPT-3.5 Turbo 0125 cost?

$0.50 per million input tokens and $1.50 per million output tokens — approximately 20x cheaper than GPT-4 Turbo.

What is the context window?

16,385 input tokens with a maximum of 4,096 output tokens. This is significantly smaller than GPT-4 Turbo's 128K window, so long documents may need chunking.

What did the 0125 update actually fix?

It corrected a UTF-8 encoding bug that caused errors in non-English function calls, improving structured output reliability across multiple languages. It also improved format-following accuracy.

What is GPT-3.5 Turbo 0125 not well suited for?

Multi-step reasoning, complex instruction-following, image analysis, and tasks requiring up-to-date knowledge. Its knowledge cutoff is September 2021, and it hallucinates more frequently than GPT-4 variants.

How does it compare to GPT-4o mini?

OpenAI itself now recommends GPT-4o mini as a more capable and cost-effective replacement. GPT-4o mini offers better reasoning and accuracy at a comparable price point. GPT-3.5 Turbo 0125 is the final snapshot of its generation.

Is this model still being updated?

No. The '0125' suffix reflects a frozen snapshot from January 25, 2024. This is the last major release in the GPT-3.5 Turbo line, and it will not receive further updates.

Related models