Question 1

How much does GPT audio mini cost?

Accepted Answer

$0.60 per million input tokens and $2.40 per million output tokens, roughly 10x cheaper than the full gpt-audio model.
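
The per-token arithmetic behind those rates can be sketched as below. This is a minimal estimate, not a billing tool: it assumes the two quoted rates apply uniformly to all input and output tokens, and the function name `estimate_cost_usd` is made up for illustration.

```python
from decimal import Decimal

# Rates quoted in the answer above, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("0.60")
OUTPUT_USD_PER_MILLION = Decimal("2.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the quoted flat rates.

    Assumption: every token is billed at the single input or output rate
    given above; any other pricing detail is outside this sketch.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # 50,000 input tokens and 2,000 output tokens.
    print(estimate_cost_usd(50_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.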

Question 2

What is the context window?

Accepted Answer

128,000 tokens, with a maximum of 16,384 output tokens per request.
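
A quick way to check whether a request fits those limits is sketched below. It assumes that input and output tokens share the 128,000-token window, which the answer above does not state explicitly; `output_budget` is a hypothetical helper name.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # context window from the answer above
MAX_OUTPUT_TOKENS = 16_384       # per-request output cap from the answer above


def output_budget(input_tokens: int) -> int:
    """Return how many output tokens can still be requested.

    Assumption: input and output tokens count against the same context
    window, so the budget is the smaller of the output cap and whatever
    room the input leaves.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - input_tokens
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


if __name__ == "__main__":
    for size in (10_000, 120_000, 130_000):
        print(size, output_budget(size))
```

Under that assumption, the output cap is the binding limit for most requests, and the window only matters once the input exceeds 111,616 tokens.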

Question 3

What modalities does it support?

Accepted Answer

Text and audio for both input and output. It does not support image or video inputs, and does not offer streaming, structured outputs, fine-tuning, or predicted outputs.

Question 4

Is this the right model for real-time voice conversations?

Accepted Answer

No. GPT audio mini targets asynchronous batch workflows. For low-latency, real-time voice applications, OpenAI's gpt-realtime-mini is the more appropriate choice.

Question 5

What is its knowledge cutoff?

Accepted Answer

October 1, 2023. Applications that need current information must implement retrieval augmentation, because the model cannot reason about events after that date on its own.
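
One simple form of retrieval augmentation is to prepend dated snippets to the question before sending it to the model. The sketch below covers only that prompt-assembly step. The `Snippet` type, the `build_prompt` helper, and the prompt wording are all hypothetical, and how the snippets are retrieved is left out entirely.

```python
from dataclasses import dataclass
from datetime import date

KNOWLEDGE_CUTOFF = date(2023, 10, 1)  # cutoff from the answer above


@dataclass
class Snippet:
    source: str      # hypothetical label for where the text came from
    published: date  # publication date of the retrieved text
    text: str


def build_prompt(question: str, snippets: list[Snippet]) -> str:
    """Assemble a text prompt with the newest retrieved snippets first."""
    lines = [
        "Answer using the context below. Entries dated after "
        f"{KNOWLEDGE_CUTOFF.isoformat()} are beyond the model's training data."
    ]
    for snippet in sorted(snippets, key=lambda s: s.published, reverse=True):
        marker = "post-cutoff" if snippet.published > KNOWLEDGE_CUTOFF else "pre-cutoff"
        lines.append(
            f"[{snippet.source}, {snippet.published.isoformat()}, {marker}] {snippet.text}"
        )
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Marking each snippet as pre- or post-cutoff is a design choice made here for illustration; it makes explicit which facts the model can only know from the supplied context.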

Question 6

How does the December 2025 snapshot relate to the base gpt-audio-mini slug?

Accepted Answer

The gpt-audio-mini alias automatically points to the 2025-12-15 snapshot, which is the current recommended version. This snapshot delivers the hallucination and word error rate improvements noted above, though it also introduced some tone and style adherence regressions worth testing against your prompts.
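
The relationship between the alias and the dated snapshot can be made explicit in code. The sketch below assumes snapshot IDs follow the `<alias>-YYYY-MM-DD` pattern seen in `gpt-audio-mini-2025-12-15`; `split_model_id` is a hypothetical helper, not part of any SDK.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Assumed naming pattern: "<alias>-YYYY-MM-DD" for dated snapshots.
_SNAPSHOT_ID = re.compile(r"^(?P<alias>.+)-(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$")


def split_model_id(model_id: str) -> Tuple[str, Optional[date]]:
    """Split a model ID into its alias and snapshot date (None for a bare alias)."""
    match = _SNAPSHOT_ID.match(model_id)
    if match is None:
        return model_id, None
    snapshot = date(int(match["y"]), int(match["m"]), int(match["d"]))
    return match["alias"], snapshot


if __name__ == "__main__":
    print(split_model_id("gpt-audio-mini-2025-12-15"))
    print(split_model_id("gpt-audio-mini"))
```

A deployment that logs the result of a check like this can tell whether it is pinned to a dated snapshot or following the alias, which matters when a new snapshot changes tone or style behavior.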

Provider	OpenAI
Released	2025-12
Context window	128,000 tokens
Max output	16,384 tokens
Knowledge cutoff	October 1, 2023
Input price	$0.60 / 1M tokens
Output price	$2.40 / 1M tokens
Request cost	3 base requests
Plan tier	Base
Model ID	gpt-audio-mini-2025-12-15
