Question 1

What does the reasoning variant do differently from standard Grok-4.20?

Accepted Answer

It adds extended thinking: the model works through a reasoning trace before responding, improving accuracy on complex problems at the cost of higher token usage and longer time-to-first-token (around 10 seconds).

Question 2

How much does it cost?

Accepted Answer

xAI lists $1.25 per million input tokens and $2.50 per million output tokens, with cached input at $0.20/M. Because reasoning traces add output tokens automatically, real costs on hard problems run higher than the headline rate suggests.
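To see how reasoning tokens move the bill, here is a minimal sketch of the arithmetic using the rates quoted above. The function name and parameters are hypothetical, and it assumes two things the answer does not spell out: that reasoning tokens are billed at the normal output rate, and that cached tokens are the part of the input billed at the cached rate instead of the full rate.

```python
# Rates quoted in the answer above, in USD per million tokens.
INPUT_PER_M = 1.25
CACHED_INPUT_PER_M = 0.20
OUTPUT_PER_M = 2.50


def estimate_cost(input_tokens, output_tokens, reasoning_tokens=0, cached_input_tokens=0):
    """Estimate the USD cost of one request (illustrative helper, not an xAI API).

    Assumption: reasoning tokens are billed like output tokens.
    Assumption: cached_input_tokens is a subset of input_tokens.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    billed_output = output_tokens + reasoning_tokens
    total = (
        fresh_input * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + billed_output * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Same prompt and visible answer, with and without a 4,000-token reasoning trace.
    print(f"without trace: ${estimate_cost(10_000, 500):.5f}")
    print(f"with trace:    ${estimate_cost(10_000, 500, reasoning_tokens=4_000):.5f}")
```

Under these assumptions, a 10,000-token prompt with a 500-token answer comes to about $0.014, and a 4,000-token reasoning trace raises it to about $0.024.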

Question 3

What is the context window?

Accepted Answer

The context window is 2 million tokens. In practice, output per request is capped at 131,000 tokens, even though the model spec lists up to 2M tokens of output.
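A pre-flight check along these lines can catch oversized requests before they are sent. This is only a sketch: the helper name is hypothetical, token counts are assumed to come from whatever tokenizer you already use, and treating prompt and output as sharing the 2M window is a conservative assumption rather than something the answer states.

```python
CONTEXT_WINDOW = 2_000_000    # tokens, from the answer above
PRACTICAL_MAX_OUTPUT = 131_000  # tokens, the in-practice output cap from the answer above


def plan_output_budget(prompt_tokens, requested_output_tokens):
    """Return the output-token budget that is safe to request (illustrative helper).

    Assumption: prompt and output share the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens > CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Clamp to both the practical output cap and whatever room is left.
    return min(requested_output_tokens, PRACTICAL_MAX_OUTPUT, remaining)
```

For example, asking for 200,000 output tokens on a short prompt is clamped to 131,000, while a 1,950,000-token prompt leaves room for only 50,000.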

Question 4

Is it fast for a reasoning model?

Accepted Answer

Yes. It generates 170 to 197 tokens per second, roughly three times the median speed of comparable reasoning models benchmarked by Artificial Analysis.
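High throughput does not remove the wait before the first token. As a rough sketch, end-to-end time can be approximated as time-to-first-token plus output length divided by throughput. The function below is hypothetical and assumes the roughly 10-second time-to-first-token mentioned under Question 1 already covers the thinking phase, with the quoted speed applying to the tokens streamed afterwards.

```python
def estimate_latency_seconds(output_tokens, tokens_per_second=170.0, ttft_seconds=10.0):
    """Approximate wall-clock time for one response (illustrative helper).

    Assumption: thinking time is included in ttft_seconds.
    Defaults use the slower end of the quoted 170 to 197 tokens/second range.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    for speed in (170.0, 197.0):
        seconds = estimate_latency_seconds(1_700, tokens_per_second=speed)
        print(f"{speed:.0f} tok/s: {seconds:.1f} s for a 1,700-token answer")
```

For short answers the fixed wait dominates: at 170 tokens per second, a 1,700-token response takes about 20 seconds under these assumptions, half of which is time-to-first-token.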

Question 5

What should I NOT use it for?

Accepted Answer

Avoid it for simple, low-stakes queries where the reasoning overhead is wasted. You can't dial down thinking depth, so routine tasks cost more than they need to. Its verbosity also makes it a poor fit when concise output matters.

Question 6

How does it compare to the non-reasoning Grok-4.20?

Accepted Answer

The reasoning variant adds the extended thinking layer and carries higher latency and cost. Choose it when accuracy on hard problems is the priority; use the non-reasoning variant when speed and cost efficiency matter more.
