Question 1

What does Gemini 3.5 Flash cost?

Accepted Answer

Input costs $1.50 per million tokens and output costs $9.00 per million tokens. Cached input drops to $0.15 per million tokens, a 90% discount that makes agent loops that resend the same context on every turn much cheaper.
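To make the arithmetic concrete, here is a minimal sketch of a per-request cost estimate using only the three prices quoted above. The function name `estimate_cost` and its parameters are illustrative assumptions, not part of any official SDK, and the sketch assumes cached tokens are simply the portion of the input billed at the cached rate.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the answer above.
PRICE_INPUT = Decimal("1.50")
PRICE_CACHED_INPUT = Decimal("0.15")
PRICE_OUTPUT = Decimal("9.00")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the part of input_tokens assumed to be billed
    at the cached-input rate; the rest is billed at the full rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + output_tokens * PRICE_OUTPUT
    )
    return total / MILLION


# One agent turn: 100,000 input tokens, 2,000 output tokens.
uncached = estimate_cost(100_000, 2_000)
cached = estimate_cost(100_000, 2_000, cached_tokens=90_000)
print(f"uncached: ${uncached}  with 90% of input cached: ${cached}")
```

For that example turn, the estimate falls from $0.168 to $0.0465 when 90,000 of the 100,000 input tokens are served from cache. Output tokens dominate the cost once most of the input is cached.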

Question 2

How large is the context window?

Accepted Answer

1,048,576 input tokens (roughly 1 million tokens), with a maximum output of 65,536 tokens (64K).
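A pre-flight check against these two limits might look like the sketch below. The helper name `fits_limits` is hypothetical, the token counts are assumed to come from whatever tokenizer or token-counting call your client provides, and the sketch assumes the input and output limits are enforced separately, which the answer above does not state explicitly.

```python
# Limits quoted in the answer above.
MAX_INPUT_TOKENS = 1_048_576   # 2**20
MAX_OUTPUT_TOKENS = 65_536     # 2**16


def fits_limits(prompt_tokens, requested_output_tokens):
    """Return True if a request stays within both quoted limits.

    Hypothetical helper: assumes the input and output limits are
    checked independently of each other.
    """
    return (
        0 <= prompt_tokens <= MAX_INPUT_TOKENS
        and 0 < requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(fits_limits(900_000, 8_192))     # within both limits
print(fits_limits(1_200_000, 8_192))   # prompt exceeds the input limit
```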

Question 3

How does it compare to Gemini 3.1 Pro?

Accepted Answer

Gemini 3.5 Flash beats Gemini 3.1 Pro on coding and agentic benchmarks: for example, it scores 76.2% vs 70.3% on Terminal-Bench 2.1, a 5.9-point lead. Pro still leads by 3–8 points on academic reasoning tasks, deep analytical problems, and ultra-long context retrieval.

Question 4

What are the known weaknesses?

Accepted Answer

Time to first token averages about 19 seconds, which can feel slow in chat-style interactions. It also hits aggressive rate limits under heavy load, and it is not the right pick for deep reasoning, hard math, or precision-critical long-context retrieval.
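One common way to cope with rate limits is to retry with exponential backoff and jitter. The sketch below is generic rather than tied to any particular SDK: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on a rate-limit response, and `call` is any zero-argument function that performs the request.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for your client's rate-limit exception."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponential backoff.

    The wait before retry n is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)] ("full jitter").
    The last failure is re-raised once max_attempts is reached.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Given the roughly 19-second time to first token mentioned above, it is worth keeping `max_attempts` small in interactive settings, since every retry adds to an already noticeable wait.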

Question 5

What modalities does it support?

Accepted Answer

It accepts text, image, audio, and video inputs. Output is text only — there is no image generation, audio generation, or Computer Use support.

Question 6

When should I pick Flash 3.5 over a cheaper Flash model?

Accepted Answer

Pick Gemini 3.5 Flash when your workload involves agentic coding, multi-step tool use, or structured multimodal extraction. For reliability-sensitive production use cases, its 31-point hallucination reduction and stronger agentic benchmark scores justify the higher cost over Gemini 3 Flash.
