NVIDIA · Base plan

Nemotron 3 Ultra

NVIDIA's Nemotron 3 Ultra (550B-A55B), served through the Vercel AI Gateway, is an open mixture-of-experts model with 550B total parameters, of which 55B are active. It targets high-throughput reasoning, agentic tool use, and long-horizon work, with a 1M-token context window. Function calling is supported; web search and image input are not. The model is available serverless on just4o.chat at the base tier, and each send uses 2 base requests before length multipliers.


You always get the exact model you pick — we never silently route you to another.

Specifications

Provider	NVIDIA
Released	2026-06
Intelligence	High
Speed	Fast
Context window	1,000,000 tokens
Knowledge cutoff	Not officially published
Input price	$0.37 / 1M tokens
Output price	$1.08 / 1M tokens
Request cost	2 base requests
Plan tier	Base
Input	Text
Output	Text
Features	Cached input: $0.12 / 1M tokens; 2 base requests per send before length multipliers; function calling supported; served through the Vercel AI Gateway (nvidia/nemotron-3-ultra-550b-a55b)
Model ID	nemotron-3-ultra
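
As a rough illustration of the token prices listed above, the following Python sketch estimates the cost of a single request. The three rates are copied from the specifications; the function name, its parameters, and the assumption that cached tokens are a subset of the input tokens billed at the cached rate are illustrative only. It does not model base requests or length multipliers, and this page does not state how token prices relate to plan billing.

```python
# Listed prices in USD per 1M tokens, copied from the specifications above.
INPUT_PER_M = 0.37
CACHED_INPUT_PER_M = 0.12
OUTPUT_PER_M = 1.08


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the token cost of one request in USD.

    Assumption (not stated in the listing): cached_tokens is the part of
    input_tokens that is billed at the cached-input rate instead of the
    regular input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200k input tokens (150k of them cached), 4k output tokens.
    print(f"${estimate_cost_usd(200_000, 4_000, cached_tokens=150_000):.5f}")
```

Under these assumptions, a prompt that fills the entire 1,000,000-token context window with uncached input would cost about $0.37 in input tokens before any output is generated.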