NVIDIABase plan

Nemotron 3 Ultra

NVIDIA's Nemotron 3 Ultra (550B-A55B) via the Vercel AI Gateway: an open mixture-of-experts model (550B total, 55B active) for high-throughput reasoning, agentic tool use, and long-horizon work with a 1M-token context window. Function calling supported. Available serverless on just4o.chat at the base tier. Uses 2 base requests per send before length multipliers. Does not support web search or image input.

You always get the exact model you pick — we never silently route you to another.

Specifications

ProviderNVIDIA
Released2026-06
IntelligenceHigh
SpeedFast
Context window1,000,000 tokens
Knowledge cutoffNot officially published
Input price$0.37 / 1M tokens
Output price$1.08 / 1M tokens
Request cost2 base requests
Plan tierBase
InputText
OutputText
FeaturesCached input: $0.12 / 1M tokens, 2 base requests per send before length multipliers, Function calling supported, Served through the Vercel AI Gateway (nvidia/nemotron-3-ultra-550b-a55b)
Model IDnemotron-3-ultra