NVIDIA · Base plan
Nemotron 3 Ultra
NVIDIA's Nemotron 3 Ultra (550B-A55B), served through the Vercel AI Gateway, is an open mixture-of-experts model (550B total parameters, 55B active) built for high-throughput reasoning, agentic tool use, and long-horizon work, with a 1M-token context window. It supports function calling but not web search or image input. It is available serverless on just4o.chat at the Base tier, and each send costs 2 base requests before length multipliers are applied.
You always get the exact model you pick; we never silently route you to another.
Specifications
| Provider | NVIDIA |
|---|---|
| Released | 2026-06 |
| Intelligence | High |
| Speed | Fast |
| Context window | 1,000,000 tokens |
| Knowledge cutoff | Not officially published |
| Input price | $0.37 / 1M tokens |
| Output price | $1.08 / 1M tokens |
| Request cost | 2 base requests |
| Plan tier | Base |
| Input | Text |
| Output | Text |
| Features | Cached input: $0.12 / 1M tokens; 2 base requests per send before length multipliers; function calling supported; served through the Vercel AI Gateway (nvidia/nemotron-3-ultra-550b-a55b) |
| Model ID | nemotron-3-ultra |
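
As a rough illustration of how the listed token prices combine, the sketch below estimates the dollar cost of a single call from the input, cached-input, and output rates in the table above. It assumes the prices scale linearly with token count and that cached input tokens are billed at the cached rate instead of the standard input rate; the function name `estimate_cost` and its parameters are hypothetical and not part of any just4o.chat or Vercel API. It does not model the 2-base-request plan cost or the length multipliers, which are accounted for separately.

```python
# Prices copied from the specifications table, in USD per 1M tokens.
INPUT_PRICE = 0.37
CACHED_INPUT_PRICE = 0.12
OUTPUT_PRICE = 1.08
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one call (hypothetical helper).

    cached_tokens is the part of input_tokens assumed to be billed
    at the cached-input rate rather than the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    ) / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200k input tokens (150k of them cached) and 4k output tokens.
    cost = estimate_cost(200_000, 4_000, cached_tokens=150_000)
    print(f"${cost:.4f}")  # prints "$0.0408"
```

Under these assumptions, filling the full 1M-token context with uncached input would cost about $0.37 before any output tokens are generated.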