Models
Model providers have different strengths. Use the summaries below to choose a provider, then pick a model family that matches the task.
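Each table row lists a per-send cost in one of two tiers, base or premium. As a rough illustration of budgeting against those costs, the sketch below tallies requests for a planned mix of sends. The `REQUEST_COSTS` dictionary and the `tally_requests` helper are hypothetical names introduced only for this example, the costs are copied from a handful of rows in the tables below, and length multipliers are ignored because this page does not define them.

```python
from collections import Counter

# Per-send costs copied from a few rows of the tables on this page: (count, tier).
# This is an illustrative subset, not the full catalog.
REQUEST_COSTS = {
    "GPT-4o mini": (1, "base"),
    "GPT-5.4": (2, "premium"),
    "Gemini 3.1 Pro Preview": (3, "premium"),
    "Claude Opus 4.6": (5, "premium"),
    "Qwen 3.6 Plus": (2, "base"),
}


def tally_requests(sends):
    """Sum request usage per tier for a mapping of model name -> number of sends.

    Assumption: length multipliers are not applied, since the page does not
    say how they are calculated.
    """
    totals = Counter()
    for model, n_sends in sends.items():
        count, tier = REQUEST_COSTS[model]
        totals[tier] += count * n_sends
    return dict(totals)


usage = tally_requests({"GPT-4o mini": 10, "Claude Opus 4.6": 3, "GPT-5.4": 2})
print(usage)  # {'base': 10, 'premium': 19}
```

In this example, three sends to Claude Opus 4.6 account for 15 of the 19 premium requests, which shows how quickly the 5-request models dominate a budget.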
OpenAI (GPT)
A strong all-around choice for writing, planning, coding, and everyday questions. Includes GPT models from OpenAI plus GPT OSS variants served through Fireworks and Cerebras.
| Model | Requests | Best For |
|---|---|---|
| GPT-4o · May 2024 | 2 premium requests | May 2024 checkpoint of gpt-4o for that special voice. |
| GPT-4o · Aug. 2024 | 1 premium request | August 2024 checkpoint of gpt-4o with enhanced capabilities. |
| GPT-4o · Nov. 2024 | 1 premium request | November 2024 checkpoint of gpt-4o with latest improvements. |
| GPT-4o | 1 premium request | OpenAI's default gpt-4o through the API. |
| GPT-4o mini | 1 base request | Agile, cost-efficient 4o variant ideal for everyday conversation. |
| Model | Requests | Best For |
|---|---|---|
| GPT-3.5 Turbo | 1 base request | Legacy GPT model for cheaper chat and non-chat tasks. |
| GPT-3.5 Turbo · 0125 | 1 base request | Pinned January 2024 snapshot of GPT-3.5 Turbo. |
| GPT-3.5 Turbo · 1106 | 1 base request | Pinned November 2023 snapshot of GPT-3.5 Turbo. |
| Model | Requests | Best For |
|---|---|---|
| GPT audio mini | 2 base requests | Cost-efficient audio-native chat model. Supports text and audio output in chat completions. |
| GPT audio mini · Oct. 2025 | 2 base requests | Pinned October 2025 snapshot of GPT audio mini for stable behavior. |
| GPT audio mini · Dec. 2025 | 2 base requests | Pinned December 2025 snapshot of GPT audio mini for stable behavior. |
| Model | Requests | Best For |
|---|---|---|
| GPT-5.4 | 2 premium requests | Latest GPT-5.4 flagship chat model with stronger reasoning and accuracy. |
| GPT-5.4 mini | 1 premium request | Higher-capability GPT-5.4 mini for high-volume coding, computer use, and subagent workflows. |
| GPT-5.4 nano | 1 base request | Cheapest GPT-5.4-class model for simple high-volume tasks such as extraction, ranking, and lightweight subagents. |
| Model | Requests | Best For |
|---|---|---|
| GPT-5.3-Codex | 1 premium request | The most capable agentic coding model to date. Optimized for agentic coding tasks in Codex or similar environments. 400K context, 128K max output. Reasoning off by default. |
| GPT-5.3 latest | 1 premium request | GPT-5.3 model used in ChatGPT. Best general-purpose model with high intelligence and vision support. Pricing is assumed to match GPT-5.2 latest and GPT-5.1 latest until announced. |
| Model | Requests | Best For |
|---|---|---|
| GPT-5.2 latest | 1 premium request | GPT-5.2 model used in ChatGPT. Best general-purpose model with high intelligence and vision support. |
| GPT-5.2 | 1 premium request | Pinned GPT-5.2 snapshot for stable behavior. |
| Model | Requests | Best For |
|---|---|---|
| GPT-5.1 | 1 premium request | Pinned snapshot gpt-5.1-2025-11-13. The most intelligent model yet, with faster responses and increased steerability. |
| GPT-5.1 latest | 1 premium request | GPT-5.1 model used in ChatGPT. Continuously updated for the latest chat improvements. |
| GPT-5.1 codex | 1 premium request | GPT-5.1 optimized for agentic coding in Codex. 400K context, 128K max output. |
| GPT-5.1 codex mini | 1 base request | Smaller, more cost-effective, less-capable version of GPT-5.1-Codex. 400K context, 128K max output. |
| Model | Requests | Best For |
|---|---|---|
| GPT-5 | 1 premium request | Frontier reasoning depth with best-in-class reliability. |
| GPT-5 codex | 1 premium request | Enhanced code reasoning while staying conversation-friendly. |
| GPT-5 latest | 1 premium request | Continuously tuned GPT-5 chat experience with the latest guardrails. |
| GPT-5 mini | 1 base request | Responsive, budget-friendly member of the GPT-5 family. |
| GPT-5 nano | 1 base request | Ultra light-touch assistant for simple interactions. |
| Model | Requests | Best For |
|---|---|---|
| GPT-4.1 | 1 premium request | GPT-4 refinement designed for coding with broad tool compatibility. |
| GPT-4.1 mini | 1 base request | Compact GPT-4.1 option for consistent tone and speed. |
| GPT-4.1 nano | 1 base request | Minimal-footprint GPT-4.1 for background automation tasks. |
| Model | Requests | Best For |
|---|---|---|
| o3 | 1 premium request | Reasoning-focused o-series model optimized for long-horizon tasks. |
| o4 mini | 1 premium request | Lean o-series model for high-volume creative projects. |
| o3 mini | 1 premium request | Balanced o-series variant with emphasis on tool use during reasoning. |
| Model | Requests | Best For |
|---|---|---|
| GPT OSS 120b | 1 base request | OpenAI's open-weight 117B MoE via Fireworks. Production-grade reasoning, agentic tasks, function calling. 131k context. Does not support web search or image input. |
| GPT OSS 120B Fast | 1 premium request | OpenAI's GPT OSS 120B routed through Cerebras chat completions for very fast tool-capable replies. 131k context. Does not support web search or image input. |
| GPT OSS 20b | 1 base request | OpenAI's open-weight 21B MoE via Fireworks. Lower latency; suited to local or specialized use cases. 131k context. Does not support web search, image input, or function calling. |
Google (Gemini)
Great for long instructions, large context, and quick iteration on bigger tasks.
| Model | Requests | Best For |
|---|---|---|
| Gemini 2.5 Pro | 1 premium request | State-of-the-art multipurpose model that excels at coding and complex reasoning tasks. |
| Gemini 2.5 Flash | 1 base request | First hybrid reasoning model, with a 1M-token context window and thinking budgets. |
| Gemini 2.5 Flash-Lite | 1 base request | Smallest and most cost-effective Gemini 2.5 model, built for at-scale usage. |
| Model | Requests | Best For |
|---|---|---|
| Gemini 3.1 Pro Preview | 3 premium requests | Next iteration of Gemini 3 Pro, with performance, behavior, and intelligence improvements. Suited to agentic workflows, autonomous coding, and complex multimodal tasks. 1M context, 64k max output. Knowledge cutoff: Jan 2025. |
| Gemini 3.1 Flash-Lite | 1 base request | Stable Gemini 3.1 Flash-Lite model for high-volume agentic tasks, translation, and simple data processing. 1M context, 65k max output. |
| Model | Requests | Best For |
|---|---|---|
| Gemini 3 Flash Preview | 1 base request | Preview of Gemini 3 Flash. 1M context, 64k max output. Knowledge cutoff: Jan 2025. |
| Model | Requests | Best For |
|---|---|---|
| Gemini 3.5 Flash | 2 premium requests | Gemini 3.5 Flash combines frontier intelligence with fast responses, search grounding, and multimodal strengths. Uses 2 premium requests per send before length multipliers. |
Anthropic (Claude)
Good for careful writing, nuanced edits, and thoughtful longer responses.
| Model | Requests | Best For |
|---|---|---|
| Claude Opus 4.6 | 5 premium requests | Anthropic's most advanced Claude model. Exceptional emotional intelligence and warmth paired with adaptive thinking that scales to the complexity of your request. |
| Claude Opus 4.5 | 5 premium requests | Previous flagship Claude with strong reasoning capabilities. Great balance of intelligence and accessibility. |
| Claude Sonnet 4.6 | 5 premium requests | Anthropic's most capable Sonnet yet. Full upgrade across coding, long-context reasoning, agent planning, and design. 1M token context window in beta. Same pricing as Sonnet 4.5. |
| Claude Sonnet 4.5 | 5 premium requests | Anthropic's balanced Claude model with strong reasoning and efficiency. |
| Claude Haiku 4.5 | 1 premium request | Anthropic's fastest Claude model, optimized for speed and cost efficiency. |
xAI (Grok)
Good for quick back-and-forth, practical answers, and fast drafting.
| Model | Requests | Best For |
|---|---|---|
| Grok-3 Mini | 1 base request | Compact Grok-3 variant for cost-effective conversations. |
| Model | Requests | Best For |
|---|---|---|
| Grok-4.20 Reasoning | 1 premium request | xAI's flagship Grok 4.20 reasoning model with a 2M-token context window, stronger multi-step reasoning, and native tool support. |
| Grok-4.20 Non-Reasoning | 1 premium request | Latency-optimized Grok 4.20 variant with a 2M-token context window, image understanding, and native tool support. |
| Model | Requests | Best For |
|---|---|---|
| Grok 4.3 | 1 premium request | xAI's latest flagship Grok model with 1M context, image input, configurable reasoning, and strong agentic tool use. |
DeepSeek
Strong for reasoning and complex tasks. DeepSeek v3.x models and DeepSeek V4 Flash are available on all tiers; DeepSeek V4 Pro is Premium. V4 Flash and V4 Pro both offer 1M context and function calling. Neither supports web search or image input.
| Model | Requests | Best For |
|---|---|---|
| DeepSeek V4 Flash | 1 base request | DeepSeek-V4-Flash via Fireworks: streamlined open-source MoE model optimized for fast, cost-efficient inference while preserving strong reasoning and coding performance at 1M context scale. Function calling supported. Uses 1 base request per send before length multipliers. Does not support web search or image input. |
| DeepSeek V4 Pro | 1 premium request | DeepSeek-V4-Pro via Fireworks: flagship open-source 1.6T MoE model for frontier reasoning, advanced coding, and long-context agentic workflows. 1M context. Function calling supported. Uses 1 premium request per send before length multipliers. Does not support web search or image input. |
Qwen
Strong for multimodal chat, tool use, and general flagship work. Available on all tiers through Fireworks serverless. Supports image input, but not web search.
| Model | Requests | Best For |
|---|---|---|
| Qwen 3.6 Plus | 2 base requests | Alibaba's flagship closed Qwen model via Fireworks. 396B MoE with function calling and image input support. Available serverless on just4o.chat at the base tier. Uses 2 base requests per send before length multipliers. Does not support web search. |
Moonshot (Kimi)
Good for complex reasoning, multimodal agentic tasks, and long-horizon coding. Kimi K2.5 and K2.6 support images. No web search.
| Model | Requests | Best For |
|---|---|---|
| Kimi K2.5 | 1 premium request | Moonshot AI's flagship agentic model via Fireworks. Unifies vision and text, thinking and non-thinking. 262k context. Supports image input. Does not support web search. |
| Kimi K2.6 | 2 premium requests | Moonshot AI's Kimi K2.6 via Fireworks: open-source, native multimodal agentic model for long-horizon coding, coding-driven design, autonomous execution, and task orchestration. 1T MoE, 262k context. Supports image input and function calling. Uses 2 premium requests per send before length multipliers. Does not support web search. |
MiniMax
Strong for coding, complex tasks, and office work. MiniMax M2.7 supports image input. Available on all tiers. No web search.
| Model | Requests | Best For |
|---|---|---|
| MiniMax M2.5 | 1 base request | MiniMax M2.5 via Fireworks: state-of-the-art coding, agentic tool use, search, and office work. 228B MoE, 196k context. Function calling supported. Does not support web search or image input. |
| MiniMax M2.7 | 1 base request | MiniMax M2.7 via Fireworks: 228B MoE model for complex agent harnesses, productivity tasks, Agent Teams, Skills, and dynamic tool search. 196k context. Supports image input and function calling. Does not support web search. |
Z.ai (GLM)
Strong for coding, reasoning, and long-horizon agentic workflows. GLM models are available through Fireworks and Cerebras; GLM 4.7 Fast is the Cerebras-backed Premium OSS variant. No web search or images.
| Model | Requests | Best For |
|---|---|---|
| GLM 4.7 Fast | 2 premium requests | Z.ai's GLM-4.7 routed through Cerebras chat completions for lower-latency coding and agentic work. 131k context. Does not support web search or image input. Cerebras currently lists it as a preview model. |
| Model | Requests | Best For |
|---|---|---|
| GLM 5.1 | 2 premium requests | Z.ai's GLM-5.1 via Fireworks: next-generation flagship for agentic engineering, stronger coding, and sustained long-horizon task performance. 202k context. Uses 2 premium requests per send before length multipliers. Does not support web search, image input, or function calling. |
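The capability notes in the tables above (image input, function calling, web search) can be treated as filter criteria when picking a model. The sketch below is one hedged way to do that: `ModelCaps`, `CATALOG`, and `find_models` are hypothetical names introduced only for this example, and the catalog includes only models whose rows state all three capabilities explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCaps:
    """Capability flags transcribed from the 'Best For' column (illustrative subset)."""
    name: str
    image_input: bool
    function_calling: bool
    web_search: bool


# Only rows that state every flag are listed; models whose function-calling
# support is not mentioned on this page (e.g. Kimi K2.5) are left out.
CATALOG = [
    ModelCaps("GPT OSS 120b", image_input=False, function_calling=True, web_search=False),
    ModelCaps("GPT OSS 20b", image_input=False, function_calling=False, web_search=False),
    ModelCaps("DeepSeek V4 Flash", image_input=False, function_calling=True, web_search=False),
    ModelCaps("DeepSeek V4 Pro", image_input=False, function_calling=True, web_search=False),
    ModelCaps("Qwen 3.6 Plus", image_input=True, function_calling=True, web_search=False),
    ModelCaps("Kimi K2.6", image_input=True, function_calling=True, web_search=False),
    ModelCaps("MiniMax M2.5", image_input=False, function_calling=True, web_search=False),
    ModelCaps("MiniMax M2.7", image_input=True, function_calling=True, web_search=False),
    ModelCaps("GLM 5.1", image_input=False, function_calling=False, web_search=False),
]


def find_models(needs_images=False, needs_tools=False, needs_search=False):
    """Return names of catalog models that satisfy every requested capability."""
    return [
        m.name
        for m in CATALOG
        if (m.image_input or not needs_images)
        and (m.function_calling or not needs_tools)
        and (m.web_search or not needs_search)
    ]


print(find_models(needs_images=True, needs_tools=True))
# ['Qwen 3.6 Plus', 'Kimi K2.6', 'MiniMax M2.7']
```

Requesting web search returns an empty list for this subset, which matches the notes above: none of these open-weight models supports it.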