GPT-5.4 mini
OpenAI · Active · Updated May 10, 2026
OpenAI's most cost-efficient small model, redesigned for high-volume production with improved reasoning and speed.
Input Price: $0.75 per million tokens
Output Price: $4.50 per million tokens
Context Window: 262,144 tokens
Max Output: 16,384 tokens
Technical Specifications
| Provider | OpenAI |
| Release Date | March 15, 2026 |
| Pricing Type | per token |
| Input Price | $0.75 / 1M tokens |
| Output Price | $4.50 / 1M tokens |
| Cached Input | $0.07 / 1M tokens |
| Context Window | 262,144 tokens |
| Max Output | 16,384 tokens |
| Input Modalities | text, image |
| Output Modalities | text |
| Status | active |
| Availability | api |
| Latency | very fast |
| Rate Limit | 30,000 RPM |
| Docs URL | — |
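To illustrate how the listed per-token prices combine, here is a minimal sketch that estimates the cost of a single request. The helper name `estimate_cost_usd` is hypothetical, and the billing rule it encodes (cached input tokens are charged at the cached rate instead of the standard input rate) is an assumption, not something this page confirms.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, as listed on this page.
INPUT_PER_M = Decimal("0.75")
CACHED_INPUT_PER_M = Decimal("0.07")
OUTPUT_PER_M = Decimal("4.50")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumption: cached_tokens is the part of input_tokens that is billed
    at the cached rate rather than the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (uncached * INPUT_PER_M
             + cached_tokens * CACHED_INPUT_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / MILLION


if __name__ == "__main__":
    # A 10,000-token prompt with a 500-token reply, nothing cached.
    print(estimate_cost_usd(10_000, 500))
    # The same request with 8,000 of the prompt tokens served from cache.
    print(estimate_cost_usd(10_000, 500, cached_tokens=8_000))
```

Using `Decimal` avoids floating-point rounding when costs are summed across many requests.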
Capability Scores
Coding: 76
Reasoning: 72
Math: 74
Image: 68
Speed: 94
Overview
GPT-5.4 mini is OpenAI's cost-optimized small model, designed for high-volume, low-latency applications. It inherits the 256K (262,144-token) context window from its larger sibling while delivering dramatically better reasoning and coding performance than its predecessor, GPT-4o mini. At $0.75 per million input tokens, it offers the best price-to-performance ratio in OpenAI's lineup.
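The context and output limits can be turned into a simple pre-flight check. The sketch below assumes that prompt and completion tokens share the same 262,144-token window, which is a common convention but is not stated on this page. The function names are hypothetical.

```python
CONTEXT_WINDOW = 262_144  # tokens, i.e. 256 * 1024
MAX_OUTPUT = 16_384       # tokens


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves room for `reserved_output` tokens.

    Assumption: input and output tokens count against the same window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Return True if the prompt plus the reserved output fits the window."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)


if __name__ == "__main__":
    print(max_prompt_tokens())  # room left when the full output is reserved
    print(fits(250_000))        # too long if 16,384 output tokens are reserved
    print(fits(250_000, 4_096)) # fits with a smaller output reservation
```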
Pros
- +Extremely affordable: $0.75/M input tokens
- +Very fast inference: Speed is its highest capability score (94/100)
- +256K context window: double that of the previous generation
Cons
- −Lower accuracy on complex reasoning and coding tasks
- −Struggles with nuanced instruction following
- −Text-only output, no audio or image generation
Use Cases
High-volume text classification and routing
Chatbots and customer service at scale
Simple content generation with long-context support
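For high-volume workloads such as the ones above, the listed rate limit of 30,000 RPM sets a floor on how quickly a batch can finish. The following back-of-the-envelope sketch computes that floor. It assumes the request limit is the only constraint, ignoring any token-per-minute limits and network latency, and the function name is hypothetical.

```python
RATE_LIMIT_RPM = 30_000  # requests per minute, as listed on this page


def min_batch_seconds(n_requests: int, rpm: int = RATE_LIMIT_RPM) -> float:
    """Lower bound on wall-clock time to send `n_requests` at `rpm`.

    Assumption: the request rate limit is the only bottleneck.
    """
    if n_requests < 0 or rpm <= 0:
        raise ValueError("n_requests must be >= 0 and rpm must be > 0")
    return n_requests * 60 / rpm


if __name__ == "__main__":
    # Classifying one million short messages, one request each.
    seconds = min_batch_seconds(1_000_000)
    print(f"{seconds:.0f} s (about {seconds / 60:.1f} min)")
```

At 30,000 RPM (500 requests per second), one million single-request classifications need at least 2,000 seconds, or roughly 33 minutes.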