OpenAI Drops GPT-5.4 Mini and Nano for Faster, Cheaper AI
OpenAI just released two compact powerhouses that could change how developers build AI products. GPT-5.4 mini and nano are the company’s most capable small models yet. They are fast, affordable, and already live for millions of users. If you thought only big, expensive models could do real work, these two are here to prove you wrong.
What Are GPT-5.4 Mini and Nano?
GPT-5.4 mini and nano bring many of the strengths of GPT-5.4 to faster, more efficient models designed for high-volume workloads.
GPT-5.4 mini significantly improves over GPT-5 mini across coding, reasoning, multimodal understanding, and tool use, while running more than 2x faster. Think of it as the sweet spot between speed and smarts. It is built for developers who need near-flagship quality but cannot afford the wait or the cost.
GPT-5.4 nano is the smallest, cheapest version of GPT-5.4, meant for tasks where speed and cost matter most. It is also a significant upgrade over GPT-5 nano. OpenAI recommends it for classification, data extraction, ranking, and coding subagents that handle simpler supporting tasks.
Both models support text and image inputs, tool use, function calling, web search, file search, and a 400,000-token context window.
The flurry of model releases comes as the generative AI space becomes more competitive. The company had been fielding tough competition from Google’s Gemini models last year. This launch signals that OpenAI is not just chasing raw power anymore. It is also racing to own the “fast and cheap” tier of AI.
Benchmark Numbers That Matter
The performance gap between GPT-5.4 mini and the full-size GPT-5.4 is surprisingly small. Here is how the models stack up:
| Benchmark | GPT-5 Mini | GPT-5.4 Nano | GPT-5.4 Mini | GPT-5.4 (Full) |
|---|---|---|---|---|
| SWE-Bench Pro | 45.7% | 52.4% | 54.4% | 57.7% |
| OSWorld-Verified | 42.0% | 39.0% | 72.1% | 75.0% |
| GPQA Diamond | 81.6% | N/A | 88.0% | 93.0% |
| Terminal-Bench 2.0 | 38.2% | 46.3% | 60.0% | N/A |
| Toolathlon | 26.9% | N/A | 42.9% | N/A |
On SWE-Bench Pro, mini scored 54.4% compared to the flagship’s 57.7%, a narrow gap that matters when you’re paying 75 cents per million input tokens instead of premium rates. On OSWorld-Verified, GPT-5.4 mini scored 72.1%, just a hair behind the full GPT-5.4 at 75.0%, while GPT-5 mini only managed 42.0%. That is a massive generational leap.
On Terminal-Bench 2.0, nano lands at 46.3% while mini reaches 60.0%. On OSWorld-Verified, nano actually scores below GPT-5 mini, 39.0% versus 42.0%, the one benchmark in the table where it trails the older model. So nano is not the model to pick for screen-reading tasks.
Key Takeaway: GPT-5.4 mini delivers roughly 94% of the full GPT-5.4’s SWE-Bench Pro score at a fraction of the price, while running more than 2x faster than GPT-5 mini. For most production workflows, that tradeoff is a no-brainer.
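The "roughly 94%" figure can be checked directly against the benchmark table. The short sketch below divides each GPT-5.4 mini score by the full GPT-5.4 score for the rows where both are reported; the dictionary layout and the `relative_score` helper are illustrative names, not anything from OpenAI.

```python
# Scores in percent, copied from the benchmark table above.
# Only rows where both mini and the full model report a score are included.
scores = {
    "SWE-Bench Pro": {"mini": 54.4, "full": 57.7},
    "OSWorld-Verified": {"mini": 72.1, "full": 75.0},
    "GPQA Diamond": {"mini": 88.0, "full": 93.0},
}


def relative_score(mini: float, full: float) -> float:
    """Return the mini score as a percentage of the full model's score."""
    return 100.0 * mini / full


# Map each benchmark to mini's score relative to the full model, rounded to 0.1.
relative = {
    name: round(relative_score(s["mini"], s["full"]), 1)
    for name, s in scores.items()
}

if __name__ == "__main__":
    for name, pct in relative.items():
        print(f"{name}: mini reaches {pct}% of the full model's score")
```

On SWE-Bench Pro this works out to about 94.3%, which is where the coding comparison comes from; the other two rows land in the same mid-90s range.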
The Rise of AI Subagents
Here is the bigger story behind this launch. OpenAI is not just releasing smaller models. It is pushing a new way of building AI systems.
GPT-5.4 mini is a strong fit for systems that combine models of different sizes. In Codex, for example, a larger model like GPT-5.4 can handle planning, coordination, and final judgment, while delegating to GPT-5.4 mini subagents that handle narrower subtasks in parallel.
Picture a senior architect sketching the blueprint while a team of builders does the hands-on work. This pattern becomes more useful as smaller models get faster and more capable. Instead of using one model for everything, developers can compose systems where larger models decide what to do and smaller models execute quickly at scale.
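As a rough illustration of that planner-and-subagents shape, here is a minimal Python sketch. It makes no real API calls: `call_model` is a hypothetical stand-in that just echoes its prompt, the model labels are placeholders rather than real model IDs, and the "planning" step is a simple string split. Only the control flow (plan with the large model, fan out to small models in parallel, review with the large model) reflects the pattern described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder labels for illustration only; these are NOT real API model IDs.
LARGE_MODEL = "large-model"
SMALL_MODEL = "small-model"


def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns a tagged echo."""
    return f"[{model}] {prompt}"


def plan(task: str) -> list[str]:
    """Stand-in for the large model's planning step.

    A real planner would ask the large model to decompose the task;
    here the task is simply split on semicolons.
    """
    return [part.strip() for part in task.split(";") if part.strip()]


def run_subtasks(subtasks: list[str], max_workers: int = 4) -> list[str]:
    """Run each subtask on the small model in parallel, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: call_model(SMALL_MODEL, s), subtasks))


def orchestrate(task: str) -> str:
    """Plan, delegate in parallel, then hand results back for final review."""
    subtasks = plan(task)
    results = run_subtasks(subtasks)
    return call_model(LARGE_MODEL, "review: " + " | ".join(results))


if __name__ == "__main__":
    print(orchestrate("rename helper; update tests; fix lint warnings"))
```

In a real system the stub would be replaced by calls to whichever provider SDK is in use, and the review step would decide whether to accept, retry, or re-plan.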
Companies are already putting this into practice. Abhisek Modi, AI engineering lead at Notion, said: “GPT-5.4 mini handles focused, well-defined tasks with impressive precision. For editing pages specifically, it matched and often exceeded GPT-5.2 on handling complex formatting at a fraction of the compute.”
GPT-5.4 mini, OpenAI’s latest fast version of its agentic coding model GPT-5.4, is now rolling out in GitHub Copilot. GPT-5.4 mini and GPT-5.4 nano will also be rolling out in Microsoft Foundry.
Pricing, Availability, and How to Access
Here is the full pricing breakdown:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Available In |
|---|---|---|---|
| GPT-5.4 (Full) | $2.50 | $15.00 | API, Codex, ChatGPT |
| GPT-5.4 Mini | $0.75 | $4.50 | API, Codex, ChatGPT |
| GPT-5.4 Nano | $0.20 | $1.25 | API Only |
| GPT-5 Mini (Old) | $0.25 | $2.00 | Legacy |
| GPT-5 Nano (Old) | $0.05 | $0.40 | Legacy |
Compared to the previous mini and nano models in the GPT-5 lineup, that’s a serious price bump. According to OpenAI’s pricing page, GPT-5 mini ran $0.25 per million input tokens and $2.00 per million output tokens. GPT-5 nano was $0.05 input and $0.40 output per million tokens.
That means GPT-5.4 mini costs 3x as much as GPT-5 mini on input and 2.25x as much on output, while GPT-5.4 nano costs 4x as much as its predecessor on input and about 3.1x as much on output. OpenAI likely justifies the higher prices by pointing to the performance gains, which bring these compact models much closer to the full-size versions that cost significantly more to run.
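To see how the price increases break down between input and output tokens, the sketch below recomputes the ratios from the pricing table and estimates the bill for a made-up workload. The `price_ratio` and `job_cost` helpers and the 10M-input / 2M-output workload are assumptions for illustration; the prices themselves are the ones listed above.

```python
# Prices in USD per 1M tokens, copied from the pricing table above: (input, output).
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "GPT-5.4 Nano": (0.20, 1.25),
    "GPT-5 Mini": (0.25, 2.00),
    "GPT-5 Nano": (0.05, 0.40),
}


def price_ratio(new_model: str, old_model: str) -> tuple[float, float]:
    """Return (input_ratio, output_ratio) of new_model's prices over old_model's."""
    new_in, new_out = PRICES[new_model]
    old_in, old_out = PRICES[old_model]
    return new_in / old_in, new_out / old_out


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a job with the given token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    for new, old in [("GPT-5.4 Mini", "GPT-5 Mini"), ("GPT-5.4 Nano", "GPT-5 Nano")]:
        r_in, r_out = price_ratio(new, old)
        print(f"{new} vs {old}: {r_in:.2f}x input, {r_out:.2f}x output")

    # Hypothetical workload: 10M input tokens and 2M output tokens.
    for model in ("GPT-5.4", "GPT-5.4 Mini", "GPT-5.4 Nano"):
        print(f"{model}: ${job_cost(model, 10_000_000, 2_000_000):.2f}")
```

For that hypothetical workload the totals come to $55.00 on GPT-5.4, $16.50 on mini, and $4.50 on nano, so the new small models are pricier than their predecessors but still far cheaper than the flagship.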
For everyday ChatGPT users, the good news is straightforward. In ChatGPT, GPT-5.4 mini is available to Free and Go users via the “Thinking” feature in the + menu. For all other users, GPT-5.4 mini is available as a rate limit fallback for GPT-5.4 Thinking.
In Codex, GPT-5.4 mini is available across the Codex app, CLI, IDE extension, and web. It uses only 30% of the GPT-5.4 quota, letting developers quickly handle simpler coding tasks in Codex for about one-third the cost. Nano remains API-only, a signal that OpenAI sees it primarily as infrastructure for developers rather than a consumer-facing product.
How It Stacks Up Against the Competition
OpenAI is not operating in a vacuum. OpenAI has officially bridged the gap in its model lineup with the debut of GPT-5.4 Mini and GPT-5.4 Nano. These models are designed to compete with high-efficiency models like Gemini 3 Flash.
Here is a quick look at the competitive landscape in the “fast and cheap” AI tier (prices are input/output per 1M tokens):
- Google Gemini 3.1 Flash-Lite: Priced at $0.25/$1.50. Hits 381 tokens/sec and scores 86.9% on GPQA Diamond. Offers a 1M-token context window versus GPT-5.4 mini’s 400K.
- Anthropic Claude Haiku 4.5: Priced at $1.00/$5.00. The priciest small model, but scores 73.3% on SWE-bench Verified and delivers what many developers describe as the most reliable instruction-following in its class. Runs 4-5x faster than Sonnet 4.5.
- Mercury 2 (Inception Labs): This is a diffusion-based model. Instead of generating text one token at a time, Mercury 2 generates tokens in parallel. The result is roughly 1,000 tokens per second on standard NVIDIA hardware.
GPT-5.4 nano even undercuts Google’s Gemini 3.1 Flash-Lite on price, but Gemini offers a much wider context window. The right choice depends entirely on your workload. For multimodal work, agentic tasks, and reasoning-heavy tasks with little room for error, GPT-5.4 remains the first choice. But for production pipelines where every millisecond and every cent counts, the mini and nano models open doors that were locked before.
The AI industry is entering a phase where smaller, faster, and smarter models matter just as much as the headline-grabbing flagships. OpenAI’s launch of GPT-5.4 mini and nano is not just a product update. It is a clear bet on a future where AI systems work in teams, where the biggest model does the thinking and the smaller ones do the heavy lifting. For developers, this means more options, more speed, and real savings. For everyday users, it means better AI experiences that do not make you stare at a loading screen. What do you think about these new models? Drop a comment and let us know if you have tried GPT-5.4 mini in ChatGPT yet.