2024-07-18 · OpenAI

GPT-4o mini: advancing cost-efficient intelligence

pricingmodelsinfrastructure

GPT-4o mini: advancing cost-efficient intelligence

Source: OpenAI Date: 2024-07-18 URL: https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence

Summary

OpenAI’s July 2024 launch of GPT-4o mini — a significantly cheaper and faster variant of GPT-4o designed to replace GPT-3.5 Turbo as the default small model in the API. GPT-4o mini was priced at $0.15 per million input tokens (vs. GPT-3.5 Turbo’s $0.50), made GPT-4o’s vision capabilities available at small-model prices, and outperformed GPT-3.5 Turbo on most benchmarks. It became the immediate default recommendation for high-volume, latency-sensitive applications.

Implications

The small model tier gets competitive. GPT-4o mini’s launch put pressure on every small model offering — Anthropic’s Claude Haiku, Google’s Gemini Flash, and Mistral Small. The combination of GPT-4o quality at small-model prices, with vision included, made it extremely difficult for competitors to match the value proposition. The price point ($0.15/1M tokens) also pushed the industry toward sub-$0.20 pricing for capable small models.

Replacing GPT-3.5 Turbo. The explicit positioning as a GPT-3.5 Turbo replacement closed off a pricing segment that many applications had been built around. Developers with GPT-3.5 Turbo integrations could upgrade to 4o mini with better performance at lower cost — a genuine upgrade path with no downside, which accelerated migration.

Democratizing multimodality. Vision at small-model prices means every application can include image understanding without paying premium rates. This unlocked a wave of vision-enabled applications that were cost-prohibited at GPT-4V pricing.

Watch: Whether GPT-4o mini maintained its price/performance leadership as Claude Haiku 3.5 and Gemini Flash 2.0 launched in subsequent months, and whether the small model tier continued to compress in price through 2025.

← all signals