How Does GPT-5 Work?
Source: Where’s Your Ed At
Date: 2025-08-15
URL: https://www.wheresyoured.at/how-does-gpt-5-work/
Summary
Ed exposes the architectural reality behind GPT-5’s routing system. Rather than the “smart, efficient” product OpenAI markets, the dynamic model-selection router burns upwards of double the tokens per query compared with its predecessors. The stated reason is that the router must evaluate every prompt before any static instructions are applied, which forces a full system reset on each message. The claims come from infrastructure insiders. Ed’s conclusion: OpenAI shipped a rushed, overcomplicated product whose actual cost structure contradicts its efficiency marketing.
Implications
- AI financial sustainability. Double token burn per query means OpenAI’s serving costs are materially higher than its pricing and public framing suggest, which puts direct pressure on the path to profitability.
- Generative AI ROI critique. If the leading model is architecturally inefficient and OpenAI can’t acknowledge it, the enterprise integrators pricing AI into their workflows are building on false cost assumptions.
- Vendor BS detection. “Efficiency” as marketing while infrastructure sources say the opposite is the canonical Ed story. Watch for the same pattern in subsequent OpenAI model launches.
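The cost pressure in the first implication can be made concrete with a little arithmetic. The sketch below is illustrative only: the token count, per-token compute cost, and revenue per query are hypothetical placeholders, not figures from the article or from OpenAI. The only input taken from the source is the roughly 2x token multiplier, and the flat per-token cost model is itself a simplifying assumption.

```python
def serving_cost(tokens_per_query: float, cost_per_million_tokens: float) -> float:
    """Dollar cost to serve one query at a flat per-token compute cost."""
    return tokens_per_query / 1_000_000 * cost_per_million_tokens


def margin_per_query(revenue: float, tokens_per_query: float,
                     cost_per_million_tokens: float) -> float:
    """Revenue minus serving cost for a single query."""
    return revenue - serving_cost(tokens_per_query, cost_per_million_tokens)


# HYPOTHETICAL placeholder inputs, chosen only to make the arithmetic visible.
BASELINE_TOKENS = 2_000          # assumed tokens per query on a predecessor model
COST_PER_MILLION_TOKENS = 5.00   # assumed internal compute cost, in dollars
REVENUE_PER_QUERY = 0.015        # assumed revenue attributed to one query, in dollars

# From the source: the router burns "upwards of double" the tokens per query.
TOKEN_MULTIPLIER = 2.0

baseline_margin = margin_per_query(
    REVENUE_PER_QUERY, BASELINE_TOKENS, COST_PER_MILLION_TOKENS
)
routed_margin = margin_per_query(
    REVENUE_PER_QUERY, BASELINE_TOKENS * TOKEN_MULTIPLIER, COST_PER_MILLION_TOKENS
)

print(f"baseline margin per query: ${baseline_margin:+.4f}")
print(f"routed margin per query:   ${routed_margin:+.4f}")
```

With these placeholder inputs, the revenue that covers the baseline query no longer covers the routed one. The magnitudes are invented; the point is the direction of the effect when token use doubles and pricing stays fixed.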