2025-08-15 · Where's Your Ed At

How Does GPT-5 Work?

pricing · models

Source: Where’s Your Ed At
Date: 2025-08-15
URL: https://www.wheresyoured.at/how-does-gpt-5-work/

Summary

Ed lays out the architectural reality behind GPT-5’s routing system. Rather than the “smart, efficient” product OpenAI markets, the dynamic model-selection router burns upwards of double the tokens per query compared with its predecessors, because it must evaluate every prompt before applying static instructions, and that forces a full system reset with each message. The sourcing is infrastructure insiders. Ed’s conclusion: OpenAI shipped a rushed, overcomplicated product whose actual cost structure contradicts its efficiency marketing.
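
To make the token arithmetic concrete, here is a minimal sketch of why reprocessing static instructions on every message inflates the token count. It is an illustration under stated assumptions, not a model of OpenAI's actual system: the function name, the token counts, and the idea that the alternative is "process the static instructions once and reuse them" are all hypothetical, and conversation history is ignored for simplicity.

```python
# Illustrative only: every name and number here is a hypothetical placeholder,
# not a measurement of GPT-5 or any OpenAI system.

def tokens_processed(static_tokens: int, message_tokens: list[int], reuse_static: bool) -> int:
    """Total input tokens processed across the messages of one conversation.

    reuse_static=True  -> static instructions are processed once, then reused.
    reuse_static=False -> a full reset: static instructions are reprocessed
                          alongside every single message.
    """
    total = 0
    for i, msg in enumerate(message_tokens):
        if reuse_static and i > 0:
            total += msg                  # only the new message is processed
        else:
            total += static_tokens + msg  # static instructions processed again
    return total


# Hypothetical conversation: 1,000 static tokens, four 1,000-token messages.
messages = [1000] * 4
with_reuse = tokens_processed(1000, messages, reuse_static=True)   # 5000
with_reset = tokens_processed(1000, messages, reuse_static=False)  # 8000
print(with_reset / with_reuse)  # 1.6
```

In this toy setup the reset case processes 1.6 times as many tokens over four messages, and the ratio approaches 2 as the conversation gets longer, because the static instructions are the same size as each message. A different ratio of static to message tokens would give a different multiplier.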

Implications

AI financial sustainability. Double token burn per query means OpenAI’s serving costs are materially higher than its pricing and public framing suggest, which puts direct pressure on the path to profitability.
Generative AI ROI critique. If the leading model is architecturally inefficient and OpenAI can’t acknowledge it, the enterprise integrators pricing AI into their workflows are building on false cost assumptions.
Vendor BS detection. “Efficiency” as marketing while infrastructure sources say the opposite is the canonical Ed story. Watch for the same pattern in subsequent OpenAI model launches.
