2024-09-12 · OpenAI

Introducing OpenAI o1

models · infrastructure

Source: OpenAI · Date: 2024-09-12 · URL: https://openai.com/index/introducing-openai-o1-preview

Summary

OpenAI’s launch announcement for o1-preview, the first publicly released model in its “reasoning” series, which produces an extended chain of thought before responding. Launched in September 2024, o1-preview showed sharply improved performance on math, coding, and science benchmarks, particularly competition math (AIME) and PhD-level science questions (GPQA), by spending compute at inference time rather than only at training time. It launched alongside o1-mini, a faster, cheaper reasoning-optimized variant.

Implications

The inference-compute shift. o1 is the clearest marker of the transition from “smarter training” to “smarter thinking at runtime.” This changes the cost structure of AI dramatically: instead of paying for a larger model, you pay for more compute per query. It also changes latency expectations, because o1 is slow by design. The competitive implications took months to land: Anthropic’s extended thinking in Claude 3.7 and DeepSeek’s R1 were both direct responses to this paradigm.
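
A rough way to see the cost shift is per-query arithmetic. The sketch below is illustrative only: the function name, token counts, and per-million-token prices are all hypothetical, and it assumes hidden reasoning tokens are billed at the output-token rate. It holds the model's prices fixed and varies only how much the model "thinks" before answering.

```python
def query_cost(input_tokens, output_tokens, reasoning_tokens,
               input_price_per_m, output_price_per_m):
    """Dollar cost of one query.

    Assumption: reasoning tokens are billed like output tokens.
    Prices are expressed in dollars per million tokens.
    """
    billed_output = output_tokens + reasoning_tokens
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000


# Hypothetical prices and token counts, chosen only to show the shape of the tradeoff.
INPUT_PRICE = 5.0    # dollars per 1M input tokens (made up)
OUTPUT_PRICE = 15.0  # dollars per 1M output tokens (made up)

# Same prompt and same visible answer length; only the reasoning budget differs.
baseline = query_cost(1_000, 500, 0, INPUT_PRICE, OUTPUT_PRICE)
reasoning = query_cost(1_000, 500, 4_000, INPUT_PRICE, OUTPUT_PRICE)

print(f"no reasoning:   ${baseline:.4f}")
print(f"with reasoning: ${reasoning:.4f}")
print(f"ratio:          {reasoning / baseline:.1f}x")
```

With these made-up numbers, the same question costs about 5.8 times as much once 4,000 reasoning tokens are added, even though the visible answer is the same length. The per-token price never changed; the per-query compute did.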

Benchmark redefinition. AIME and GPQA became the new standard evaluation surface after o1. Every subsequent model announcement in late 2024 and 2025 was measured against o1’s scores on these benchmarks, which is exactly what OpenAI wanted. OpenAI set the eval, then released GPT-5.5 partly to retire that surface once it saturated.

The o-series thread. o1 → o1-mini → o3 → o4-mini → o3-pro is one of the two major OpenAI product tracks through 2025–2026. Understanding o1’s launch is the starting point for that thread.

Watch: How reasoning-model pricing evolves as the inference-compute tradeoff becomes the primary axis of model differentiation.
