The Complete Guide to ChatGPT o3: Includes Tests vs. Gemini 2.5 Pro, Performance at ~12 Job Skills, OpenAI Notes Review, Links to Other Reviews
models, commentary
Source: Nate’s Newsletter
Date: 2025-04-17
URL: https://natesnewsletter.substack.com/p/the-complete-guide-to-chatgpt-o3
Summary
Nate positions ChatGPT o3 as “the world’s most generally useful everyday model” as of April 2025, based on tests against Gemini 2.5 Pro across ~12 job skills. His case is nuanced: model capabilities are converging on routine tasks, but significant performance gaps persist on demanding work, so model selection matters most at the high end. He calls AGI claims “definitely overrated” (o3 failed to write his article), but considers the capability leap on difficult analytical work genuine.
Implications
- AI product positioning thread. The “models converge at the median, diverge at the frontier” pattern has direct product implications: for most users and most tasks, differences in model quality don’t matter; for power users and demanding tasks, they matter enormously. AI products need to serve both segments with appropriate model routing.
- AI economics thread. “Using the best model across most tasks makes economic sense because elite models deliver measurable advantages on your most impactful work.” This is the ROI case for frontier model subscriptions: the premium is justified when the work is consequential.
- Watch: How the o3/Gemini 2.5 Pro competitive gap evolved through Q2-Q3 2025 as subsequent model releases reshuffled the leaderboard.
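
As a rough illustration of the model-routing idea in the product positioning point, here is a minimal Python sketch. It is not from the source article. The tier names, the `Task` fields, and the 0.6 threshold are all hypothetical, and it assumes difficulty and stakes scores are estimated somewhere upstream.

```python
from dataclasses import dataclass

# Hypothetical tier labels; placeholders, not real API model identifiers.
ROUTINE_MODEL = "routine-model"
FRONTIER_MODEL = "frontier-model"


@dataclass
class Task:
    description: str
    difficulty: float  # 0.0 (trivial) to 1.0 (very demanding); assumed to be scored upstream
    stakes: float      # 0.0 (low impact) to 1.0 (highly consequential); also assumed


def route(task: Task, threshold: float = 0.6) -> str:
    """Send a task to the frontier tier when difficulty or stakes reaches the threshold."""
    if max(task.difficulty, task.stakes) >= threshold:
        return FRONTIER_MODEL
    return ROUTINE_MODEL


if __name__ == "__main__":
    print(route(Task("Summarize a short email", difficulty=0.2, stakes=0.1)))
    print(route(Task("Analyze a contract dispute", difficulty=0.8, stakes=0.9)))
```

Routing on the higher of the two scores reflects the economics point above: an easy task can still justify the premium model when the outcome is consequential.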