Claude Code and Codex bet on different harnesses. Your team is compounding one of them every week + 2 prompts to audit which.
Tags: security, agents, models, enterprise
Source: Nate’s Newsletter
Date: 2026-03-06
URL: https://natesnewsletter.substack.com/p/same-model-78-vs-42-the-harness-made
Summary
Nate argues that choosing an AI tool is really choosing a harness: the surrounding infrastructure of memory, tool access, and operational environment matters more than the model itself. The same model scores 78% in one harness and 42% in another. Claude Code (full machine access, persistent project memory) and Codex (isolated sandbox, private reasoning, finished-work delivery) represent fundamentally different architectural bets. Teams deepen their dependency on one harness every week, so switching costs escalate as usage scales.
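To make the 78% versus 42% comparison concrete, here is a minimal sketch of how a team might measure a harness gap while holding the model fixed. The `HarnessRun` type, the `harness_gap` helper, the model and harness names, and the task counts are all hypothetical; the counts are chosen only to reproduce the two percentages quoted above, and the source does not say which benchmark produced them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessRun:
    """One benchmark run: a fixed model evaluated inside a named harness."""
    model: str
    harness: str
    tasks_passed: int
    tasks_total: int

    @property
    def pass_rate(self) -> float:
        return self.tasks_passed / self.tasks_total


def harness_gap(a: HarnessRun, b: HarnessRun) -> float:
    """Pass-rate difference in percentage points, holding the model fixed."""
    if a.model != b.model:
        raise ValueError("compare runs of the same model to isolate the harness")
    return round((a.pass_rate - b.pass_rate) * 100, 1)


# Hypothetical task counts, chosen only to reproduce the 78% and 42% figures.
run_a = HarnessRun("model-x", "harness-a", tasks_passed=78, tasks_total=100)
run_b = HarnessRun("model-x", "harness-b", tasks_passed=42, tasks_total=100)

gap = harness_gap(run_a, run_b)
print(f"{gap} percentage points")
```

Refusing to compare runs of different models is the point of the design: any remaining difference in pass rate can then be attributed to the harness rather than the model.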
Implications
- Agent-product positioning thread. Harness-not-model is the correct competitive frame for coding agent products: Claude Code and Codex are not competing on model quality but on architectural philosophy (integrated environment vs. sandboxed isolation). The compounding lock-in dynamic means the decision a team makes today is structurally harder to reverse in six months.
- AI economics thread. The example of a $2B company spending 100% of its revenue on API costs signals that harness economics are systematically underweighted in tool selection. The cost is not just the model API but the entire operational stack, and teams that don’t model this will be surprised when switching costs surface.
- Watch: Whether the harness-evaluation framework becomes standard in enterprise AI tool selection, and whether Anthropic’s or OpenAI’s architectural bet proves more durable as agent workflows mature.