Long-running Claude for scientific computing
Source: Anthropic Research
Date: 2026-03-23
URL: https://www.anthropic.com/research/long-running-Claude
Summary
Multi-day agentic workflow: Claude Opus 4.6 implements a differentiable cosmological Boltzmann solver in JAX on an HPC cluster and reaches sub-percent agreement with the reference CLASS implementation. The supporting infrastructure is a CLAUDE.md file for goals, a CHANGELOG.md file for progress tracking, unit tests that use the reference implementation as an oracle, one git commit per unit of work, and a “Ralph loop” pattern to prevent premature abandonment. A researcher would typically need months to years for this work.
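The oracle idea can be illustrated with a minimal sketch: compare the new solver's output against the reference output point by point and pass only when the worst relative error stays under one percent. The function names, the sample numbers, and the 1% threshold as a hard cutoff are illustrative assumptions, not details taken from the source.

```python
from typing import Sequence


def max_relative_error(candidate: Sequence[float], reference: Sequence[float]) -> float:
    """Return the largest |candidate - reference| / |reference| over all points."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    worst = 0.0
    for c, r in zip(candidate, reference):
        if r == 0.0:
            raise ValueError("reference value is zero; relative error is undefined")
        worst = max(worst, abs(c - r) / abs(r))
    return worst


def agrees_with_oracle(
    candidate: Sequence[float],
    reference: Sequence[float],
    rel_tol: float = 0.01,  # assumed reading of "sub-percent agreement"
) -> bool:
    """True when every point is within rel_tol of the reference (oracle) value."""
    return max_relative_error(candidate, reference) < rel_tol


if __name__ == "__main__":
    # Hypothetical numbers standing in for reference-solver and new-solver output.
    reference_output = [1.0, 2.0, 4.0]
    candidate_output = [1.004, 1.99, 4.02]
    print(agrees_with_oracle(candidate_output, reference_output))
```

A check like this gives the agent an unambiguous pass/fail signal for each unit of work, which is what lets the reference implementation act as an oracle rather than a loose guide.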
Implications
This makes the long-horizon scientific agent thread concrete. The key design choices (oracle-driven testing, persistent progress files, structured failure tracking) form the scaffolding pattern that makes multi-day autonomous work viable. That the “Ralph loop” pattern has a name suggests it is becoming a reusable infrastructure primitive. The cosmological Boltzmann solver is a non-trivial benchmark: it requires domain-correct physics, not just syntactically valid code. Watch for more domain-specific demonstrations like this on Anthropic’s science blog, and for these scaffolding patterns to influence Claude Code’s default behaviors for long-running tasks.
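The source names the “Ralph loop” without spelling out its mechanics, so the following is only one plausible reading: a driver that keeps re-invoking the agent with the goals file and the progress log until an external completion check passes, instead of accepting the agent's first claim of being finished. Every name here (`run_until_done`, `agent_step`, `is_done`, the iteration budget) is hypothetical.

```python
from pathlib import Path
from typing import Callable


def run_until_done(
    agent_step: Callable[[str], None],  # hypothetical: runs one agent session on a prompt
    is_done: Callable[[], bool],        # hypothetical: e.g. "all oracle tests pass"
    goals_file: Path,                   # plays the role of CLAUDE.md
    changelog_file: Path,               # plays the role of CHANGELOG.md
    max_iterations: int = 50,
) -> int:
    """Re-invoke the agent until is_done() passes; return the number of sessions run."""
    for iteration in range(max_iterations):
        if is_done():
            return iteration
        goals = goals_file.read_text()
        progress = changelog_file.read_text() if changelog_file.exists() else ""
        # Each session starts from the persistent files, not from in-memory state.
        agent_step(goals + "\n\nProgress so far:\n" + progress)
        with changelog_file.open("a") as log:
            log.write(f"- session {iteration + 1} finished\n")
    if is_done():
        return max_iterations
    raise RuntimeError(f"not done after {max_iterations} sessions")
```

The design choice worth noting is that completion is decided by `is_done`, a check outside the agent, and that all state carried between sessions lives in files, so a session that stops early costs only one iteration.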