2026-03-23 · Anthropic

Long-running Claude for scientific computing

agents · models · infrastructure



Source: Anthropic Research · Date: 2026-03-23 · URL: https://www.anthropic.com/research/long-running-Claude

Summary

Multi-day agentic workflow: Claude Opus 4.6 implements a differentiable cosmological Boltzmann solver in JAX on an HPC cluster and reaches sub-percent agreement with the reference CLASS implementation. Infrastructure: CLAUDE.md states the goals, CHANGELOG.md tracks progress, unit tests use the reference implementation as an oracle, each work unit ends in a git commit, and a “Ralph loop” pattern keeps the agent from abandoning the task prematurely. For comparison, a researcher would typically need months to years for this work.
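To make the oracle idea concrete, here is a minimal sketch of a sub-percent agreement check in plain Python. It is not the harness from the post: the function names, the 1% default tolerance, and the toy curves are all assumptions for illustration, and a real test would compare solver output against values produced by CLASS.

```python
import math


def max_relative_error(candidate, reference, floor=1e-30):
    """Largest pointwise relative deviation of candidate from reference."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    # `floor` guards against division by zero where the reference vanishes.
    return max(
        abs(c - r) / max(abs(r), floor)
        for c, r in zip(candidate, reference)
    )


def agrees_with_oracle(candidate, reference, tolerance=0.01):
    """True when every point lies within `tolerance` (1% by default) of the oracle."""
    return max_relative_error(candidate, reference) < tolerance


# Hypothetical stand-ins: a smooth, strictly positive reference curve
# and a candidate that is uniformly 0.3% too high.
reference = [math.exp(-0.01 * k) * (1.0 + 0.5 * math.cos(0.1 * k)) for k in range(100)]
candidate = [1.003 * r for r in reference]

print(agrees_with_oracle(candidate, reference))
```

A check like this gives the agent an unambiguous pass/fail signal after every change, which is what lets the reference implementation serve as an oracle.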

Implications

This makes the long-horizon scientific agent thread concrete. The key design choices (oracle-driven testing, persistent progress files, structured failure tracking) form the scaffolding that makes multi-day autonomous work viable. That the “Ralph loop” pattern now has a name suggests it is becoming a reusable infrastructure primitive. The cosmological Boltzmann solver is a non-trivial benchmark: it requires domain-correct physics, not just syntactically valid code. Watch for more domain-specific demonstrations of this kind on Anthropic’s science blog, and for these scaffolding patterns to influence Claude Code’s default behavior on long-running tasks.
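One way to picture how a persistent progress file and a retry loop fit together is the sketch below. It assumes the loop simply re-invokes a work step until a completion check passes; the post's actual loop may differ. `run_until_done`, `toy_step`, and the log-entry format are hypothetical names invented here, and the "agent" is a toy counter standing in for failing tests.

```python
from pathlib import Path
import tempfile


def run_until_done(step, is_done, changelog, max_iterations=50):
    """Call `step` repeatedly until `is_done()` holds, logging every iteration."""
    completed = 0
    while not is_done():
        if completed >= max_iterations:
            raise RuntimeError("iteration budget exhausted before completion")
        completed += 1
        note = step(completed)
        # Append-only log: progress survives even if a later iteration crashes.
        with changelog.open("a", encoding="utf-8") as fh:
            fh.write(f"- iteration {completed}: {note}\n")
    return completed


# Toy stand-in for an agent: each step fixes one of three failing tests.
failing = {"count": 3}


def toy_step(iteration):
    failing["count"] -= 1
    return f"{failing['count']} tests still failing"


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "CHANGELOG.md"
    iterations = run_until_done(toy_step, lambda: failing["count"] == 0, log)
    entries = log.read_text(encoding="utf-8").splitlines()

print(iterations, entries[-1])
```

The design point is that the stopping condition lives outside the agent: the loop ends when the external check passes or the budget runs out, not when the agent decides it is finished.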
