Announcing our updated Responsible Scaling Policy
Source: Anthropic
Date: 2024-10-15
URL: https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy
Summary
Anthropic published RSP v2 (October 2024), introducing a more flexible risk governance framework. The update adds two explicit capability thresholds that trigger enhanced safeguards: autonomous AI research and development, and assistance with chemical, biological, radiological, and nuclear (CBRN) weapons. It also introduces routine capability and safeguard assessments using safety case methodology, and acknowledges minor procedural shortfalls under the prior RSP. Co-founder Jared Kaplan was named Responsible Scaling Officer, and a Head of Responsible Scaling role was opened.
Implications
- Safety/policy posture thread. RSP v2 is the first significant revision, arriving after one year of implementation. Incorporating “practical insights” suggests that real gaps were found in the original. The acknowledgment of “falling short of previous requirements in minor procedural ways” is a rare piece of self-criticism in a public document.
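- Two capability thresholds codified. Autonomous AI R&D and CBRN assistance are now explicit, named triggers for upgraded safeguards rather than implied ones. Per the announcement, the CBRN threshold calls for ASL-3 safeguards, while the autonomous AI R&D threshold calls for elevated security standards (potentially ASL-4 or higher) plus additional safety assurances. This gives Anthropic and external evaluators a clearer test to apply. It also limits Anthropic’s flexibility to define the thresholds post hoc.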
- Jared Kaplan as RSO. Appointing Kaplan (co-founder, scaling laws pioneer) as Responsible Scaling Officer is a significant internal governance signal: someone with deep knowledge of what scaling produces is now accountable for deciding when that scaling triggers safety escalation.
- Safety case methodology. Importing safety case methodology from high-stakes engineering (nuclear, aviation) into AI scaling decisions is a substantive methodological choice: it requires an explicit argument that safety is maintained, not just the absence of identified harms.
- Watch: whether the “minor procedural shortfalls” are disclosed in detail; how the autonomous AI R&D threshold is defined operationally; whether Kaplan’s RSO role produces public communications about specific model evaluations.