Progress from our Frontier Red Team
Source: Anthropic
Date: 2025-03-19
URL: https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team
Summary
Anthropic published its first public Frontier Red Team report covering cybersecurity and biosecurity capability evaluations. Cybersecurity: CTF performance advanced from high-school to undergraduate level in one year, and the Cybench score rose to ~33% (up from ~5%). Biosecurity: virology troubleshooting and cloning workflows now exceed human expert baselines, and models provide “some uplift” to novices on weaponization tasks. Risk assessment: “early warning signs” of dual-use capability growth, but still below ASL-3 thresholds. Partners: US NIST, UK AISI, Carnegie Mellon, SecureBio.
Implications
- Safety/research: the most honest Anthropic capability disclosure to date. The Frontier Red Team report is unusually specific, with named benchmarks, quantified improvement rates, and partner institutions. The “virology troubleshooting exceeds expert baselines” finding was the most alarming claim Anthropic had published about its own models up to that point.
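- Cybench 5%→33% in one year. The trajectory (roughly a 6.6x improvement on CTF challenges in 12 months) is part of the public technical evidence base that preceded the ASL-3 activation two months later (May 2025). That activation was framed around CBRN-related capabilities rather than cyber capabilities, so the Cybench number is context for the policy decision, not its stated trigger.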
- “Some uplift” to novices on weaponization. The careful framing (“some” uplift, “critical failures remain”) is important: it acknowledges risk without claiming catastrophic capability. The nuance is load-bearing, since a more alarming framing would be irresponsible and a more dismissive one would be incorrect.
- Government partners named. Naming US NIST and UK AISI as partners gives the evaluation methodology credibility and establishes the government-academic-industry collaboration pattern that the subsequent CISA collaboration formalizes.
- Watch: the next Frontier Red Team report and whether the trajectory continued; whether the weaponization “some uplift” finding tightened or widened in subsequent evaluations; how the Carnegie Mellon CTF benchmark scores evolved.
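The fold change implied by the Cybench figures above (~5% to ~33%) can be checked with a short calculation. This is only a minimal sketch: both inputs are the rounded, approximate percentages quoted in the summary, and the helper name `fold_change` is a hypothetical one introduced here for illustration.

```python
# Approximate Cybench solve rates quoted in the summary, in percent.
# Both values are rounded figures, so the result is approximate too.
before_pct = 5
after_pct = 33


def fold_change(before, after):
    """Return how many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return after / before


fold = fold_change(before_pct, after_pct)
point_gain = after_pct - before_pct

print(f"fold change: {fold:.1f}x")                 # fold change: 6.6x
print(f"gain: {point_gain} percentage points")     # gain: 28 percentage points
```

Reporting the fold change and the percentage-point gain together avoids overstating a jump from a small baseline: the same 28-point gain from a 50% baseline would be well under 2x.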