2025-03-19 · Anthropic

Progress from our Frontier Red Team

Source: Anthropic
Date: 2025-03-19
URL: https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team

Summary

Anthropic published its first public Frontier Red Team report, covering cybersecurity and biosecurity capability evaluations. Cybersecurity: model performance on CTF challenges advanced from high-school to undergraduate level in one year; the Cybench score reached ~33% (up from ~5%). Biosecurity: model performance on virology troubleshooting and on cloning workflows now exceeds human expert baselines; models provide “some uplift” to novices on weaponization tasks. Risk assessment: “early warning signs” of dual-use capability growth, but still below ASL-3 thresholds. Partners: US NIST, UK AISI, Carnegie Mellon, SecureBio.

Implications

  • Safety/research: the most honest Anthropic capability disclosure to date. The Frontier Red Team report is unusually specific: it names benchmarks, quantifies improvement rates, and lists partner institutions. The “virology troubleshooting exceeds expert baselines” finding was, at publication, the most alarming claim Anthropic had published about its own models.
  • Cybench 5%→33% in one year. The trajectory (a roughly sixfold improvement on Cybench CTF challenges in 12 months) is the kind of number that helps explain the ASL-3 activation two months later (May 2025). The report is part of the public technical evidence base for that policy decision.
  • “Some uplift” to novices on weaponization. The careful framing (“some” uplift, “critical failures remain”) is important because it acknowledges risk without claiming catastrophic capability. The nuance is load-bearing: overstating the risk would be irresponsible, and dismissing it would be incorrect.
  • Government partners named. Naming US NIST and UK AISI as partners gives the evaluation methodology credibility and establishes the government-academic-industry collaboration pattern that the subsequent CISA collaboration formalizes.
  • Watch: the next Frontier Red Team report and whether the trajectory continued; whether the weaponization “some uplift” finding tightened or widened in subsequent evaluations; how the Carnegie Mellon CTF benchmark scores evolved.
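
The Cybench figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the approximate scores given in the summary (~5% and ~33%); the function names `fold_change` and `point_gain` are hypothetical helpers introduced for illustration and do not come from the report.

```python
def fold_change(before_pct: float, after_pct: float) -> float:
    """Ratio of the later score to the earlier score."""
    return after_pct / before_pct


def point_gain(before_pct: float, after_pct: float) -> float:
    """Absolute change in percentage points."""
    return after_pct - before_pct


# Approximate Cybench scores as quoted in the summary (assumed values).
before, after = 5.0, 33.0

print(f"fold change: {fold_change(before, after):.1f}x")
print(f"gain: {point_gain(before, after):.0f} percentage points")
```

With these rounded inputs the ratio comes out at about 6.6x, consistent with the approximately sixfold figure quoted above, and the absolute gain is 28 percentage points. Because both inputs are approximations, the ratio should be read as an order-of-magnitude indicator rather than a precise measurement.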
