2025-10-03 · Anthropic

Building AI for cyber defenders

Source: Anthropic Research
Date: 2025-10-03
URL: https://www.anthropic.com/research/building-ai-cyber-defenders

Summary

Claude Sonnet 4.5 was deliberately enhanced for cybersecurity defense while avoiding offensive enhancements. It was evaluated on Cybench (CTF challenges) and CyberGym (real open-source software with known and novel vulnerabilities). On Cybench, it reached 76.5% success with 10 trials, up from 35.9% in February. On CyberGym, it discovered new vulnerabilities in 33% of projects, versus 2% for earlier models. The results were validated with HackerOne and CrowdStrike.
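
One way to read "success with 10 trials" is as a pass@k-style metric, where a challenge counts as solved if at least one of k attempts succeeds. The sketch below assumes that reading, which the source does not confirm, and uses the standard unbiased pass@k estimator from the code-generation literature. The function names and the per-challenge counts are hypothetical and are not Cybench data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k attempts succeeds,
    given c successes observed across n attempts on one challenge."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_rate(successes_per_challenge: list[int], n: int, k: int) -> float:
    """Average pass@k over all challenges in a benchmark."""
    scores = [pass_at_k(n, c, k) for c in successes_per_challenge]
    return sum(scores) / len(scores)


# Hypothetical data: successes out of 10 attempts on four challenges.
observed = [0, 1, 10, 3]

print(benchmark_rate(observed, n=10, k=10))  # 0.75
print(benchmark_rate(observed, n=10, k=1))   # roughly 0.35
```

Under this assumption, the same set of runs gives a much higher figure at k=10 than at k=1, so a multi-trial success rate should not be compared directly with a single-attempt rate.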

Implications

This is the cyber defenders thread: a careful asymmetric capability bet. The 76.5% CTF success rate and the discovery of new vulnerabilities in 33% of projects represent a step change in automated vulnerability finding. The deliberate defensive-only framing (enhanced detection, not exploitation) is Anthropic’s attempt to own the “AI for defense” narrative before competitors frame it differently. The asymmetry claim (defenders benefit more than attackers) is politically important but empirically uncertain, because the same vulnerability discovery capability that finds bugs to patch also finds bugs to exploit. Watch for this capability threshold triggering RSP evaluation requirements, and for CrowdStrike and HackerOne integrations signaling that enterprise security is an active commercial push.
