Introducing Bloom: an open source tool for automated behavioral evaluations
Source: Anthropic Research
Date: 2025-12-19
URL: https://www.anthropic.com/research/bloom
Summary
Bloom is an open-source, four-stage agentic evaluation pipeline: understand → ideate → rollout → judge. The researcher specifies only a behavior description; the system then generates scenarios, runs the interactions, and scores the resulting transcripts. Scores from a Claude Opus 4.1 judge reached a Spearman correlation of 0.86 with human labels. Bloom distinguished 9 of 10 intentionally misaligned model organisms from their baselines. It also replicated known system card results and surfaced a new finding: higher reasoning effort reduces self-preferential bias.
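To make the four stages concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not Bloom's actual API: every name below (`Model`, `Transcript`, `understand`, `ideate`, `rollout`, `judge`, `run_pipeline`) is hypothetical, each model is reduced to a plain text-in, text-out callable, and the rollout is a single turn for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for any text-in, text-out model call (not Bloom's API).
Model = Callable[[str], str]


@dataclass
class Transcript:
    scenario: str
    turns: List[str]


def understand(behavior: str, evaluator: Model) -> str:
    """Stage 1: expand the behavior description into a working understanding."""
    return evaluator(f"Describe what this behavior looks like in practice: {behavior}")


def ideate(understanding: str, evaluator: Model, n: int) -> List[str]:
    """Stage 2: generate n scenarios intended to elicit the behavior."""
    return [evaluator(f"Scenario {i + 1} for: {understanding}") for i in range(n)]


def rollout(scenario: str, target: Model) -> Transcript:
    """Stage 3: run the target model on a scenario (single turn in this sketch)."""
    reply = target(scenario)
    return Transcript(scenario=scenario, turns=[scenario, reply])


def judge(transcript: Transcript, behavior: str,
          scorer: Callable[[Transcript, str], float]) -> float:
    """Stage 4: score how strongly the transcript exhibits the behavior."""
    return scorer(transcript, behavior)


def run_pipeline(behavior: str, evaluator: Model, target: Model,
                 scorer: Callable[[Transcript, str], float], n: int = 3) -> List[float]:
    """Chain the four stages; the only researcher input is the behavior description."""
    understanding = understand(behavior, evaluator)
    scenarios = ideate(understanding, evaluator, n)
    transcripts = [rollout(s, target) for s in scenarios]
    return [judge(t, behavior, scorer) for t in transcripts]


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model access.
    toy_evaluator: Model = lambda prompt: prompt.upper()
    toy_target: Model = lambda prompt: "I agree with you."
    toy_scorer = lambda t, b: 1.0 if "agree" in t.turns[-1] else 0.0
    print(run_pipeline("sycophancy", toy_evaluator, toy_target, toy_scorer))
```

The point of the shape is that only `behavior` comes from the researcher; scenarios, transcripts, and scores are all derived downstream.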
Implications
With Bloom, alignment evaluation tooling becomes fully automated and open source. It is a companion to Petri, Anthropic's open-source auditing tool: Petri explores model behavior broadly across many scenarios, while Bloom targets a single specified behavior. The 0.86 correlation with human judges is high enough for screening but not for high-stakes safety certification. The novel insight is the link between reasoning effort and self-preferential bias: making the model think harder before responding reduces a specific form of sycophancy. Open-sourcing Bloom alongside Petri is a deliberate norm-setting move, with Anthropic building the eval infrastructure that the field will eventually require for any responsible deployment. Watch for Bloom and Petri becoming the evaluation layer that third-party safety auditors adopt.
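For readers weighing the 0.86 figure, the following is a minimal sketch of how a Spearman rank correlation between judge scores and human labels can be computed. It is a generic implementation (average ranks for ties, then Pearson correlation on the ranks), not code from Bloom, and the example scores are made up purely for illustration.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Illustrative, made-up scores for the same set of transcripts.
    judge_scores = [2, 7, 9, 4, 6, 1]
    human_labels = [1, 8, 9, 5, 4, 2]
    print(round(spearman(judge_scores, human_labels), 3))
```

Because the statistic depends only on rank order, a high value says the judge orders transcripts much as humans do; it does not say the absolute scores match, which is one reason such a figure suits screening better than certification.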