2025-12-19 · Anthropic

Introducing Bloom: an open source tool for automated behavioral evaluations

agents · models · enterprise

Source: Anthropic Research · Date: 2025-12-19 · URL: https://www.anthropic.com/research/bloom

Summary

Bloom is an open-source, four-stage agentic evaluation pipeline (understand → ideate → rollout → judge). The researcher specifies only a behavior description; the system generates scenarios, runs the interactions, and scores the resulting transcripts. Scores from the Claude Opus 4.1 judge reached a 0.86 Spearman correlation with human labels. Bloom separated 9 of 10 intentionally misaligned model organisms from their baselines. It also replicated known system card results and surfaced a new finding: higher reasoning effort reduces self-preferential bias.
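To make the 0.86 figure concrete, here is a minimal sketch of how a Spearman rank correlation between judge scores and human labels can be computed. It is not Bloom's code: the function names and the two score lists are hypothetical, made up purely for illustration, and the calculation uses only the Python standard library.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-transcript scores (illustrative only, not Bloom data).
human_labels = [1, 2, 2, 5, 7, 9, 10, 4]
judge_scores = [2, 1, 3, 5, 8, 9, 9, 3]

rho = spearman(human_labels, judge_scores)
print(f"Spearman correlation: {rho:.2f}")
```

Because Spearman compares rankings rather than raw values, a high coefficient says the judge orders transcripts much as humans do, not that the two assign identical scores.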

Implications

Bloom is the alignment evaluation tooling stack going fully automated and open-source. It is a companion to Petri: Petri explores a model's behavior broadly across many scenarios, while Bloom measures one specified behavior in depth. The 0.86 correlation with human judges is high enough for screening but not for high-stakes safety certification. The link between reasoning effort and reduced self-preferential bias is the novel insight: making the model think harder before responding reduces its tendency to favor itself. Open-sourcing Bloom alongside Petri is a deliberate norm-setting move; Anthropic is building the eval infrastructure that the field will eventually require for any responsible deployment. Watch for Bloom and Petri becoming the evaluation layer that third-party safety auditors adopt.
