2024-09-06 · Anthropic

Circuits Updates – August 2024

research

Source: Anthropic Research
Date: 2024-09-06
URL: https://www.anthropic.com/research/circuits-updates-august-2024

Summary

The August 2024 installment of the monthly Circuits Updates: preliminary experiments and developing ideas from Anthropic’s interpretability team. Full content is at transformer-circuits.pub (access restricted). The team characterizes these updates as lab meeting notes rather than mature findings, covering ideas it expects to publish on in future months.

Implications

Part of the interpretability team's regular mechanistic interpretability cadence. August 2024 falls after the Towards Monosemanticity paper (October 2023) and the Scaling Monosemanticity paper (May 2024), and before the circuit-tracing tools were open-sourced (2025), placing this in the scaling and application phase of the sparse autoencoder (SAE) program. Items in this window likely concern scaling SAEs to larger models, feature geometry, or groundwork for later circuit-tracing results in production models. Track against subsequent full papers to see which ideas from this period solidified.
