2024-12-12 · Anthropic

Clio: Privacy-preserving insights into real-world AI use

Source: Anthropic Research
Date: 2024-12-12
URL: https://www.anthropic.com/research/clio

Summary

Clio is a four-stage, Claude-powered pipeline for privacy-preserving analysis of production conversations. It extracts facets from each conversation, clusters conversations semantically, generates cluster summaries that exclude private details, and organizes the clusters into hierarchical topics. Anthropic applied it to 1M Claude.ai conversations. Top use cases: web and mobile development (more than 10%), education (7%), and business strategy (about 6%). Clio also discovered coordinated misuse patterns that are invisible in individual conversations, as well as false negatives in safety classifiers: violations obscured by framing them as translation requests, which bypassed the classifier.
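The four stages can be sketched as a toy pipeline. This is a minimal illustration, not Anthropic's implementation: in Clio the stages are performed with Claude, whereas here each stage is replaced by a simple stand-in (regex scrubbing, bag-of-words cosine similarity, word-count labels, keyword lookup). Every name below (`extract_facet`, `cluster`, `summarize`, `assign_parent`, `run_pipeline`, `STOPWORDS`, `PARENT_KEYWORDS`, the 0.3 threshold) is hypothetical and chosen only for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical word lists for this toy example only.
STOPWORDS = {"a", "an", "the", "my", "i", "in", "on", "for", "with",
             "from", "how", "do", "help"}
PARENT_KEYWORDS = {
    "Software development": {"app", "bug", "code"},
    "Education": {"homework", "essay"},
}

def extract_facet(conversation):
    """Stage 1 stand-in: drop email addresses, keep a bag of content words."""
    text = re.sub(r"\S+@\S+", " ", conversation.lower())
    words = re.findall(r"[a-z]+", text)
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster(facets, threshold=0.3):
    """Stage 2 stand-in: greedy single-pass clustering on running centroids."""
    clusters = []  # each entry: (centroid Counter, list of member indices)
    for i, facet in enumerate(facets):
        best = max(clusters, key=lambda c: cosine(c[0], facet), default=None)
        if best is not None and cosine(best[0], facet) >= threshold:
            best[0].update(facet)  # fold the facet into the centroid
            best[1].append(i)
        else:
            clusters.append((Counter(facet), [i]))
    return clusters

def summarize(centroid, k=2):
    """Stage 3 stand-in: label a cluster by its k most frequent words."""
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(word for word, _ in ranked[:k])

def assign_parent(centroid):
    """Stage 4 stand-in: attach a cluster to the best-matching parent topic."""
    words = set(centroid)
    best = max(PARENT_KEYWORDS, key=lambda p: len(PARENT_KEYWORDS[p] & words))
    return best if PARENT_KEYWORDS[best] & words else "Other"

def run_pipeline(conversations):
    """Return {parent topic: [(cluster label, cluster size), ...]}."""
    facets = [extract_facet(c) for c in conversations]
    hierarchy = defaultdict(list)
    for centroid, members in cluster(facets):
        hierarchy[assign_parent(centroid)].append(
            (summarize(centroid), len(members)))
    return dict(hierarchy)

# Invented sample conversations, not real data.
SAMPLE = [
    "From bob@example.com: fix a bug in my React app",
    "How do I fix a layout bug in my React app",
    "Explain photosynthesis for my biology homework",
    "Help with biology homework on cell division",
]

if __name__ == "__main__":
    print(run_pipeline(SAMPLE))
```

The point of the sketch is the data flow: only aggregate labels and counts leave the pipeline, while raw conversation text and identifiers such as the email address are discarded at the facet stage.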

Implications

Clio is the production monitoring infrastructure underlying the Economic Index and Values in the Wild research programs. The translation-based classifier bypass is the most operationally important finding: attackers are routing requests through translation to evade safety classifiers, and Clio found this because it looks at aggregate patterns rather than individual conversations. This is a surveillance vs. privacy tradeoff that Anthropic is making explicit and claiming to have solved. Coordinated misuse detection is the other key capability: individual conversations look benign, but patterns across thousands of them reveal campaigns. Watch for Clio becoming the backbone of Anthropic’s ongoing safety monitoring, and for its privacy-preserving methodology being cited in regulatory discussions about AI oversight obligations.
