Democratizing AI Safety with RiskRubric.ai
Source: HuggingFace Date: 2025-09-18 URL: https://huggingface.co/blog/riskrubric
Summary
Research initiative: RiskRubric.ai, a standardized AI model risk assessment framework from Cloud Security Alliance, Noma Security, Haize Labs, and Harmonic Security. It scores models on six pillars (Transparency, Reliability, Security, Privacy, Safety, Reputation), each from 0 to 100, and rolls the results up into A-F letter grades. Testing methodology: 1,000+ reliability tests and 200+ adversarial probes. September 2025 findings across the HF model ecosystem: scores range from 47 to 94 (median 81); 54% of models earn an A or B; safety scores track closely with security hardening; stricter guardrails can hurt transparency unless paired with explanatory refusals.
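As a rough illustration of how six 0-100 pillar scores could roll up into a letter grade, here is a minimal Python sketch. The summary above does not state RiskRubric's actual aggregation formula or grade boundaries, so the unweighted mean, the 90/80/70/60 cutoffs, and the names overall_score and letter_grade are all assumptions made for illustration only.

```python
# Pillar names as listed in the summary above.
PILLARS = ("transparency", "reliability", "security",
           "privacy", "safety", "reputation")

# ASSUMPTION: school-style cutoffs. RiskRubric's real boundaries
# are not given in the summary and may differ.
GRADE_CUTOFFS = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))


def overall_score(pillar_scores: dict[str, float]) -> float:
    """Combine six pillar scores into one 0-100 score.

    ASSUMPTION: an unweighted mean. The real framework may weight
    pillars differently or use another roll-up entirely.
    """
    missing = [p for p in PILLARS if p not in pillar_scores]
    if missing:
        raise ValueError(f"missing pillar scores: {missing}")
    for pillar in PILLARS:
        if not 0 <= pillar_scores[pillar] <= 100:
            raise ValueError(f"{pillar} score must be within 0-100")
    return sum(pillar_scores[p] for p in PILLARS) / len(PILLARS)


def letter_grade(score: float) -> str:
    """Map a 0-100 score to A-F using the assumed cutoffs."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


# Made-up scores for a hypothetical model, not real RiskRubric data.
example = {
    "transparency": 70, "reliability": 85, "security": 80,
    "privacy": 90, "safety": 80, "reputation": 81,
}
score = overall_score(example)
print(score, letter_grade(score))
```

Under these assumed cutoffs, the reported median of 81 would land in the B band and the reported minimum of 47 in the F band. The published grades should be taken from RiskRubric itself, not from this sketch.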
Implications
Open-weights ecosystem health. A 47-94 score range with 46% of models in the C-F band (the complement of the 54% earning an A or B) is a meaningful data point: nearly half of the assessed HF models have material security gaps by this framework’s standards. If RiskRubric gains adoption as a procurement standard, it would create selection pressure toward more robustly safety-tested models and against the long tail of minimally tested community releases.
HF as open-source ML hub. By publishing its results as an HF dataset, RiskRubric positions HF as the distribution layer for model risk scores, extending the Hub’s role from weights distribution to risk metadata. If these scores are integrated with model cards, they become a filtering criterion that enterprise buyers can apply at procurement time.