Introducing Storage Buckets on the Hugging Face Hub
Source: Hugging Face
Date: 2026-03-10
URL: https://huggingface.co/blog/storage-buckets
Summary
Platform feature release: Hugging Face Hub Storage Buckets, an S3-like mutable object store built on Xet’s chunk-based deduplication backend and designed for ML artifacts that don’t need versioning (checkpoints, optimizer states, logs, agent traces). Buckets are accessible via the Hub web UI, the CLI, or the Python SDK (v1.5.0+). Data can be pre-warmed to AWS/GCP compute regions. Enterprise billing is based on deduplicated storage, not raw bytes. The post gives no benchmark numbers; deduplication is the core efficiency claim, since successive checkpoints with frozen layers share chunks.
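The chunk-sharing claim can be made concrete with a toy content-addressed store. This is a minimal sketch, not Xet’s implementation: it uses fixed-size chunks and SHA-256 digests for simplicity, and `DedupStore`, `CHUNK_SIZE`, and the fake checkpoint bytes are all hypothetical names and values invented for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative only; not Xet's real chunk size


class DedupStore:
    """Toy content-addressed store: each unique chunk is kept once."""

    def __init__(self) -> None:
        self.chunks: dict[str, int] = {}  # digest -> chunk length in bytes

    def put(self, blob: bytes) -> int:
        """Store a blob and return how many new bytes were actually added."""
        new_bytes = 0
        for start in range(0, len(blob), CHUNK_SIZE):
            piece = blob[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = len(piece)
                new_bytes += len(piece)
        return new_bytes

    @property
    def stored_bytes(self) -> int:
        """Total bytes held after deduplication."""
        return sum(self.chunks.values())


def fake_weights(label: bytes, n_chunks: int) -> bytes:
    """Deterministic pseudo-random bytes standing in for serialized tensors."""
    return hashlib.shake_256(label).digest(n_chunks * CHUNK_SIZE)


# Two hypothetical checkpoints: the frozen backbone is identical,
# only the small trainable head differs between steps.
backbone = fake_weights(b"frozen-backbone", 16)
checkpoint_1 = backbone + fake_weights(b"head-step-1", 2)
checkpoint_2 = backbone + fake_weights(b"head-step-2", 2)

store = DedupStore()
added_1 = store.put(checkpoint_1)  # every chunk is new
added_2 = store.put(checkpoint_2)  # only the head chunks are new
raw_bytes = len(checkpoint_1) + len(checkpoint_2)

print(f"raw: {raw_bytes} bytes, stored after dedup: {store.stored_bytes} bytes")
print(f"second checkpoint added only {added_2} new bytes")
```

In this sketch the second checkpoint costs only the bytes of its changed head, which is the behavior that billing on deduplicated storage would reward. Fixed-size chunking breaks down when bytes are inserted mid-file and shift every later chunk; production systems commonly use content-defined chunking to avoid that, and the summary does not say which scheme Xet applies here.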
Implications
HF as open-source ML hub. Storage Buckets fills the gap between “model/dataset repos” (versioned, backed by Git LFS or Xet) and “raw compute storage”: until now, artifacts that change continuously during a training run had no natural home on the Hub. This positions HF as viable infrastructure for the full ML training loop, not just model distribution.
Open-weights ecosystem health. Xet’s deduplication means checkpoint storage costs scale with the size of the delta between checkpoints rather than with full checkpoint size, a meaningful saving for teams that train large models and save checkpoints frequently. The fsspec integration makes Buckets usable as a drop-in storage backend for any Python ML code that reads and writes through fsspec-compatible paths, lowering the integration barrier.
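The drop-in property can be illustrated with code that only knows about fsspec URLs. This is a sketch under stated assumptions: it runs against fsspec’s built-in in-memory filesystem as a stand-in, `save_metrics` and `load_metrics` are hypothetical helpers, and the exact URL scheme for a real bucket is not given in this summary, so none is hard-coded.

```python
import json

import fsspec  # third-party: pip install fsspec


def save_metrics(url: str, metrics: dict) -> None:
    """Write metrics as JSON to any fsspec-supported URL."""
    with fsspec.open(url, "w") as f:
        json.dump(metrics, f)


def load_metrics(url: str) -> dict:
    """Read JSON metrics back from any fsspec-supported URL."""
    with fsspec.open(url, "r") as f:
        return json.load(f)


# In-memory stand-in so the sketch runs without credentials or network.
# Assumption: with the Hub's fsspec integration installed, a bucket URL
# could be passed here instead, with no change to the two helpers above.
url = "memory://run-001/metrics.json"
save_metrics(url, {"step": 100, "loss": 0.42})
restored = load_metrics(url)
print(restored)
```

Because the helpers take a URL rather than a local path, switching storage backends would be a configuration change rather than a code change, which is the sense in which the integration lowers the barrier.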