2025-03-18 · HuggingFace

Xet is on the Hub

enterprise · infrastructure

Source: HuggingFace · Date: 2025-03-18 · URL: https://huggingface.co/blog/xet-on-the-hub

Summary

Infrastructure update: Hugging Face is migrating repository storage from Git LFS to Xet, a content-addressed storage system that deduplicates at the chunk level (64 KB chunks). Real-world impact: a 1 MB update to a 5 GB file takes ~0.1 seconds with Xet vs. 13 minutes with Git LFS, which re-uploads the full file. The first migration moved 4.5 TB of data and shifted ~6% of Hub download traffic to Xet; a block format fix during rollout reduced GET latency by ~35%. Legacy clients remain compatible via an LFS bridge.
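
To make the chunk-level claim concrete, here is a minimal sketch of content-addressed deduplication. It is an illustration under stated assumptions, not Xet's implementation: it uses fixed 64 KB chunks and SHA-256 digests, a scaled-down 8 MB file instead of 5 GB, and hypothetical helper names (chunk_hashes, bytes_to_upload). It compares how many bytes a chunk-deduplicating store would need to receive after a 1 MB in-place overwrite against a whole-file re-upload.

```python
import hashlib
import os

CHUNK_SIZE = 64 * 1024  # 64 KB, the chunk size quoted in the summary


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return each chunk's SHA-256 digest."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def bytes_to_upload(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> int:
    """Count the bytes of `new` that fall in chunks the store does not already hold."""
    stored = set(chunk_hashes(old, chunk_size))  # digests already on the server
    seen = set()                                 # new digests queued in this upload
    total = 0
    for i in range(0, len(new), chunk_size):
        chunk = new[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in stored and digest not in seen:
            seen.add(digest)
            total += len(chunk)
    return total


# Hypothetical scaled-down scenario: an 8 MB file with 1 MB overwritten in place.
old = os.urandom(8 * 1024 * 1024)
patch = os.urandom(1024 * 1024)
offset = 3 * 1024 * 1024  # chunk-aligned offset, chosen for simplicity
new = old[:offset] + patch + old[offset + len(patch):]

dedup_upload = bytes_to_upload(old, new)  # only the 16 changed chunks
full_upload = len(new)                    # whole-file re-upload, as with Git LFS

print(f"chunk-dedup upload: {dedup_upload / 2**20:.1f} MiB")
print(f"full-file upload:   {full_upload / 2**20:.1f} MiB")
```

Fixed-size chunks only work here because the edit overwrites bytes in place. An insertion would shift every later chunk boundary, which is why systems of this kind typically use content-defined chunk boundaries instead; the sketch leaves that out to stay short.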

Implications

HF as open-source ML hub. Xet’s chunk-level deduplication is especially valuable for ML artifacts: model weights change incrementally between checkpoints, and large datasets are frequently versioned. If Xet is rolled out across the whole Hub, the storage cost for the open-weights ecosystem drops materially, and iteration cycles (checkpoint → upload → share) become faster.

Open-weights ecosystem health. Fast incremental upload and download remove a practical friction point for community experimentation: fine-tuned model variants, dataset updates, and intermediate checkpoints all become cheaper to publish. This is infrastructure that quietly increases the velocity of open-source ML without any change to model architecture.
