2025-02-12 · HuggingFace

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

Source: HuggingFace · Date: 2025-02-12 · URL: https://huggingface.co/blog/from-chunks-to-blocks

Summary

Infrastructure update: the HF Hub’s content-defined chunking (CDC) storage now aggregates 64KB chunks into 64MB blocks (about 1,000x aggregation), yielding a 2-3x upload/download speedup and a ~50% storage reduction. Example: the gemma-2-9b-it-GGUF upload drops from 509 to 258 minutes, and the 191GB repo shrinks to 97GB. “Key chunks” (a 0.1% sample of all chunks) handle global deduplication queries without per-chunk metadata overhead.
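The figures above can be sanity-checked with a few lines of arithmetic, and the key-chunk idea can be illustrated with a hash-based sampler. This is a minimal sketch, not the Hub's implementation: the function names are hypothetical, and using SHA-256 with a modulo-1000 test to pick the 0.1% sample is an assumption made for illustration only.

```python
import hashlib


def speedup(before_minutes: float, after_minutes: float) -> float:
    """Ratio of old to new transfer time."""
    return before_minutes / after_minutes


def storage_reduction(before_gb: float, after_gb: float) -> float:
    """Fraction of storage saved."""
    return 1 - after_gb / before_gb


def is_key_chunk(chunk: bytes, modulus: int = 1000) -> bool:
    """Assumed sampling rule: a chunk is a key chunk when its hash,
    read as an integer, is divisible by `modulus` (about 1 in 1000)."""
    digest = hashlib.sha256(chunk).digest()
    return int.from_bytes(digest[:8], "big") % modulus == 0


# Figures quoted in the summary for gemma-2-9b-it-GGUF.
print(round(speedup(509, 258), 2))            # 1.97
print(round(storage_reduction(191, 97), 3))   # 0.492
```

Because the sampling rule depends only on the chunk's own hash, any client can decide locally whether a chunk is a key chunk, so only that small subset needs to be looked up in a global index.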

Implications

Thread: HF as open-source ML hub. Hub transfer performance is infrastructure table stakes: large model uploads that run 500+ minutes are a real barrier to Hub-first workflows. The roughly 2x speedup on large GGUF repos is the most practically relevant improvement for the community. The block aggregation design is sound engineering: tracking every 64KB chunk individually doesn’t scale to Hub volumes, so aggregating chunks into 64MB blocks trades some theoretical deduplication for manageable storage metadata. The XetHub acquisition appears to be the origin of this infrastructure improvement, a concrete payoff from that deal now visible in production.
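To make the metadata trade-off concrete, the sketch below counts how many storage entries the 97GB deduplicated repo would need if every 64KB chunk were tracked separately versus packed into 64MB blocks. It assumes binary units for chunk and block sizes, decimal gigabytes for the repo size, and that every block is filled completely; the helper name is hypothetical, and real blocks may be smaller, so this is a best case.

```python
KIB = 1024
MIB = 1024 * KIB
GB = 10**9  # assumption: the quoted repo sizes are decimal gigabytes


def entries_needed(total_bytes: int, unit_bytes: int) -> int:
    """Number of storage entries when data is split into fixed-size units
    (ceiling division; the last unit may be partially filled)."""
    return -(-total_bytes // unit_bytes)


repo_bytes = 97 * GB
chunk_entries = entries_needed(repo_bytes, 64 * KIB)
block_entries = entries_needed(repo_bytes, 64 * MIB)

print(f"per-chunk entries: {chunk_entries:,}")
print(f"per-block entries: {block_entries:,}")
print(f"ratio: {chunk_entries / block_entries:.0f}x fewer entries with blocks")
```

Under these assumptions a single repo goes from over a million entries to under two thousand, which is the scale of saving that makes block-level storage manageable across the whole Hub.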
