Safetensors is Joining the PyTorch Foundation
Source: Hugging Face
Date: 2026-04-08
URL: https://huggingface.co/blog/safetensors-joins-pytorch-foundation
Summary
Safetensors — the model-weight serialization format that replaced pickle as the default on Hugging Face Hub — is transferring its trademark, repository, and governance to the Linux Foundation under the PyTorch Foundation umbrella. Hugging Face retains no special authority over the format going forward. No breaking changes ship with the transition; existing APIs and Hub integration remain unchanged. Planned additions include PyTorch core serialization integration, device-aware loading (direct to CUDA/ROCm), first-class Tensor Parallel/Pipeline Parallel loading APIs, and formalized support for FP8 and block-quantized formats (GPTQ, AWQ).
Implications
- Open-weight ecosystem thread. Moving safetensors governance to the Linux Foundation removes the last credible objection to adopting it as a universal interchange format. A vendor-neutral spec with roadmap authority shared across the community is what PyTorch needs to make safetensors the `.onnx` equivalent for weights: a stable target every inference runtime can implement against. The FP8 and sub-byte integer support on the roadmap is specifically relevant to quantized local model workflows.
- Local model capability thread. Device-aware loading (direct to CUDA, ROCm, and prospectively Metal) eliminates the current host-memory staging step that inflates RAM requirements during model load. For 3060 12GB / 64GB RAM configurations this matters: a 10GB Q4_K_M model currently lands in host RAM before moving to VRAM; device-aware loading could cut peak host allocation by the full model size.
- Watch: timeline for PyTorch core integration, whether Apple’s Metal backend is included in the device-aware loading roadmap, and whether the foundation governance produces a formal binary compatibility guarantee for the format header.
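For context on what a compatibility guarantee for the format header would cover, the sketch below writes and reads a tiny file in the publicly documented safetensors layout: an 8-byte little-endian unsigned header length, a UTF-8 JSON header mapping each tensor name to `dtype`, `shape`, and `data_offsets`, then the raw tensor bytes. It uses only the Python standard library. The helper names (`write_minimal_safetensors`, `read_header`, `read_tensor_bytes`, `demo`) are hypothetical and not part of the safetensors package, and the writer skips details the real library handles (such as header padding and validation), so treat the output as illustrative rather than as a file the official loader is guaranteed to accept.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_minimal_safetensors(path, tensors):
    """Write a toy file in the safetensors layout (hypothetical helper).

    `tensors` maps name -> (dtype string, shape list, raw little-endian bytes).
    """
    header = {}
    payload = b""
    for name, (dtype, shape, raw) in tensors.items():
        begin = len(payload)
        payload += raw
        # Offsets are relative to the start of the data section, not the file.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [begin, len(payload)],
        }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte LE header size
        f.write(header_bytes)                          # JSON header
        f.write(payload)                               # raw tensor bytes


def read_header(path):
    """Return (header_len, parsed JSON header) without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header_len, header


def read_tensor_bytes(path, name):
    """Seek directly to one tensor's bytes using the header offsets."""
    header_len, header = read_header(path)
    begin, end = header[name]["data_offsets"]
    with open(path, "rb") as f:
        f.seek(8 + header_len + begin)
        return f.read(end - begin)


def demo():
    """Round-trip a 2x2 float32 tensor through a temporary file."""
    raw = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo.safetensors"
        write_minimal_safetensors(path, {"w": ("F32", [2, 2], raw)})
        _, header = read_header(path)
        values = struct.unpack("<4f", read_tensor_bytes(path, "w"))
    return header, values
```

Because the header alone gives every tensor's byte range, a reader can seek to a single tensor without touching the rest of the file. That property is what makes per-device or per-shard loading feasible in principle, and it is also why a stability guarantee on the header would matter to third-party runtimes.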