Accelerate 1.0.0
Source: HuggingFace Date: 2024-09-13 URL: https://huggingface.co/blog/accelerate-v1
Summary
Library update: Accelerate 1.0.0 is the first stable major release of Hugging Face's distributed training library. 100M+ downloads, 99% PyTorch compatibility. New in 1.0: FP8 support (via the MS-AMP and TransformerEngine backends), experimental multi-model orchestration with DeepSpeed, torch.compile for big model inference, torch.distributed.pipelining, and torchdata's StatefulDataLoader. Supports 6 hardware backends: CPU, GPU, TPU, XPU, NPU, MLU. Underpins transformers, diffusers, PEFT, and TRL.
Implications
Thread: transformers library trajectory. Accelerate 1.0 reaching stability after 3.5 years matters because it is the common substrate: nearly every HF training library depends on it. FP8 support is the most important new addition, because FP8 training can reduce memory footprint on modern H100/H200 hardware. Whether it does so without a precision penalty depends on the backend and scaling recipe, and should be validated per model. The multi-model DeepSpeed orchestration is experimental, but it points toward multi-model training pipelines becoming first-class. Breaking changes are minimal and well-documented, and a clean 1.0 API signals that the library is ready for production dependency pinning.
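To make the FP8 memory argument concrete, here is a rough back-of-the-envelope sketch of raw storage per numeric format. It is arithmetic only and does not call Accelerate. The 7B parameter count and the helper name `storage_gib` are hypothetical, chosen for illustration. It counts only the bytes needed to hold one tensor of that many elements; actual savings depend on which tensors (weights, activations, gradients, optimizer state) the chosen backend keeps in FP8.

```python
# Bytes needed to store one element in each numeric format.
BYTES_PER_ELEMENT = {"fp32": 4, "bf16": 2, "fp8": 1}


def storage_gib(num_elements: int, dtype: str) -> float:
    """Raw storage in GiB for `num_elements` values of the given dtype."""
    return num_elements * BYTES_PER_ELEMENT[dtype] / 2**30


# Hypothetical model size, used only for illustration.
NUM_PARAMS = 7_000_000_000

for dtype in ("fp32", "bf16", "fp8"):
    print(f"{dtype}: {storage_gib(NUM_PARAMS, dtype):.1f} GiB")
# fp32: 26.1 GiB
# bf16: 13.0 GiB
# fp8: 6.5 GiB
```

Under these assumptions, each halving of element width halves raw storage, which is the headroom the FP8 argument relies on. Real training runs typically keep some state in higher precision, so the end-to-end reduction is smaller than this ratio suggests.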