Diffusers welcomes Stable Diffusion 3.5 Large
Source: HuggingFace Date: 2024-10-22 URL: https://huggingface.co/blog/sd3-5
Summary
Model release: Stable Diffusion 3.5 Large (8B parameters, MMDiT) arrives in Diffusers with two checkpoints: standard (40 steps, guidance_scale=4.5) and turbo (4–8 steps, no CFG). Architectural changes from SD3: QK normalization and dual attention layers in the MMDiT blocks. bitsandbytes NF4 quantization support enables consumer-GPU inference; LoRA training fits in 24GB of VRAM. The text encoders, VAE, and noise scheduler are identical to those of SD3 Medium.
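The two checkpoints and the NF4 path can be sketched in Diffusers roughly as follows. This is a minimal sketch, not a definitive recipe: the Hub repository IDs are assumptions taken from the linked blog post rather than from this summary, `SamplingPreset`, `PRESETS`, `load_nf4_pipeline`, and `generate` are hypothetical helper names, and the turbo preset uses 4 steps as one point in the 4–8 range. Heavy imports live inside the loader so the presets can be inspected without a GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingPreset:
    repo_id: str               # Hub repo ID (assumed, from the linked blog post)
    num_inference_steps: int
    guidance_scale: float      # 0.0 disables classifier-free guidance


PRESETS = {
    "large": SamplingPreset("stabilityai/stable-diffusion-3.5-large", 40, 4.5),
    "turbo": SamplingPreset("stabilityai/stable-diffusion-3.5-large-turbo", 4, 0.0),
}


def load_nf4_pipeline(variant="large"):
    """Load the pipeline with the transformer quantized to NF4 via bitsandbytes."""
    # Imported lazily: requires torch, diffusers, bitsandbytes, and a CUDA GPU.
    import torch
    from diffusers import (
        BitsAndBytesConfig,
        SD3Transformer2DModel,
        StableDiffusion3Pipeline,
    )

    preset = PRESETS[variant]
    nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    # Only the transformer is quantized; text encoders and VAE stay in bf16.
    transformer = SD3Transformer2DModel.from_pretrained(
        preset.repo_id,
        subfolder="transformer",
        quantization_config=nf4_config,
        torch_dtype=torch.bfloat16,
    )
    pipe = StableDiffusion3Pipeline.from_pretrained(
        preset.repo_id,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    # Moves each component to the GPU only while it is in use.
    pipe.enable_model_cpu_offload()
    return pipe


def generate(pipe, prompt, variant="large"):
    """Run one generation with the step count and guidance of the chosen variant."""
    preset = PRESETS[variant]
    result = pipe(
        prompt=prompt,
        num_inference_steps=preset.num_inference_steps,
        guidance_scale=preset.guidance_scale,
    )
    return result.images[0]
```

A typical call would be `generate(load_nf4_pipeline("turbo"), "a photo of a cat", "turbo")`; keeping the variant name on both calls ensures the sampling settings match the loaded checkpoint.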
Implications
Thread: open-weights ecosystem health / transformers library trajectory. SD3.5 Large’s dual attention layers and QK normalization are architectural improvements motivated by scale; both are established practices for stabilizing the training of large transformers. The NF4 quantization path is the practical democratization story: running inference with an 8B model on 24GB consumer cards is a significant shift relative to the 18.7GB baseline cited for SD3 Medium. The turbo variant (4–8 steps, no CFG) trades some quality for speed, which makes it practical to deploy in interactive applications. LoRA training on consumer hardware means the community can begin fine-tuning immediately at release.
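To illustrate why QK normalization helps with stability, here is a minimal pure-Python sketch of a single attention logit with and without it. It assumes RMS normalization of the query and key vectors and omits the learnable gain a real layer would carry; `rms_norm` and `attention_logit` are hypothetical names, not the Diffusers implementation.

```python
import math


def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is approximately 1."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale for x in vec]


def attention_logit(q, k, qk_norm=True):
    """Scaled dot-product logit for one query/key pair."""
    if qk_norm:
        # Normalizing q and k separately bounds the logit by sqrt(d),
        # regardless of how large the raw activations grow.
        q, k = rms_norm(q), rms_norm(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


# Large-magnitude activations, as might appear late in training.
q = [1000.0, -2000.0, 3000.0, 500.0]
k = [800.0, -1500.0, 2500.0, 100.0]

raw_logit = attention_logit(q, k, qk_norm=False)
normed_logit = attention_logit(q, k, qk_norm=True)
```

Without normalization the logit scales with the product of the activation magnitudes, which can saturate the softmax; with it, the logit stays within `sqrt(len(q))` in absolute value.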