2025-11-25 · HuggingFace

Diffusers welcomes FLUX-2

Source: HuggingFace · Date: 2025-11-25 · URL: https://huggingface.co/blog/flux-2

Summary

Model release and Diffusers integration tutorial for Black Forest Labs’ FLUX.2, an image generation model with a new architecture (not a drop-in replacement for FLUX.1). Architecture changes: a single text encoder (Mistral Small 3.1), 8 double-stream and 48 single-stream DiT blocks, a new autoencoder (AutoencoderKLFlux2), and bias parameters removed. Baseline inference requires more than 80 GB of VRAM; 4-bit quantization with bitsandbytes brings that down to about 20 GB. The model supports structured JSON prompting with hex color control and accepts up to 10 reference images.
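
The VRAM figures can be sanity-checked with a rough weights-only estimate. The sketch below is not from the blog post: the parameter counts (about 32B for the transformer, about 24B for the Mistral Small 3.1 text encoder) are assumptions that this summary does not state, and the estimate ignores activations, the autoencoder, and quantization overhead.

```python
# Back-of-envelope weight-memory estimate (weights only; ignores activations,
# the autoencoder, and framework/quantization overhead).
# ASSUMPTION: the parameter counts below are approximate and are not stated
# in this summary.
PARAMS_BILLIONS = {
    "transformer (DiT)": 32.0,
    "text encoder (Mistral Small 3.1)": 24.0,
}

# Bytes stored per parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "4-bit": 0.5}


def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight size in GB (10^9 bytes): billions of params x bytes per param."""
    return params_billions * bytes_per_param


def estimate(precision: str) -> dict:
    """Per-component and total weight size in GB for one precision."""
    sizes = {
        name: weight_gb(count, BYTES_PER_PARAM[precision])
        for name, count in PARAMS_BILLIONS.items()
    }
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(precision, estimate(precision))
```

Under these assumptions the bf16 weights alone exceed 80 GB, which fits the stated baseline. At 4 bits the two large components still sum to more than 20 GB, so the ~20 GB figure is plausible only if they are not resident on the GPU at the same time (for example with offloading); that is an inference, not something the summary states.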

Implications

Thread: transformers library trajectory / open-weights ecosystem health. FLUX.2’s architecture is a significant rethink of FLUX.1: the shift to a single text encoder (Mistral Small 3.1) and the rebalanced DiT (48 of 56 blocks are single-stream, about 86% by count; the 73% figure appears to refer to the share of DiT parameters in single-stream blocks) suggest Black Forest Labs is optimizing for quality over compatibility. The 80 GB baseline VRAM requirement is a real gate, but the quantization path to 8-20 GB makes local use feasible on high-end consumer GPUs. Structured JSON prompting is the notable feature: it moves image generation control toward structured, API-like semantics and away from freeform natural language.
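
To make the structured-prompting point concrete, here is a minimal sketch of building such a prompt in Python. The key names (`scene`, `subjects`, `color_palette`) and the helper `build_prompt` are hypothetical, chosen for illustration; this summary does not specify the schema FLUX.2 expects. The only details taken from the summary are hex color control and the limit of 10 reference images.

```python
import json
import re

# Six-digit hex colors such as "#FF5733".
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

# Limit taken from the summary: up to 10 reference images.
MAX_REFERENCE_IMAGES = 10


def build_prompt(scene, subjects, palette, reference_images=()):
    """Serialize a structured prompt to a JSON string.

    HYPOTHETICAL schema: the key names are illustrative, not FLUX.2's own.
    """
    bad = [color for color in palette if not HEX_COLOR.match(color)]
    if bad:
        raise ValueError(f"not six-digit hex colors: {bad}")
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"{len(reference_images)} reference images given, "
            f"at most {MAX_REFERENCE_IMAGES} supported"
        )
    prompt = {
        "scene": scene,
        "subjects": subjects,
        "color_palette": list(palette),
    }
    return json.dumps(prompt, indent=2)


if __name__ == "__main__":
    print(
        build_prompt(
            scene="studio product shot on a plain background",
            subjects=[{"description": "ceramic mug", "color": "#FF5733"}],
            palette=["#FF5733", "#1A1A2E"],
        )
    )
```

Validating colors and the image count before generation is the kind of check that freeform text prompts do not allow, which is the sense in which control moves toward API-like semantics.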
