2024-12-09 · HuggingFace

Open Preference Dataset for Text-to-Image Generation by the 🤗 Community

Source: HuggingFace · Date: 2024-12-09 · URL: https://huggingface.co/blog/image-preferences

Summary

Dataset release: open-image-preferences-v1, a dataset of 10K preference pairs annotated by 250+ community members (30K+ total responses) that compares Stable Diffusion 3.5-Large and FLUX.1-dev outputs across 11 style categories. It is Apache 2.0 licensed and ships with a binarized version and a FLUX.1-dev LoRA fine-tune trained on the preferences. Key finding: FLUX.1-dev leads on the 3D, Anime, and Manga styles; SD3.5-Large leads on Cinematic, Illustration, Fantasy art, Painting, and Pixel art. The LoRA fine-tune improved FLUX.1-dev in the art and cinematic categories where it was initially weaker.
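For readers unfamiliar with the term, "binarized" here most likely means that the multiple annotator responses per pair (30K+ responses over 10K pairs, so roughly three each) were collapsed into a single chosen/rejected label. The sketch below shows one plausible way to do that with a strict majority vote. It is an assumption, not the procedure the dataset authors used: the label names (`image_1`, `image_2`, `both_good`), the `binarize` helper, and the sample responses are all hypothetical.

```python
from collections import Counter
from typing import Optional

# Hypothetical vote labels; the real dataset's label names may differ.
MODEL_A = "image_1"
MODEL_B = "image_2"


def binarize(votes: list[str]) -> Optional[dict[str, str]]:
    """Collapse several annotator votes on one image pair into chosen/rejected.

    Returns None when neither image has a strict majority, so ties and
    pairs with only non-decisive votes are dropped.
    """
    # Count only votes that pick one image; ignore labels such as "both_good".
    counts = Counter(v for v in votes if v in (MODEL_A, MODEL_B))
    a, b = counts[MODEL_A], counts[MODEL_B]
    if a == b:
        return None
    chosen, rejected = (MODEL_A, MODEL_B) if a > b else (MODEL_B, MODEL_A)
    return {"chosen": chosen, "rejected": rejected}


# Made-up responses for illustration only.
responses = {
    "prompt-001": ["image_1", "image_1", "image_2"],
    "prompt-002": ["image_2", "both_good", "image_2"],
    "prompt-003": ["image_1", "image_2"],  # tie, will be dropped
}

binarized = {}
for pair_id, votes in responses.items():
    result = binarize(votes)
    if result is not None:
        binarized[pair_id] = result
```

Dropping ties rather than breaking them arbitrarily trades some data volume for cleaner labels, which usually matters more when the pairs feed a preference-tuning objective.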

Implications

Open-weights ecosystem health. A 10K-pair preference dataset for text-to-image generation is a meaningful contribution to the RLHF-for-diffusion pipeline: previously, preference data for image generation was largely proprietary (DALL-E feedback, Midjourney votes). The community annotation model (250+ annotators in 2 weeks) also shows that HF can mobilize enough annotator throughput to build preference datasets at this scale.

HF as open-source ML hub. The Apache 2.0 license and the published LoRA adapter (not just the dataset) make this immediately usable for teams that want to align their image generation models with human preferences. The community annotation infrastructure used here (Imgsys prompts, multi-model toxicity filtering, manual review) offers a template for future open image preference datasets.
