2024-07-16 · HuggingFace

SmolLM - blazingly fast and remarkably powerful



Source: HuggingFace · Date: 2024-07-16 · URL: https://huggingface.co/blog/smollm

Summary

Model release: the SmolLM family of small language models from Hugging Face, in 135M, 360M, and 1.7B parameter sizes, trained on the SmolLM-Corpus (Cosmopedia v2, 28B tokens; FineWeb-Edu, 220B tokens; Python-Edu, 4B tokens). Training: 600B tokens for the 135M and 360M models, 1T tokens for the 1.7B model. Architecture: the 135M and 360M models use grouped-query attention (GQA) and favor depth over width. Benchmarks: SmolLM-1.7B scores 24 pass@1 on HumanEval, and the release claims best-in-class results under 2B parameters (outperforming Phi1.5, MobileLM-1.5B, and Qwen2-1.5B). The models fit within the iPhone 15 DRAM budget; ONNX, GGUF, and WebGPU demos are available.
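The DRAM claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight memory only (no activations or KV cache) from the nominal parameter counts above; the 6 GB device budget and the precision table are assumptions for illustration, not figures from this summary.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are the nominal sizes quoted in the summary.
PARAMS = {
    "SmolLM-135M": 135e6,
    "SmolLM-360M": 360e6,
    "SmolLM-1.7B": 1.7e9,
}

# Assumed storage precisions (illustrative, not from the summary).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumption: a 6 GB phone-class DRAM budget, used only for comparison.
DRAM_BUDGET_GB = 6.0


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in decimal GB."""
    return n_params * bytes_per_param / 1e9


def footprint_table() -> dict:
    """Map each model to its estimated weight memory per precision."""
    return {
        name: {
            precision: round(weight_memory_gb(n, b), 3)
            for precision, b in BYTES_PER_PARAM.items()
        }
        for name, n in PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        for precision, gb in row.items():
            fits = "fits" if gb < DRAM_BUDGET_GB else "does not fit"
            print(f"{name:12s} {precision:5s} {gb:6.3f} GB  ({fits} in {DRAM_BUDGET_GB} GB)")
```

Under these assumptions even the 1.7B model needs about 3.4 GB for bf16 weights, which leaves headroom inside the assumed budget before runtime overhead is counted.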

Implications

Open-weights ecosystem health. The SmolLM-Corpus methodology (synthetic textbooks generated with Mixtral, web pages filtered for educational value by a trained classifier, and curated Python code) is a training-data contribution that matters as much as the models themselves. The corpus is released publicly, which makes the data pipeline reproducible for teams training their own small models.
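The classifier-filtering step can be pictured as score-then-threshold. The following is a minimal sketch, not the SmolLM team's pipeline: `keyword_score` is a hypothetical stand-in for a trained educational-quality classifier, and the 0 to 5 scale and threshold of 3 are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Document:
    text: str


def keyword_score(doc: Document) -> float:
    """Hypothetical stand-in for a trained classifier.

    Counts how many educational cue words appear and returns a score
    between 0 and 5. A real pipeline would call a learned model here.
    """
    cues = ("theorem", "definition", "example", "exercise", "explain")
    lowered = doc.text.lower()
    hits = sum(cue in lowered for cue in cues)
    return float(min(hits, 5))


def filter_educational(
    docs: Iterable[Document],
    scorer: Callable[[Document], float],
    threshold: float = 3.0,  # assumed cutoff, for illustration only
) -> List[Document]:
    """Keep only documents whose score meets the threshold."""
    return [doc for doc in docs if scorer(doc) >= threshold]


if __name__ == "__main__":
    corpus = [
        Document("Definition: a prime has two divisors. Example: 7. Exercise: test 91."),
        Document("Buy cheap watches now, limited offer!"),
    ]
    for doc in filter_educational(corpus, keyword_score):
        print(doc.text)
```

Passing the scorer as a parameter keeps the filter independent of any particular model, so a learned classifier can replace the keyword heuristic without changing the filtering logic.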

Model release cadence. SmolLM-1.7B beats Qwen2-1.5B and Phi1.5 on most benchmarks at a comparable parameter count, which supports the data-quality-over-scale thesis: better-filtered training data produces better small models than scale alone. The WebGPU demos at 135M and 360M show a credible path to in-browser LLM inference that requires no server infrastructure.
