2026-04-29 · HuggingFace

Granite 4.1 LLMs: How They’re Built

models · infrastructure

Source: HuggingFace · Date: 2026-04-29 · URL: https://huggingface.co/blog/ibm-granite/granite-4-1

Summary

IBM released Granite 4.1, a family of dense decoder-only models (3B, 8B, 30B) under Apache 2.0. The models were trained on ~15 trillion tokens across five curriculum phases, support a 512K-token context window, and post strong tool-calling benchmarks (BFCL V3: 68.3 at 8B). The 8B model matches or exceeds the prior 32B MoE variant on most tasks. The training pipeline uses multi-stage on-policy GRPO with explicit calibration and identity RL stages. The models do not produce extended reasoning traces, so latency and token usage stay predictable.
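
For readers unfamiliar with GRPO (Group Relative Policy Optimization), its core step scores each sampled completion against the other completions drawn for the same prompt, instead of against a learned value model. The sketch below shows only that generic group-relative normalization. It is not IBM's training code, and the function name, the reward values, and the choice of population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (one prompt).

    Hypothetical helper for illustration; real implementations differ in
    details such as sample vs. population standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std of the group (an assumption)
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat their group average receive positive advantages and are reinforced. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.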

Implications

The 8B-beats-32B-MoE result continues the trend of dense models closing the gap with mixture-of-experts through better data curricula rather than parameter scale — directly relevant to local deployment economics.
Strong tool-calling scores and 512K context make Granite 4.1 a credible open-weight base for agentic workflows requiring long-document processing and function dispatch without a cloud dependency.
Apache 2.0 licensing, FP8 quantization, and IBM enterprise support form a meaningful combination for teams that need deployable open models with a vendor backstop, and it sets Granite apart from Meta’s Llama licensing stance.
The explicit identity and knowledge-calibration RL stages are noteworthy: IBM is treating self-identification consistency as a first-class training objective, signaling maturation of the post-training pipeline beyond raw capability benchmarks.
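
To make the local-deployment point concrete, here is a rough back-of-the-envelope estimate of weight memory at BF16 (2 bytes per parameter) versus FP8 (1 byte per parameter). It assumes the nominal sizes in the model names (3B, 8B, 30B) are the parameter counts, which is only approximate, and it counts weights alone, ignoring KV cache, activations, and runtime overhead. The helper name is hypothetical.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

# Nominal sizes taken from the model names; real parameter counts will differ.
NOMINAL_PARAMS = {"3B": 3e9, "8B": 8e9, "30B": 30e9}


def weight_gb(params, dtype):
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for name, n_params in NOMINAL_PARAMS.items():
    bf16 = weight_gb(n_params, "bf16")
    fp8 = weight_gb(n_params, "fp8")
    print(f"{name}: ~{bf16:.0f} GB in BF16, ~{fp8:.0f} GB in FP8")
```

Under these assumptions the 8B model's weights need about 16 GB in BF16 and about 8 GB in FP8. Long contexts add KV-cache memory on top of that.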
