2026-05-14 · HuggingFace

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

infrastructure

Source: HuggingFace · Date: 2026-05-14 · URL: https://huggingface.co/blog/ibm-granite/granite-embedding-multilingual-r2

Summary

IBM releases two ModernBERT-based multilingual embedding models under Apache 2.0: a 97M-parameter variant and a 311M-parameter variant. The 97M model scores 60.3 on MTEB Multilingual Retrieval, the highest among open sub-100M multilingual embedders and competitive with models three times its size. Both ship with a 32,768-token context window (up from 512 in R1) and cover 200+ natural languages plus nine programming languages; the 311M variant adds Matryoshka embeddings for graceful dimension reduction. The models are CPU-deployable via ONNX or OpenVINO, with no GPU required, and are drop-in compatible with LangChain, LlamaIndex, vLLM, and Milvus.
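
Matryoshka-style embeddings are trained so that a leading slice of the vector remains usable on its own, which is what makes the dimension reduction "graceful". The snippet below is a minimal sketch of the consumer-side step only (keep the first k dimensions, then re-normalize), written in plain Python. The helper names and the 8-dimensional toy vectors are invented for illustration; they are not output from the Granite models, and this is not IBM's code.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (made-up values).
query = [0.9, 0.1, 0.3, 0.0, 0.05, 0.02, 0.01, 0.03]
doc = [0.8, 0.2, 0.4, 0.1, 0.01, 0.04, 0.02, 0.00]

full_score = cosine(query, doc)
short_score = cosine(
    truncate_and_normalize(query, 4),
    truncate_and_normalize(doc, 4),
)
print(f"full: {full_score:.4f}  truncated to 4 dims: {short_score:.4f}")
```

In a real pipeline the query and document vectors must be truncated to the same length, and the smaller size trades some retrieval quality for index storage and search speed. How much quality is lost at a given size is something to measure against the actual model.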

Implications

This feeds the open embedding infrastructure thread and the broader pattern of enterprise-grade retrieval components becoming freely available at production quality.

Size-efficiency as competitive axis. The 97M model being competitive with 300M-class models at roughly a third of the parameters is the operative number: retrieval pipelines that were GPU-bound can move to CPU inference, which considerably changes the economics of local-first RAG.
32K context closes the long-document gap. R1 and many earlier open multilingual embedders were capped at 512 tokens; the jump to 32K makes it practical to embed a whole contract or paper in one pass, without chunking workarounds.
Apache 2.0 removes the license friction. IBM curated the training data specifically to avoid MS-MARCO and other data under non-commercial licenses. The release is designed for enterprise adoption without legal review overhead, which matters for any production RAG deployment.
Watch: whether the 311M Matryoshka variant becomes a default recommendation in LlamaIndex/LangChain stacks the way bge-m3 has been. If integration guides start referencing it, adoption should accelerate quickly.
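
The long-document point can be put in rough numbers. The sketch below counts how many sliding windows are needed to cover a document at a 512-token limit versus a 32,768-token limit. The function name, the 20,000-token document length, and the 64-token overlap are illustrative assumptions, not figures from the source.

```python
import math


def windows_needed(n_tokens, max_len, overlap=0):
    """Count sliding windows of `max_len` tokens needed to cover `n_tokens`.

    Consecutive windows share `overlap` tokens, so each new window
    advances by `max_len - overlap` tokens.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    if n_tokens <= 0:
        return 0
    if n_tokens <= max_len:
        return 1
    stride = max_len - overlap
    return 1 + math.ceil((n_tokens - max_len) / stride)


# Hypothetical document length; real contracts and papers vary widely.
doc_tokens = 20_000

short_context = windows_needed(doc_tokens, max_len=512, overlap=64)
long_context = windows_needed(doc_tokens, max_len=32_768)
print(f"512-token limit: {short_context} chunks; 32K limit: {long_context} chunk(s)")
```

Each chunk is a separate embedding call and a separate index entry, so a single-pass embedding removes that bookkeeping. Whether one vector per long document retrieves as well as many chunk vectors depends on the workload and is worth testing.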
