2026-06-09 · HuggingFace

Introducing North Mini Code: Cohere’s First Model For Developers

agents


Source: HuggingFace Date: 2026-06-09 URL: https://huggingface.co/blog/CohereLabs/introducing-north-mini-code

Summary

North Mini Code is a 30B-parameter Mixture-of-Experts model with 3B active parameters, released under Apache 2.0 with bf16 and fp8 weights on Hugging Face. It is purpose-built for agentic software engineering and described as “the first model in Cohere’s new family of models.” It supports a 128K context window (trained long-to-longer from 64K) and scores 33.4 on Artificial Analysis’ Coding Index, 80.2% pass@10 on SWE-Bench Verified, and 55.1% pass@10 on Terminal-Bench v2 (SFT stage; RLVR adds +7.9% absolute pass@1 on Terminal-Bench v2 and +3.0% on SWE-Bench). It was trained across multiple agent scaffolds rather than a single one, so that it stays robust across frameworks. Cohere positions it above the similarly sized Qwen3.5, Gemma 4, and Devstral Small 2, and above the substantially larger Nemotron 3 Super, Mistral Small 4, and Devstral 2.
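For readers unfamiliar with the pass@k figures quoted above, here is a minimal sketch of the commonly used unbiased pass@k estimator. The summary does not say how Cohere computed its pass@10 numbers, so treat this as background on the metric rather than a description of their evaluation. The function name and the sample counts in the example are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled attempts of which c succeeded.

    Uses the standard unbiased form 1 - C(n - c, k) / C(n, k): the
    probability that a random size-k subset of the n attempts contains
    at least one success.
    """
    if k > n:
        raise ValueError("k cannot exceed the number of samples n")
    if n - c < k:
        # Every size-k subset must contain at least one success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical task: 20 attempts, 2 of them successful.
print(pass_at_k(n=20, c=2, k=1))
print(pass_at_k(n=20, c=2, k=10))
```

A benchmark-level score is then the mean of this per-task value over all tasks, which is why pass@10 can sit far above pass@1 for the same model.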

Implications

Feeds the open-weight coding-model and coding-agent competition threads.

  • Cohere is a new open-weight coding contributor. It joins Google (Gemma), Alibaba (Qwen), Zhipu (GLM), and NVIDIA (Nemotron), so the “open frontier is narrowing” read (post-Meta-Muse-Spark) gets another counterexample. Cohere has historically focused on enterprise RAG, so its entry into open coding models is a notable repositioning.
  • 3B active params = the sub-agent/worker tier economics. Like Mellum 2 (12B-A2.5B), the low active count means fast inference and tractable offload. The 30B total at fp8 (~30GB) exceeds the Mac model budgets, but Q4 (~15–16GB) fits, and the 3B active parameters keep it fast even when offloaded. A candidate local coding worker.
  • Training on multiple scaffolds for robustness mirrors the cross-host/ACP reality: a coding model now has to work across many agent harnesses, not one, and robustness-to-scaffold is becoming a model-training target.
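The memory figures in the second bullet (~30GB at fp8, ~15–16GB at Q4) can be reproduced with back-of-envelope arithmetic. The sketch below counts weights only and ignores KV cache, activations, and runtime overhead. The 4.25 bits-per-weight entry is an assumed stand-in for 4-bit weights plus quantization metadata, not a figure from the source, and the helper name is illustrative.

```python
# Back-of-envelope weight footprint for a 30B-total / 3B-active MoE model.
# Counts weights only; KV cache, activations, and runtime overhead are ignored.

TOTAL_PARAMS = 30e9   # total parameters, from the summary
ACTIVE_PARAMS = 3e9   # parameters active per token, from the summary


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Return weight storage in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# 4.25 bits is an assumed figure standing in for 4-bit weights plus
# quantization metadata; real Q4 formats vary.
FORMATS = {
    "bf16": 16,
    "fp8": 8,
    "Q4 (4.0 bits)": 4,
    "Q4 (4.25 bits, assumed)": 4.25,
}

for name, bits in FORMATS.items():
    total = weight_gb(TOTAL_PARAMS, bits)
    active = weight_gb(ACTIVE_PARAMS, bits)
    print(f"{name:<26} total {total:6.2f} GB   active per token {active:5.2f} GB")
```

The "active per token" column is a rough proxy for how much weight data is touched per generated token, which is one way to read the claim that a low active count keeps offloaded inference tractable.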
