Welcome to Inference Providers on the Hub π₯
read at source β huggingface.co
Welcome to Inference Providers on the Hub π₯
Source: HuggingFace Date: 2025-01-28 URL: https://huggingface.co/blog/inference-providers
Summary
Platform feature launch: HuggingFace launches the Inference Providers program, integrating fal.ai, Replicate, SambaNova, and Together AI as serverless inference backends accessible directly from HF model pages and the huggingface_hub SDK (v0.28.0+). Users can route requests through HF (no markup, billed to HF account) or use their own provider API key directly. PRO users get $2/month in credits. The HTTP router at router.huggingface.co/{provider} provides a unified endpoint.
Implications
HF as open-source ML hub. Inference Providers is HFβs answer to the fragmentation of open-weights inference: instead of requiring users to manage separate API keys and SDK integrations for Together AI, Replicate, and SambaNova, HFβs router provides a single SDK interface to all of them. This positions HF as the inference aggregation layer β the one integration teams need regardless of which inference provider wins on price or speed.
Open-weights ecosystem health. The βno markup, pass-through pricingβ model is a significant commitment: HF is forgoing revenue on inference to prioritize ecosystem integration over monetization. The $2/month PRO credit is a subsidy to encourage exploration. Watch whether this model is sustainable as the provider list grows or whether HF eventually adds a routing fee.