Hugging Face and FriendliAI partner to supercharge model deployment on the Hub
Source: HuggingFace Date: 2025-01-22 URL: https://huggingface.co/blog/friendliai-partnership
Summary
Partnership announcement: FriendliAI inference is integrated into the HF Hub’s “Deploy this model” button. Two options: dedicated endpoints on H100 GPUs, and serverless endpoints for popular open-source models. FriendliAI cites an Artificial Analysis ranking as the “fastest GPU-based generative AI inference” provider; the post itself gives no specific latency numbers.
Implications
Thread: HF as open-source ML hub. This continues HF’s pattern of making inference partners available directly from model cards: each partnership (FriendliAI, AWS, GCP, Fireworks, etc.) extends the “one-click deploy” options for users who don’t want to manage their own infrastructure. FriendliAI’s speed claim is worth checking independently against Artificial Analysis data, since the announcement quotes the ranking without figures. The serverless option is the strategically interesting one: it competes directly with HF’s own serverless inference tier on price/performance.
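One way to check a speed claim like this independently is to time a streamed response on the client side and reduce it to two figures: time to first token and output tokens per second. The sketch below shows only that arithmetic. The names `StreamTiming` and `summarize_stream` are hypothetical, the timestamps are made up for illustration, and no provider API is called; collecting real timestamps from an endpoint is left out because the post does not describe the API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamTiming:
    ttft_s: float        # time to first token, in seconds
    tokens_per_s: float  # output speed measured after the first token


def summarize_stream(request_sent_s: float,
                     token_times_s: Sequence[float]) -> StreamTiming:
    """Summarize one streamed response from client-side timestamps.

    request_sent_s: when the request was sent.
    token_times_s:  arrival time of each output token, in order.
    """
    if len(token_times_s) < 2:
        raise ValueError("need at least two token timestamps")

    ttft = token_times_s[0] - request_sent_s
    decode_window = token_times_s[-1] - token_times_s[0]
    if decode_window <= 0:
        raise ValueError("token timestamps must be increasing")

    # Tokens produced after the first one, divided by the time they took.
    rate = (len(token_times_s) - 1) / decode_window
    return StreamTiming(ttft_s=ttft, tokens_per_s=rate)


# Hypothetical timestamps (seconds), for illustration only; not measured data.
timing = summarize_stream(0.0, [0.25, 0.5, 0.75, 1.0, 1.25])
```

Running the same prompt set against each serverless option and comparing these two figures, alongside price per token, would give a like-for-like basis for the price/performance comparison. A single request is noisy, so medians over many runs are the safer thing to compare.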