2024-10-10 · HuggingFace

Introducing the AMD 5th Gen EPYC™ CPU

models · infrastructure · commentary

Source: HuggingFace Date: 2024-10-10 URL: https://huggingface.co/blog/huggingface-amd-turin

Summary

Integration announcement: HF validates AMD's 5th Gen EPYC CPUs (Turin, Zen 5, up to 192 cores / 384 threads) for AI workloads using zentorch, AMD's ZenDNN plugin for PyTorch. In a benchmark of Llama 3.1 8B Instruct at bfloat16, Turin delivers roughly 2x the decode throughput of 4th Gen EPYC Genoa (128 vs. 96 cores) across summarization, chatbot, translation, essay writing, and captioning use cases at batch sizes 16 and 32.
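
To make the "2x decode throughput" figure concrete, here is a minimal sketch of how such a ratio is typically computed. It assumes decode throughput means generated tokens per second summed over the batch; the post's exact metric definition is not reproduced here. The function names and all numbers below are illustrative placeholders, not measurements from the benchmark.

```python
def decode_throughput(batch_size: int, new_tokens_per_seq: int, decode_seconds: float) -> float:
    """Generated tokens per second across the whole batch (assumed definition)."""
    if decode_seconds <= 0:
        raise ValueError("decode_seconds must be positive")
    return batch_size * new_tokens_per_seq / decode_seconds


def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Ratio of candidate throughput to baseline throughput."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return candidate_tps / baseline_tps


# Placeholder timings only (NOT results from the post): same batch and
# output length, with the candidate finishing decode in half the time.
baseline = decode_throughput(batch_size=16, new_tokens_per_seq=128, decode_seconds=40.0)
candidate = decode_throughput(batch_size=16, new_tokens_per_seq=128, decode_seconds=20.0)
print(f"speedup: {speedup(candidate, baseline):.2f}x")
```

With batch size and output length held fixed, the speedup reduces to the ratio of decode times, which is why comparisons like this are reported per batch size.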

Implications

Thread: open-weights ecosystem health. A ~2x throughput gain from a single CPU generation is significant for CPU-only inference, which is often the only viable option in on-premise and edge environments without dedicated GPU infrastructure. The zentorch + torch.compile integration suggests HF is investing seriously in the CPU inference path rather than treating GPU as the only deployment target. HF's validation of AMD EPYC for production LLM serving also strengthens the non-NVIDIA compute narrative at the data center level.
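
As a rough illustration of the zentorch + torch.compile path, the sketch below selects the "zentorch" backend when the plugin is installed and otherwise falls back to PyTorch's default "inductor" backend. The helper names `pick_cpu_backend` and `compile_for_cpu` are hypothetical, and the fallback behavior is an assumption made for portability, not something the post describes.

```python
import importlib.util


def pick_cpu_backend() -> str:
    """Return "zentorch" if the plugin is installed, else "inductor"."""
    if importlib.util.find_spec("zentorch") is not None:
        return "zentorch"
    return "inductor"


def compile_for_cpu(model):
    """Wrap a PyTorch module with torch.compile using the selected backend."""
    import torch  # imported lazily so pick_cpu_backend() works without PyTorch

    backend = pick_cpu_backend()
    if backend == "zentorch":
        import zentorch  # noqa: F401  (importing the plugin registers the backend)
    return torch.compile(model, backend=backend)


# Hypothetical usage, assuming transformers is installed and the model is accessible:
#   model = AutoModelForCausalLM.from_pretrained(
#       "meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16)
#   model = compile_for_cpu(model)
```

Keeping backend selection separate from compilation means the same serving code can run on machines without the AMD plugin, at the cost of not exercising the ZenDNN kernels there.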