Arc Virtual Cell Challenge: A Primer
Source: HuggingFace Date: 2025-07-18 URL: https://huggingface.co/blog/virtual-cell-challenge
Summary
Research primer: an educational explainer for Arc Institute’s Virtual Cell Challenge, a competition to predict the effect of CRISPR gene silencing on cells using ~300k single-cell RNA sequencing profiles. The baseline architecture (STATE) combines embeddings from a protein language model (ESM2, 15B parameters) with a Llama-backbone transformer that predicts perturbed transcriptomes. The STATE embedding model is being added to HuggingFace transformers. The post publishes no benchmark numbers; it is a primer, not a results paper.
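To make the prediction task concrete, here is a minimal sketch of its input/output shape: given unperturbed (control) cells and a perturbation target, predict the expression profile after perturbation. This is not STATE. It is a deliberately simple mean-shift baseline, assumed here for illustration only, and the gene values, the target name GENE_A, and the helper names are all hypothetical.

```python
from statistics import fmean

# Toy data: each profile is a list of expression values, one per gene.
# All numbers and the perturbation name are made up for illustration.
control_cells = [
    [5.0, 2.0, 0.0],
    [7.0, 4.0, 2.0],
]
perturbed_cells = {
    "GENE_A": [[1.0, 3.0, 1.0], [3.0, 5.0, 3.0]],
}


def mean_profile(cells):
    """Average expression per gene across a list of cells (a pseudobulk)."""
    return [fmean(gene_values) for gene_values in zip(*cells)]


def fit_mean_shift(control, perturbed_by_target):
    """Learn one per-gene shift per perturbation: mean(perturbed) - mean(control)."""
    control_mean = mean_profile(control)
    return {
        target: [p - c for p, c in zip(mean_profile(cells), control_mean)]
        for target, cells in perturbed_by_target.items()
    }


def predict(control_cell, shift):
    """Apply a learned shift to an unperturbed cell, clipping at zero."""
    return [max(0.0, x + d) for x, d in zip(control_cell, shift)]


shifts = fit_mean_shift(control_cells, perturbed_cells)
predicted = predict([6.0, 3.0, 1.0], shifts["GENE_A"])
print(predicted)
```

A learned model such as the one the primer describes would replace the fixed shift with a function of the cell state and the perturbation; the sketch only fixes the shapes involved (cells by genes in, cells by genes out).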
Implications
HF as open-source ML hub. Arc Institute hosting the challenge infrastructure on HF and integrating the baseline model into transformers signals that scientific ML (biology, chemistry) is a growing share of the Hub’s use cases, not just NLP and vision. Hosting competition infrastructure on HF also normalizes the Hub as a venue for ML research challenges beyond NLP leaderboards.
Open-weights ecosystem health. A BERT-like autoencoder plus a Llama backbone applied to single-cell genomics illustrates how open-weights components are being repurposed for scientific domains: the ecosystem is now broad enough to supply foundation components for specialized biological AI without training from scratch.