NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets
Source: HuggingFace Date: 2025-03-18 URL: https://huggingface.co/blog/nvidia-physical-ai
Summary
Model and dataset releases at GTC 2025: (1) Cosmos Transfer (7B), a world foundation model that generates photorealistic video from multiple control inputs (segmentation, depth, LiDAR, bounding boxes) to produce synthetic robotics/AV training data; (2) Physical AI Dataset: 15 TB, 320k+ robotics trajectories, and up to 1,000 OpenUSD assets, hosted on HF; (3) GR00T N1 (2B), a humanoid robot foundation model with a dual-system architecture (a VLM built on NVIDIA-Eagle with SmolLM-1.7B for reasoning, plus a diffusion transformer for continuous control actions) that supports multiple humanoid platforms. The post reports no benchmark numbers.
Implications
Open-weights ecosystem health. NVIDIA releasing GR00T N1 as an open foundation model for humanoid robots on HF — trained on real data, synthetic data, and internet-scale video — is a significant signal that physical AI infrastructure is following the LLM pattern: foundational models open-sourced to drive ecosystem adoption, with commercial value captured at the infrastructure/deployment layer (NIM, DGX).
Model release cadence (hardware/robotics). The dual-system architecture in GR00T N1 (slow VLM reasoning plus fast diffusion-based action generation) mirrors the System 1/System 2 framing from cognitive science, with the VLM playing the deliberate System 2 role and the diffusion action head the reactive System 1 role, and it is emerging as the dominant architectural pattern for embodied AI. Cosmos Transfer’s multicontrol ControlNet approach to generating synthetic training data conditioned on specific sensor modalities targets the data scarcity that has held back robotics ML progress. Watch whether the Physical AI Dataset and Cosmos-generated synthetic data accelerate progress on manipulation benchmarks.
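To make the slow/fast split concrete, here is a minimal toy sketch of a dual-rate control loop. It is not GR00T N1's implementation: `slow_reasoner`, `fast_policy`, `Plan`, and `run_control_loop` are hypothetical names, the state is a single scalar, the "policy" is a simple proportional step rather than a diffusion transformer, and the replanning ratio is an arbitrary assumption (the source gives no rates).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Plan:
    """Hypothetical stand-in for the slow module's output (e.g. a latent goal)."""
    target: float


def slow_reasoner(observation: float, goal: float) -> Plan:
    # Stand-in for the VLM ("System 2"): runs rarely and decides what to aim for.
    # The observation is ignored here; a real model would condition on it.
    return Plan(target=goal)


def fast_policy(observation: float, plan: Plan, gain: float = 0.5) -> float:
    # Stand-in for the action head ("System 1"): runs every tick and emits
    # a continuous action that moves the state toward the current plan.
    return gain * (plan.target - observation)


def run_control_loop(
    steps: int, replan_every: int, start: float, goal: float
) -> Tuple[List[float], int]:
    """Run `steps` fast ticks, refreshing the plan every `replan_every` ticks."""
    state = start
    plan: Optional[Plan] = None
    replans = 0
    trajectory = [state]
    for t in range(steps):
        if t % replan_every == 0:  # slow path: infrequent replanning
            plan = slow_reasoner(state, goal)
            replans += 1
        assert plan is not None  # always set on the first tick (t == 0)
        state += fast_policy(state, plan)  # fast path: act on every tick
        trajectory.append(state)
    return trajectory, replans
```

Calling `run_control_loop(steps=12, replan_every=4, start=0.0, goal=1.0)` runs the fast policy on all 12 ticks while the slow module is invoked only 3 times. That is the point of the split: the expensive reasoning step is amortized across many cheap action steps.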