Robots Are Coming 🤖 How You Can Start Building a Humanoid Today - Insights from NVIDIA GTC 2025
Last week at NVIDIA GTC 2025, I attended an insightful session by an incredible lineup - Jim Fan, Yuke Zhu, Leela Karumbunathan, and Yan Chang from NVIDIA - on the state of humanoid robots and how you (yes, YOU!) can build one of your own.
Here’s what I learned:
The $1.3 Trillion Problem
Right now, the world is losing approximately $1.3 trillion every year because there simply aren’t enough humans to fill critical roles, especially in industries like leisure & hospitality, healthcare, and construction.
But what if humanoid robots could fill that gap?
Humanoids are Getting Affordable
Humanoid robots used to be insanely expensive - NASA’s Robonaut cost around $1.5 million in 2001! Today, Unitree's G1 is just $40,000, roughly 37 times cheaper, which puts humanoid robots within reach of far more labs, startups, and hobbyists.
Jensen and the Humanoid Lineup
Jensen Huang walked us through a stunning lineup of the latest humanoid robots. Yet, amidst these cutting-edge machines, Jensen humorously reminded us that the biological humanoid (Jensen himself!) remains the smartest - at least for now.
Meet GR00T: The Next Generation of Humanoids
NVIDIA introduced GR00T (spelled with two zeros), its platform for building humanoid robots. It is organized around three computers, one for each critical stage:
OVX to Generate tokens (data creation in Isaac Lab)
DGX to Learn tokens (model training)
AGX to Deploy tokens (real-world application)
Unlike standard LLMs (like ChatGPT), robot models can't just "download actions" from the internet. Text is abundant online, but motor commands are not, so the model needs physical grounding from real or simulated robots.
From Specialists to Specialized Generalists
We’re witnessing an evolution in robotics:
Specialist Robots: Single-purpose, limited tasks (factory arms).
Generalist Robot Models: Capable of multiple general tasks (humanoids).
Specialized Generalists: General-purpose robots fine-tuned for specialized tasks (e.g., humanoids tailored for elder care).
Why Humanoids?
There are four main reasons humanoids are becoming crucial:
Versatility: Human-shaped robots handle diverse tasks effortlessly.
Costs: Decreasing hardware costs democratize robotics.
Brownfield: Humanoids can easily integrate into existing human-centric environments.
Data: Humanoids leverage massive internet-scale, human-centered datasets.
GR00T N1: The World's First Open Humanoid Foundation Model
NVIDIA unveiled GR00T N1 - a groundbreaking open foundation model for humanoid robots:
Input: Visual and language instructions.
Output: Precise motor actions, enabling complex tasks like packing groceries or sorting items.
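To make that input/output contract concrete, here is a minimal sketch of the kind of interface a vision-language-action policy exposes. Everything in it is hypothetical: `Observation`, `DummyPolicy`, the 7 joints, and the 16-step action chunk are illustrative assumptions, not the actual GR00T N1 API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    image: List[List[float]]  # placeholder H x W grayscale frame, values in [0, 1]
    instruction: str          # natural-language command


class DummyPolicy:
    """Stand-in for a vision-language-action model (NOT the real GR00T N1 API)."""

    def __init__(self, num_joints: int = 7, horizon: int = 16):
        self.num_joints = num_joints  # assumed number of actuated joints
        self.horizon = horizon        # assumed number of future steps per call

    def act(self, obs: Observation) -> List[List[float]]:
        # A real model would encode the image and text, then decode motor commands.
        # This stub only returns a "hold still" chunk with the right shape.
        if not obs.instruction:
            raise ValueError("instruction must be non-empty")
        return [[0.0] * self.num_joints for _ in range(self.horizon)]


obs = Observation(image=[[0.0] * 4 for _ in range(4)], instruction="pick up the apple")
policy = DummyPolicy()
actions = policy.act(obs)  # horizon x num_joints list of joint targets
```

The point of the sketch is the shape of the problem: pixels and a sentence go in, a short sequence of joint targets comes out, and the robot executes that sequence before asking the model again.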
Cross-Embodiment and Easy Entry
GR00T N1 supports cross-embodiment (one brain powering different physical robots), so the same model can drive different robot bodies, including tasks where multiple robots work together.
Best part? You can get started today. GR00T N1 is already available on Hugging Face, and it runs on an affordable $110 LeRobot SO-100 robot arm!
Data Pyramid for Humanoids
The data used to train these robots is structured in layers:
Real-world Data: Expensive, limited, but very accurate (human demonstrations).
Synthetic Data: Infinite potential via simulation (NVIDIA Omniverse), but hard to scale practically.
Web Data: Unlimited internet data from YouTube, Reddit, Wikipedia - passive but immense in scope.
This layered approach allows robots to leverage vast amounts of data efficiently.
Training Robots: Sim-and-Real Co-Training
NVIDIA employs a "sim-and-real" co-training strategy to accelerate learning:
Real-world Data (1x): Actual human demonstrations.
Synthetic Data (100x): Simulated "digital cousins" significantly expand the dataset.
This hybrid method dramatically reduces training time and cost, enabling rapid iteration and scaling of humanoid capabilities.
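As a rough illustration of the 1x-to-100x mix, here is a minimal sketch that pairs each real demonstration with 100 simulated variants and samples training batches from the combined pool. The function names, the string placeholders for demonstrations, and the uniform sampling are my own assumptions for illustration, not NVIDIA's pipeline.

```python
import random


def build_cotraining_pool(real_demos, synthetic_per_real=100):
    """Pair each real demo with N simulated variants (placeholder strings here)."""
    pool = []
    for demo in real_demos:
        pool.append(("real", demo))
        for k in range(synthetic_per_real):
            # A real pipeline would generate these variants in a simulator.
            pool.append(("sim", f"{demo}-variant{k}"))
    return pool


def sample_batch(pool, batch_size, rng):
    """Draw a training batch uniformly from the mixed pool."""
    return [rng.choice(pool) for _ in range(batch_size)]


real_demos = [f"demo{i}" for i in range(10)]
pool = build_cotraining_pool(real_demos)
rng = random.Random(0)
batch = sample_batch(pool, 256, rng)

# Share of the pool that comes from real-world demonstrations.
real_fraction = sum(1 for source, _ in pool if source == "real") / len(pool)
```

With a 1:100 ratio only about 1% of the pool is real data, which is why the quality of the simulated variants matters so much. In practice you might also oversample the real demonstrations instead of sampling uniformly.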
Humanoid Whole-Body Neural Control
OmniH2O and HOVER are state-of-the-art neural controllers for full-body humanoid control. The training pipeline has four stages:
Motion Retargeting: Adapts human motion datasets for robot feasibility.
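Teacher Training: A teacher policy is trained with reinforcement learning in simulation, using privileged information the real robot cannot observe.
Student Distillation: The teacher is distilled into a student policy that relies only on inputs available on the real robot.
Deployment: The policy is first checked in a different simulator (Sim2Sim), then transferred to the physical robot (Sim2Real).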
This structured approach ensures versatile humanoid robot control.
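The teacher-student step can be sketched with a toy example. Here the teacher sees the true velocity (privileged), while the student only sees the current and previous positions and learns to imitate the teacher's actions. This is a deliberately simplified supervised-regression sketch: the gains `KP` and `KD`, the linear student, and the random states are assumptions for illustration, and the real controllers use neural networks trained on simulator rollouts.

```python
import random

KP, KD = 2.0, 0.5  # hypothetical teacher gains


def teacher_action(pos, vel):
    # The teacher uses privileged state: the true velocity from the simulator.
    return -KP * pos - KD * vel


def distill_student(steps=2000, lr=0.1, seed=0):
    """Fit a linear student on observable inputs to match the teacher's actions."""
    rng = random.Random(seed)
    w_pos, w_prev = 0.0, 0.0  # student weights on (pos, prev_pos)
    for _ in range(steps):
        prev_pos = rng.uniform(-1.0, 1.0)
        pos = rng.uniform(-1.0, 1.0)
        vel = pos - prev_pos  # true velocity with a unit time step
        target = teacher_action(pos, vel)
        pred = w_pos * pos + w_prev * prev_pos
        err = pred - target
        # One stochastic gradient step on 0.5 * err**2.
        w_pos -= lr * err * pos
        w_prev -= lr * err * prev_pos
    return w_pos, w_prev


w_pos, w_prev = distill_student()
```

Because velocity can be recovered from two consecutive positions, the student ends up with weights close to -(KP + KD) and KD. It reproduces the teacher's behavior without ever being given the privileged signal, which is the core idea behind distillation for deployment.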
NVIDIA Thor: The Ultimate Humanoid Robotics Platform
NVIDIA Jetson Thor emerges as the premier supercomputing platform for humanoid robotics, featuring:
2,000 FP4 / 1,000 FP8 TFLOPS from an NVIDIA Blackwell GPU.
14-core Poseidon-AE ARM CPU for real-time sensor processing.
128GB memory supporting large-scale transformer models.
High-speed sensor I/O (up to 4x 25GbE).
Set to be available by June 2025, Jetson Thor redefines robotic computing power.
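To get a feel for what 128GB of memory means for model size, here is a back-of-the-envelope calculation. It assumes 1 GB = 10^9 bytes and that weights alone fill the memory, ignoring activations, caches, and the operating system, so treat the results as loose upper bounds rather than a sizing guide.

```python
def max_params_billion(memory_gb: float, bits_per_param: int) -> float:
    """Upper bound on parameter count (in billions) if weights filled all memory."""
    bytes_per_param = bits_per_param / 8
    total_bytes = memory_gb * 1e9  # assumes 1 GB = 10**9 bytes
    return total_bytes / bytes_per_param / 1e9


fp8_limit = max_params_billion(128, 8)  # one byte per parameter
fp4_limit = max_params_billion(128, 4)  # half a byte per parameter
```

Under these assumptions the ceiling is about 128 billion parameters at FP8 and 256 billion at FP4. A model that has to run in real time on a robot will be far smaller than that in practice.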
Humanoid robots are no longer science fiction - they’re here, and they're becoming accessible. The cost of entry is lower than ever, the tech is open source, and the impact potential is enormous. Whether you're a developer, entrepreneur, or just a curious enthusiast, it's the perfect time to jump into building humanoid robots.
"The most exciting future isn’t something you predict, it’s something you build."
So why wait? Let’s start building.