Robotics and Edge AI
Skild AI
Skild AI, powered by NVIDIA’s accelerated computing infrastructure, has developed a novel technique to train an omni-bodied robot foundation model capable of adapting to new robot embodiments and performing new skills with zero or minimal post-training. The company uses NVIDIA Omniverse™ libraries and open frameworks such as NVIDIA Isaac™ Lab for advanced physics simulation and NVIDIA Cosmos™ for data augmentation and generation to train its foundation model.
Key Takeaways
For years, robotics has struggled with the same intractable problem: how to build robots capable of thousands of tasks across thousands of environments and a wide variety of morphologies. While artificial intelligence has achieved remarkable success in language and vision through the simple recipe of large datasets, big networks, and GPU training, reliable physical AI—systems that understand physics and spatial relationships and output the correct motor commands—poses new challenges.
Unlike domains with abundant internet data, robotics has suffered from critical data scarcity. Real-world data collection on physical robots is slow and expensive—operating a robot for minutes yields a single high-quality demonstration, yet AI systems need billions of training samples to be effective. Without enough training data to perform reliably, robots can’t be deployed at scale, and without deployment there is no stream of new operational data to train more complex skills. This limitation has kept robotics locked in a state of impressive demonstrations with limited real-world deployment success.
Skild AI built a true robotics foundation model called the Skild Brain. Unlike other robotics models that are overfit to specific types of robots, the Skild Brain is omni-bodied, meaning it can control any robot, even without knowing its exact body. Like a human brain, it has a high-level decision-maker that determines what the robot should do (like “pick up that cup”) and a low-level controller that handles the precise muscle movements needed to execute those commands.
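The division of labor described above can be sketched in miniature. This is a hypothetical illustration of a hierarchical "brain"—a high-level policy that decides *what* to do and a low-level controller that decides *how* to move—not Skild AI's actual architecture or API; all class and field names are made up.

```python
# Hypothetical sketch of a hierarchical robot "brain". A high-level policy
# picks a skill from an observation; a low-level controller turns that skill
# plus joint feedback into motor commands. Names are illustrative only.

from dataclasses import dataclass

@dataclass
class RobotState:
    joint_positions: list   # radians, one entry per joint
    object_in_view: str     # e.g. "cup", reported by perception

class HighLevelPolicy:
    """Decides *what* the robot should do, e.g. 'pick up that cup'."""
    def select_skill(self, state: RobotState) -> str:
        if state.object_in_view == "cup":
            return "pick_up"
        return "idle"

class LowLevelController:
    """Decides *how* to move: skill + joint feedback -> motor commands."""
    SKILL_TARGETS = {"pick_up": 0.8, "idle": 0.0}  # toy joint targets

    def motor_commands(self, skill: str, state: RobotState) -> list:
        target = self.SKILL_TARGETS[skill]
        # Simple proportional control toward the skill's joint target.
        return [0.5 * (target - q) for q in state.joint_positions]

state = RobotState(joint_positions=[0.0, 0.2], object_in_view="cup")
skill = HighLevelPolicy().select_skill(state)
commands = LowLevelController().motor_commands(skill, state)
```

The key design point is that only the low-level controller needs to know the body's joints; the high-level decision-maker reasons in terms of skills, which is what lets one brain span many embodiments.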
To overcome the data shortage, Skild AI leverages two alternative data sources: physics-based synthetic data generation and human videos from the internet. Unlike real-world teleoperated data collection, these sources are almost infinitely scalable. Simulations can be scaled by duplicating them across more GPUs, while there’s a huge, constantly growing dataset of videos available on the internet.
The company’s key breakthrough is models that adapt via in-context learning. By analyzing when actions don’t work as expected, the robots develop what resembles intuition, adjusting their behavior based on different environments. This enables robots to operate dynamically in complex environments, without requiring preprogrammed instructions for each scenario.
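A toy version of this adapt-from-feedback loop can be written in a few lines. This is an illustrative sketch under invented assumptions—a single scalar "effectiveness" estimate updated by an exponential moving average—not Skild AI's in-context learning algorithm.

```python
# Illustrative sketch (not Skild AI's actual method): a controller that
# adapts in context by comparing commanded motion to measured motion and
# rescaling future commands when actions don't work as expected.

class AdaptiveController:
    def __init__(self):
        self.effectiveness = 1.0  # estimated ratio: achieved / commanded

    def command(self, desired_velocity: float) -> float:
        # Compensate using the current estimate of actuator effectiveness.
        return desired_velocity / max(self.effectiveness, 0.1)

    def observe(self, commanded: float, achieved: float) -> None:
        # Update the estimate from experience (exponential moving average).
        if commanded != 0.0:
            self.effectiveness = (0.5 * self.effectiveness
                                  + 0.5 * achieved / commanded)

ctrl = AdaptiveController()
# Suppose a jammed wheel halves the robot's real response to commands.
for _ in range(10):
    cmd = ctrl.command(1.0)   # desired velocity: 1.0 m/s
    achieved = 0.5 * cmd      # the fault halves the effect
    ctrl.observe(cmd, achieved)
# After a few interactions the controller roughly doubles its commands,
# so the achieved velocity approaches the desired 1.0 m/s.
```

The "intuition" the article describes is this loop at scale: instead of one scalar, the model conditions on a history of observations and actions and infers how the current body and environment respond.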
Skild AI uses Isaac Lab to create the simulation scenarios needed to train robots with reinforcement learning across challenging conditions. The company leverages Cosmos Transfer to augment training datasets with environmental variations, expanding the scope and robustness of neural training data. This multi-pronged simulation approach enables Skild AI to acquire a millennium of experience within days, making large-scale robotic training feasible at unprecedented speed.
Skild AI created massive-scale simulations with thousands of robot instances across multiple embodiments—including humanoids, quadrupeds, and robotic arms—each with distinct morphologies and deployed across thousands of environments to maximize generalization. This synthetic data generation training powers an omni-bodied brain, preventing the AI model from memorizing solutions for specific hardware configurations and instead forcing it to develop in-context learning strategies that work universally across all robot types.
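The idea of randomizing embodiments and environments per simulated instance can be sketched as follows. This is a hedged illustration—the embodiment list, parameter names, and ranges are invented for the example, not taken from Isaac Lab or Skild AI's configuration.

```python
# Hedged sketch of embodiment and environment randomization: each simulated
# instance gets a random morphology and random physics parameters, so a
# policy trained across all instances cannot memorize one hardware
# configuration. Parameter ranges below are made up for illustration.

import random

EMBODIMENTS = ["humanoid", "quadruped", "arm"]

def sample_instance(rng: random.Random) -> dict:
    body = rng.choice(EMBODIMENTS)
    legs = {"humanoid": 2, "quadruped": 4, "arm": 0}[body]
    return {
        "embodiment": body,
        "num_legs": legs,
        "leg_length_m": rng.uniform(0.2, 0.9),  # varied leg-to-body ratios
        "mass_kg": rng.uniform(5.0, 120.0),
        "friction": rng.uniform(0.4, 1.2),      # varied ground surfaces
        "payload_kg": rng.uniform(0.0, 10.0),
    }

rng = random.Random(0)
batch = [sample_instance(rng) for _ in range(4096)]  # thousands of instances
```

Because every training batch mixes bodies and physics, the only strategy that works everywhere is the in-context adaptation the article describes—there is no single configuration to overfit to.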
Synthetic data generation through advanced simulation represents a core pillar of Skild AI’s technology stack. The company generates billions of training examples through physics-based simulation, enabling robots to experience failure scenarios safely and extensively.
This is essential because robots have countless ways to fail compared to the limited ways they can succeed, making it impossible to capture all failure scenarios through traditional data collection. With Cosmos Transfer, Skild AI is able to augment and multiply datasets via text prompts, generating varied environmental conditions, lighting scenarios, and visual features to maximize training robustness. Simulation allows robots to experience millions of failures in diverse environments safely before mastering the correct approach, building the robustness needed for real-world deployment.
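The augmentation principle—multiplying one frame into many visually varied training frames—can be shown in miniature. This generic brightness-and-tint randomizer is purely illustrative; it is not the Cosmos Transfer API, which operates from text prompts on full video.

```python
# The augmentation idea in miniature (generic, NOT the Cosmos Transfer API):
# one source frame becomes many visually varied training frames by
# randomizing lighting and color, so a policy can't latch onto one
# fixed appearance.

import random

def augment(frame, rng):
    """frame: nested lists of (r, g, b) pixel values in [0, 1]."""
    brightness = rng.uniform(0.5, 1.5)                # lighting change
    tint = [rng.uniform(0.9, 1.1) for _ in range(3)]  # color variation
    return [
        [tuple(min(1.0, c * brightness * t) for c, t in zip(px, tint))
         for px in row]
        for row in frame
    ]

rng = random.Random(42)
frame = [[(0.5, 0.5, 0.5)] * 4 for _ in range(4)]    # tiny gray test frame
variants = [augment(frame, rng) for _ in range(32)]  # 32 varied copies
```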
The model demonstrates remarkable adaptability to mechanical changes, recovering from jammed wheels within 2–3 seconds and from broken legs after several attempts rather than failing outright. This resilience extends to extreme scenarios, including walking on stilts with leg-to-body ratios that exceed anything seen in training—a form of zero-shot generalization.
The second key part is learning from human videos. To capture the diversity of the real world, Skild AI leverages the vast and constantly growing supply of online videos showing humans performing everyday tasks. By treating humans as biological robots, the company developed advanced techniques to extract affordances—helping the robot brain understand how objects should be manipulated by observing human interactions.
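In spirit, affordance extraction means distilling "where and how humans touch things" into a prior a robot can use. The sketch below is a hypothetical illustration with toy records standing in for real video analysis; the record format and averaging are assumptions for the example.

```python
# Hypothetical illustration of affordance extraction: from observed human
# hand-object contacts (toy records here, not real video), aggregate where
# on each object category people tend to grasp.

from collections import defaultdict

# (object_category, normalized contact point on the object)
observations = [
    ("mug", (0.9, 0.5)),    # grasped by the handle
    ("mug", (0.88, 0.52)),
    ("knife", (0.1, 0.5)),  # grasped by the hilt
    ("knife", (0.12, 0.48)),
]

def grasp_affordances(obs):
    points = defaultdict(list)
    for category, point in obs:
        points[category].append(point)
    # Average contact point per category: a crude "where to grasp" prior.
    return {
        cat: tuple(sum(axis) / len(axis) for axis in zip(*pts))
        for cat, pts in points.items()
    }

priors = grasp_affordances(observations)
```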
NVIDIA’s AI computing infrastructure powers the massive computational requirements for training robotics foundation models across multiple data modalities simultaneously. Together, NVIDIA’s accelerated computing and simulation libraries and frameworks create the foundational infrastructure that enables Skild AI to achieve breakthrough results with cost-effective hardware, developing robots that cost $4,000–$15,000 compared to traditional robotic systems that require $250,000+ investments.
Skild AI recently released results showcasing the capabilities of the omni-bodied brain in various scenarios.
End-to-End Locomotion From Vision
The Skild Brain enables end-to-end locomotion control driven entirely by online vision and proprioception. From raw camera images and joint feedback, the model directly outputs low-level motor commands, enabling humanoid robots to walk on flat ground and climb high obstacles. The robots maintain remarkable agility even while carrying payloads such as packages in their hands.
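The phrase "end-to-end" means one learned function from raw inputs to motor outputs, with no hand-coded planner in between. The sketch below makes that mapping concrete with a toy linear layer in place of a trained network; sizes and weights are random placeholders, not Skild AI's model.

```python
# Minimal sketch of the end-to-end mapping: raw pixels plus joint feedback
# go in, low-level motor commands come out. A single linear layer stands in
# for the learned network; weights are random placeholders, not trained.

import random

rng = random.Random(7)
NUM_PIXELS, NUM_JOINTS = 16, 6           # toy sizes

def policy(pixels, joint_positions, weights):
    # Fuse vision and proprioception into one feature vector.
    features = list(pixels) + list(joint_positions)
    # One weighted sum per motor output.
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

weights = [[rng.uniform(-0.1, 0.1) for _ in range(NUM_PIXELS + NUM_JOINTS)]
           for _ in range(NUM_JOINTS)]
pixels = [rng.random() for _ in range(NUM_PIXELS)]   # stand-in camera frame
joints = [0.0] * NUM_JOINTS                          # joint feedback
motor_commands = policy(pixels, joints, weights)
```

Training replaces the random weights with ones learned from simulation and video data; the structure—sensors in, motor commands out, nothing hand-coded in between—is what "end-to-end" refers to.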
Testing in Pittsburgh’s challenging urban environment, Skild AI’s humanoid robots demonstrated practical capabilities, achieving 60%–80% task performance within hours of data collection. The robots successfully performed complex manipulation tasks while remaining robust to human interference and environmental variations. They’ve been tested through city parks and streets, up fire escapes, and over obstacles in environments they had never seen before, all without prior planning or mapping.
Precise, Reliable Manipulation
Automating real-world tasks requires a high degree of precision and reliability. Skild AI showcased the brain’s ability to automate several useful tasks, such as cleaning up a home office desk and inserting AirPods into cases—a task currently carried out by humans over thousands of hours each day.
Extreme Adaptation
Skild showcased the ability of the brain to adapt to extreme scenarios, such as the loss of one or more limbs. In this case, the brain uses in-context learning and interacts with the environment to recover.
Skild AI is developing general-purpose intelligence that adapts across different robotic platforms. The company is focused on scaling to create a single action-centric brain for all robot embodiments, all tasks, and all scenarios—designed uniquely for physical AI applications.
Skild AI’s work demonstrates that the future of robotics lies not in collecting more robot data, but in intelligently leveraging the vast amounts of simulation and human behavioral data already available, processed through advanced AI systems capable of continuous real-world adaptation.
“Learning by experience, and not pre-programming, is the step change that has happened in robotics. NVIDIA Isaac Lab and Cosmos technologies allow us to create massive and scalable data sources necessary for robots to truly learn from experience across diverse scenarios and embodiments.”
Deepak Pathak
Skild AI Cofounder and CEO
Explore the NVIDIA Isaac open robotics platform, which brings together end-to-end robotics development, simulation, synthetic data generation, and robot learning frameworks that enable training at unprecedented scale.