ICRA 2026

SPARR: Simulation-Based Policies with Asymmetric Real-World Residuals for Assembly

Yijie Guo, Iretiayo Akinola, Lars Johannsmeier, Hugo Hadfield, Abhishek Gupta, Yashraj Narang

AI summary

SPARR achieves near-perfect success rates in real-world robotic assembly by combining a simulation-trained base policy with a real-world vision-based residual policy, requiring zero human supervision.

Robotic assembly, sim-to-real transfer, residual policy learning, reinforcement learning, autonomous robotics, contact-rich manipulation

Problem

Simulation-trained assembly policies degrade in real-world deployment due to the sim-to-real gap, while real-world reinforcement learning methods require heavy human supervision and lack generalization.

Approach

The method pre-trains a state-based policy in simulation, deploys it zero-shot to collect successful demonstrations, and then trains a vision-conditioned residual policy in the real world to correct for physical and visual discrepancies without human intervention.
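
The way a residual policy corrects a base policy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the additive composition rule, the `residual_scale` and `action_limit` parameters, and the choice to pass the base action to the residual policy are all assumptions made for illustration. It shows only the general pattern, in which a state-based base policy proposes an action and a vision-based residual policy adds a small bounded correction.

```python
from typing import Callable, List, Sequence


def compose_action(
    base_action: Sequence[float],
    residual_action: Sequence[float],
    residual_scale: float = 0.1,   # hypothetical: keeps corrections small
    action_limit: float = 1.0,     # hypothetical: symmetric action bound
) -> List[float]:
    """Add a scaled residual correction to the base action, then clip."""
    if len(base_action) != len(residual_action):
        raise ValueError("base and residual actions must have the same length")
    combined = [b + residual_scale * r for b, r in zip(base_action, residual_action)]
    return [max(-action_limit, min(action_limit, a)) for a in combined]


def hybrid_step(
    state: Sequence[float],
    image: object,
    base_policy: Callable[[Sequence[float]], Sequence[float]],
    residual_policy: Callable[[object, Sequence[float]], Sequence[float]],
) -> List[float]:
    """One control step: state-based base action plus vision-based residual."""
    base = base_policy(state)                 # trained in simulation (assumed interface)
    residual = residual_policy(image, base)   # trained in the real world (assumed interface)
    return compose_action(base, residual)


if __name__ == "__main__":
    # Toy stand-ins for the two learned policies.
    toy_base = lambda state: [0.5, -0.5, 0.95]
    toy_residual = lambda image, base: [1.0, -1.0, 1.0]
    print(hybrid_step([0.0, 0.0, 0.0], None, toy_base, toy_residual))
```

Keeping the residual scale small means the simulation-trained prior dominates the executed action, so the real-world policy only has to learn a local correction rather than the whole task.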

Key results

Achieves 95–100% success rates across 10 diverse real-world assembly tasks
Improves success rates by 38.4% and reduces cycle time by 29.7% over state-of-the-art zero-shot methods
Eliminates the need for human demonstrations or supervision required by existing real-world RL approaches
Demonstrates robustness to pose estimation errors and physical variations via vision-based residual correction

Why it matters

Enables scalable, autonomous deployment of precise robotic assembly policies in industrial settings without costly human expertise or extensive real-world data collection.

Abstract

Robotic assembly presents a long-standing challenge due to its requirement for precise, contact-rich manipulation. While simulation-based learning has enabled the development of robust assembly policies, their performance often degrades when deployed in real-world settings due to the sim-to-real gap. Conversely, real-world reinforcement learning (RL) methods avoid the sim-to-real gap, but rely heavily on human supervision and lack generalization ability to environmental changes. In this work, we propose a hybrid approach that combines a simulation-trained base policy with a real-world residual policy to efficiently adapt to real-world variations. The base policy, trained in simulation using low-level state observations and dense rewards, provides strong priors for initial behavior. The residual policy, learned in the real world using visual observations and sparse rewards, compensates for discrepancies in dynamics and sensor noise. Extensive real-world experiments demonstrate that our method, SPARR, achieves near-perfect success rates across diverse two-part assembly tasks. Compared to the state-of-the-art zero-shot sim-to-real methods, SPARR improves success rates by 38.4% while reducing cycle time by 29.7%. Moreover, SPARR requires no human expertise, in contrast to the state-of-the-art real-world RL approaches that depend heavily on human supervision. Please visit the project webpage at https://research.nvidia.com/labs/srl/projects/sparr/
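
The distinction the abstract draws between dense and sparse rewards can be made concrete with a short Python sketch. The reward definitions here are generic textbook forms, not taken from the paper: the distance-based dense reward, the binary success reward, the `tolerance` value, and the position arguments are all hypothetical.

```python
import math
from typing import Sequence


def dense_reward(plug_pos: Sequence[float], socket_pos: Sequence[float]) -> float:
    """Dense signal: negative Euclidean distance, informative at every step.

    Needs accurate object positions, which are readily available in simulation.
    """
    return -math.dist(plug_pos, socket_pos)


def sparse_reward(
    plug_pos: Sequence[float],
    socket_pos: Sequence[float],
    tolerance: float = 0.002,  # hypothetical success threshold in meters
) -> float:
    """Sparse signal: 1.0 only when the part is within tolerance, else 0.0."""
    return 1.0 if math.dist(plug_pos, socket_pos) <= tolerance else 0.0


if __name__ == "__main__":
    far = (0.03, 0.04, 0.0)
    near = (0.0, 0.0, 0.001)
    goal = (0.0, 0.0, 0.0)
    print(dense_reward(far, goal), sparse_reward(far, goal))
    print(dense_reward(near, goal), sparse_reward(near, goal))
```

A sparse success signal of this kind needs only a pass/fail check at the end of an attempt, which is one reason it suits real-world training where precise state is hard to measure.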

Index terms

Reinforcement Learning, Assembly