Refinery: Active Fine-Tuning and Deployment-Time Optimization for Contact-Rich Policies
Bingjie Tang, Iretiayo Akinola, Jie Xu, Bowen Wen, Dieter Fox, Gaurav Sukhatme, Fabio Ramos, Abhishek Gupta, Yashraj Narang
AI summary
Problem
Current simulation-based policies for contact-rich robotic assembly achieve only ~80% success rates due to high performance variance across initial conditions, making them unreliable for industrial use and brittle for multi-part chaining.
Approach
Refinery uses Bayesian Optimization to actively fine-tune policies on high-uncertainty initial states, then employs Gaussian Mixture Models to select high-success initializations during deployment.
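The idea of steering fine-tuning toward high-uncertainty initial states can be illustrated with a much simpler stand-in than Bayesian Optimization. The sketch below keeps a Beta-Bernoulli posterior over success probability for each discretized initial-state bin and picks the bin with the largest posterior variance. This is not the authors' method: the bin names, rollout counts, and both helper functions are hypothetical and serve only to show what an uncertainty-driven selection rule looks like.

```python
# Hypothetical sketch: choose which initial-state region to fine-tune on next.
# A Beta-Bernoulli stand-in for Bayesian Optimization, NOT the paper's implementation.

def beta_posterior_variance(successes: int, failures: int) -> float:
    """Variance of a Beta(1 + successes, 1 + failures) posterior over success probability."""
    a = 1.0 + successes
    b = 1.0 + failures
    return (a * b) / ((a + b) ** 2 * (a + b + 1.0))


def most_uncertain_bin(rollouts: dict[str, tuple[int, int]]) -> str:
    """Return the name of the bin whose success probability is least certain."""
    return max(rollouts, key=lambda name: beta_posterior_variance(*rollouts[name]))


# Made-up (successes, failures) counts per hypothetical initial-pose bin.
rollouts = {
    "small_offset": (19, 1),
    "medium_offset": (10, 10),
    "large_offset": (2, 2),
}
next_bin = most_uncertain_bin(rollouts)
```

With these made-up counts, the sparsely sampled bin is selected, since few rollouts leave its success probability poorly constrained.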
Key results
- 10.98% mean success rate improvement on 2-part assembly benchmarks (reaching 91.51%)
- 97% real-world success rate with zero-shot sim-to-real transfer
- First zero-shot sim-to-real demonstration for long-horizon multi-part assembly
- GMM deployment sampling significantly outperforms uniform initialization
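The contrast between GMM-based and uniform initialization can be sketched as follows. The example assumes a one-dimensional Gaussian mixture has already been fitted to initial states that led to successful rollouts; the mixture parameters, the offset range, and the function names are all made up for illustration and do not come from the paper.

```python
import random

# Hypothetical 1-D mixture over an initial offset (arbitrary units); parameters are made up.
# Each component is (weight, mean, standard deviation).
COMPONENTS = [(0.7, -2.0, 0.5), (0.3, 3.0, 1.0)]


def sample_gmm(rng: random.Random) -> float:
    """Draw one initialization: pick a component by weight, then sample its Gaussian."""
    weights = [w for w, _, _ in COMPONENTS]
    _, mean, std = rng.choices(COMPONENTS, weights=weights, k=1)[0]
    return rng.gauss(mean, std)


def sample_uniform(rng: random.Random, low: float = -5.0, high: float = 5.0) -> float:
    """Baseline: draw an initialization uniformly over the assumed offset range."""
    return rng.uniform(low, high)


rng = random.Random(0)
gmm_inits = [sample_gmm(rng) for _ in range(1000)]
uniform_inits = [sample_uniform(rng) for _ in range(1000)]
```

Samples from the mixture concentrate around the component means, whereas the uniform baseline spreads initializations evenly, including over regions where the policy may rarely succeed.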
Why it matters
Enables research-grade simulation policies to meet industrial reliability thresholds for complex, long-horizon robotic assembly tasks.
Abstract
Simulation-based learning has enabled policies for precise, contact-rich tasks (e.g., robotic assembly) to reach high success rates (∼80%) under high levels of observation noise and control error. Although such performance may be sufficient for research applications, it falls short of industry standards and makes policy chaining exceptionally brittle. A key limitation is the high variance in individual policy performance across diverse initial conditions. We introduce Refinery, an effective framework that bridges this performance gap, robustifying policy performance across initial conditions. We propose Bayesian Optimization-guided fine-tuning to improve individual policies, and Gaussian Mixture Model-based sampling during deployment to select initializations that maximize execution success. Using Refinery, we improve mean success rates by 10.98% over state-of-the-art methods in simulation-based learning for robotic assembly, reaching 91.51% in simulation and comparable performance in the real world. Furthermore, we demonstrate that these fine-tuned policies can be chained to accomplish long-horizon, multi-part assembly, successfully assembling up to 8 parts without requiring explicit multi-step training. See our project website for more details.
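A rough calculation shows why per-policy success rates near 80% make chaining brittle. The sketch below assumes that every step in a chain succeeds independently with the same probability, and that the chain has 8 steps. Both are simplifying assumptions made here for illustration, not claims from the paper, and `chain_success` is a hypothetical helper.

```python
def chain_success(per_step_success: float, num_steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps with equal success rates."""
    return per_step_success ** num_steps


# Per-step rates taken from the abstract; the 8-step chain length is an assumption.
baseline = chain_success(0.80, 8)    # roughly 0.17
refined = chain_success(0.9151, 8)   # roughly 0.49
```

Under these assumptions, an improvement of about 11 points per step roughly triples the chance of completing the whole chain.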