Learning to Design Soft Hands Using Reward Models
Xueqian Bai, Nicklas Hansen, Adabhav Singh, Michael T. Tolley, Yan Duan, Pieter Abbeel, Xiaolong Wang, Sha Yi
AI summary
Problem
Designing functional soft robotic hands is hindered by high-dimensional parameter spaces and the computational expense of simulating each design candidate. Traditional co-design methods struggle with scalability and generalization across varied tasks.
Approach
The authors introduce CEM-RM, which uses pre-collected teleoperation data to train a neural reward model that approximates simulation outcomes, allowing the Cross-Entropy Method to efficiently explore and refine hand geometry and tendon routing distributions.
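As a rough illustration of how a reward model can cut simulator calls inside a Cross-Entropy Method loop, here is a minimal sketch. It is not the authors' implementation: the pre-screening scheme, the toy quadratic `simulate` function, the `QuadraticRewardModel` class (a ridge regression standing in for the neural reward model), and every hyperparameter are assumptions made for illustration. The sketch also fits the surrogate on simulated outcomes as they accumulate, which may differ from how the paper trains its model.

```python
import numpy as np

TARGET = np.array([0.3, -0.2, 0.5])  # hypothetical optimum of the toy problem


def simulate(design):
    """Hypothetical stand-in for an expensive simulation rollout (higher is better)."""
    return -float(np.sum((design - TARGET) ** 2))


class QuadraticRewardModel:
    """Toy surrogate: ridge regression on [x, x^2, 1] features (assumption, not a neural net)."""

    def __init__(self, dim, reg=1e-3):
        self.w = np.zeros(2 * dim + 1)
        self.reg = reg

    def _features(self, X):
        X = np.atleast_2d(X)
        return np.hstack([X, X ** 2, np.ones((X.shape[0], 1))])

    def fit(self, X, y):
        F = self._features(X)
        A = F.T @ F + self.reg * np.eye(F.shape[1])
        self.w = np.linalg.solve(A, F.T @ np.asarray(y))

    def predict(self, X):
        return self._features(X) @ self.w


def cem_with_reward_model(dim=3, iters=15, pop=64, sim_budget=16, n_elite=8, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    model = QuadraticRewardModel(dim)
    X_hist, y_hist = [], []
    n_sims = 0
    for _ in range(iters):
        # Sample a population of candidate designs from the current distribution.
        candidates = rng.normal(mean, std, size=(pop, dim))
        if X_hist:
            # Pre-screen with the surrogate; only the top-ranked candidates get simulated.
            order = np.argsort(model.predict(candidates))[::-1]
            chosen = candidates[order[:sim_budget]]
        else:
            # No data yet, so the surrogate cannot rank: simulate an arbitrary subset.
            chosen = candidates[:sim_budget]
        rewards = np.array([simulate(c) for c in chosen])
        n_sims += len(chosen)
        # Refit the surrogate on every simulated outcome seen so far.
        X_hist.extend(chosen)
        y_hist.extend(rewards)
        model.fit(np.array(X_hist), np.array(y_hist))
        # Standard CEM update: refit the sampling distribution to the elite set.
        elite = chosen[np.argsort(rewards)[::-1][:n_elite]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3  # small floor keeps the distribution from collapsing
    return mean, n_sims


if __name__ == "__main__":
    best, used = cem_with_reward_model()
    print("final mean design:", best, "| simulator calls:", used)
```

In this sketch each iteration simulates 16 of the 64 sampled candidates, so the simulator is called a quarter as often as a loop that evaluates the whole population. That ratio is a property of the toy settings, not a figure from the paper.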
Key results
- Cuts the number of design evaluations by more than half compared to pure optimization
- Achieves higher grasping success rates on diverse in-domain and out-of-domain objects
- Accelerates optimization convergence while maintaining evaluation fidelity
- Successfully validates optimized designs via 3D printing and real-world teleoperation
Why it matters
Offers a practical, data-driven pipeline for rapid soft robot prototyping that reduces reliance on costly trial-and-error and expert modeling.
Abstract
Soft robotic hands promise to provide compliant and safe interaction with objects and environments. However, designing soft hands to be both compliant and functional across diverse use cases remains challenging. Although co-design of hardware and control better couples morphology to behavior, the resulting search space is high-dimensional, and even simulation-based evaluation is computationally expensive. In this paper, we propose a Cross-Entropy Method with Reward Model (CEM-RM) framework that efficiently optimizes tendon-driven soft robotic hands based on a teleoperation control policy, reducing design evaluations by more than half compared to pure optimization while learning a distribution of optimized hand designs from pre-collected teleoperation data. We derive a design space for a soft robotic hand composed of flexural soft fingers and implement parallelized training in simulation. The optimized hands are then 3D-printed and deployed in the real world using both teleoperation data and real-time teleoperation. Experiments in both simulation and hardware demonstrate that our optimized design significantly outperforms baseline hands in grasping success rates across a diverse set of challenging objects. More info: https://hakuna25.github.io/softhand/