ICRA 2026

Learning to Design Soft Hands Using Reward Models

Xueqian Bai, Nicklas Hansen, Adabhav Singh, Michael T. Tolley, Yan Duan, Pieter Abbeel, Xiaolong Wang, Sha Yi

AI summary

Combining a learned reward model with the Cross-Entropy Method cuts the number of design evaluations by more than half while producing soft robotic hand designs that grasp diverse objects more reliably than baseline hands.

Soft robotic hands, Design optimization, Reward models, Cross-entropy method, Teleoperation, Co-design

Problem

Designing functional soft robotic hands is hindered by high-dimensional parameter spaces and the computational expense of simulating each design candidate. Traditional co-design methods struggle with scalability and generalization across varied tasks.

Approach

The authors introduce CEM-RM, which uses pre-collected teleoperation data to train a neural reward model that approximates simulation outcomes, allowing the Cross-Entropy Method to efficiently explore and refine hand geometry and tendon routing distributions.
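The interplay between a reward model and the Cross-Entropy Method can be sketched on a toy problem. The code below is an illustration under stated assumptions, not the authors' implementation: `simulate` is a hypothetical stand-in for an expensive grasp simulation, the design vector has three made-up parameters, a ridge regression on quadratic features takes the place of the neural reward model, and the choice to simulate only the top half of each population (ranked by the model) is one plausible way to save evaluations.

```python
import numpy as np

# Hypothetical optimum of the toy design problem (not from the paper).
TARGET = np.array([0.6, -0.3, 0.2])

def simulate(design):
    """Hypothetical stand-in for an expensive grasp simulation; returns a score."""
    return -float(np.sum((design - TARGET) ** 2))

def features(X):
    """Quadratic features: bias, linear terms, squared terms."""
    X = np.atleast_2d(X)
    return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])

def fit_reward_model(X, y, ridge=1e-3):
    """Ridge regression standing in for the neural reward model (assumption)."""
    F = features(X)
    A = F.T @ F + ridge * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ y)

def predict(w, X):
    """Predicted score for each design in X."""
    return features(X) @ w

def cem_rm(dim=3, iters=15, pop=40, sim_frac=0.5, n_elite=5, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    # Pre-collected data: random designs with known scores (assumed to exist
    # before optimization starts, so they are not counted as new evaluations).
    X = rng.normal(size=(30, dim))
    y = np.array([simulate(x) for x in X])
    sim_calls = 0
    for _ in range(iters):
        w = fit_reward_model(X, y)
        cand = mean + std * rng.normal(size=(pop, dim))
        # Rank every candidate with the cheap model, simulate only the top share.
        keep = np.argsort(predict(w, cand))[::-1][: int(pop * sim_frac)]
        scores = np.array([simulate(c) for c in cand[keep]])
        sim_calls += len(keep)
        # Grow the dataset so the reward model improves near promising designs.
        X = np.vstack([X, cand[keep]])
        y = np.concatenate([y, scores])
        # Standard CEM update: refit the Gaussian to the best simulated designs.
        elite = cand[keep][np.argsort(scores)[::-1][:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean, sim_calls

best, calls = cem_rm()
```

With these settings the loop runs 300 simulations, where a plain CEM that simulates every candidate would run 600 for the same population size and iteration count. The real design space, reward model architecture, and filtering rule would come from the paper itself.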

Key results

Reduces the number of design evaluations by more than half compared to pure optimization
Achieves higher grasping success rates on diverse in-domain and out-of-domain objects
Accelerates optimization convergence while maintaining evaluation fidelity
Successfully validates optimized designs via 3D printing and real-world teleoperation

Why it matters

Offers a practical, data-driven pipeline for rapid soft robot prototyping that reduces reliance on costly trial-and-error and expert modeling.

Abstract

Soft robotic hands promise to provide compliant and safe interaction with objects and environments. However, designing soft hands to be both compliant and functional across diverse use cases remains challenging. Although co-design of hardware and control better couples morphology to behavior, the resulting search space is high-dimensional, and even simulation-based evaluation is computationally expensive. In this paper, we propose a Cross-Entropy Method with Reward Model (CEM-RM) framework that efficiently optimizes tendon-driven soft robotic hands based on a teleoperation control policy, reducing design evaluations by more than half compared to pure optimization while learning a distribution of optimized hand designs from pre-collected teleoperation data. We derive a design space for a soft robotic hand composed of flexural soft fingers and implement parallelized training in simulation. The optimized hands are then 3D-printed and deployed in the real world using both teleoperation data and real-time teleoperation. Experiments in both simulation and hardware demonstrate that our optimized design significantly outperforms baseline hands in grasping success rates across a diverse set of challenging objects. More info: https://hakuna25.github.io/softhand/

Index terms

Modeling, Control, and Learning for Soft Robots; Deep Learning in Grasping and Manipulation; Telerobotics and Teleoperation