ICRA 2026

SoftMimic: Learning Compliant Whole-body Control from Examples

Gabriel Margolis, Michelle Wang, Nolan Fey, Pulkit Agrawal

AI summary

SoftMimic enables humanoids to safely absorb unexpected forces and generalize from a single motion clip by learning compliant whole-body responses instead of rigidly tracking reference poses.
Compliant control, Humanoid robotics, Reinforcement learning, Motion imitation, Whole-body control, Sim-to-real transfer

Problem

Current motion-tracking policies for humanoids are overly stiff, causing brittle and unsafe behavior when encountering unexpected contacts or disturbances in real-world environments.

Approach

The framework uses an inverse kinematics solver to generate a dataset of feasible, compliant motion examples across various forces and stiffness levels, then trains a reinforcement learning policy to imitate these compliant responses while tracking the original reference.
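The idea of pairing a reference motion with compliant targets over a range of forces and stiffness levels can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: a single end-effector point stands in for the whole body, a linear spring law (displacement = force / stiffness) replaces the inverse kinematics solver, and the names compliant_target, tracking_reward, and build_dataset, the Gaussian reward shape, and all numeric values are hypothetical.

```python
import math
from itertools import product


def compliant_target(x_ref, force, stiffness):
    """Offset the reference position by force / stiffness (assumed linear spring law)."""
    return [x + f / stiffness for x, f in zip(x_ref, force)]


def tracking_reward(x, x_target, sigma=0.05):
    """Gaussian-shaped reward in (0, 1]; equals 1.0 when x coincides with x_target."""
    sq_err = sum((a - b) ** 2 for a, b in zip(x, x_target))
    return math.exp(-sq_err / sigma ** 2)


def build_dataset(x_ref, forces, stiffnesses):
    """Pair every force with every stiffness and store the resulting compliant target."""
    return [
        {"force": f, "stiffness": k, "target": compliant_target(x_ref, f, k)}
        for f, k in product(forces, stiffnesses)
    ]


# Hypothetical values: position in meters, force in newtons, stiffness in N/m.
x_ref = [0.40, 0.00, 1.00]
forces = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, -20.0, 0.0]]
stiffnesses = [100.0, 400.0]

dataset = build_dataset(x_ref, forces, stiffnesses)

# Entry 2 is the 10 N push along x at 100 N/m, so the target shifts 0.1 m along x.
sample = dataset[2]

# A pose that stays rigidly on the reference versus one that yields to the push.
stiff_reward = tracking_reward(x_ref, sample["target"])
compliant_reward = tracking_reward(sample["target"], sample["target"])

print(f"target={sample['target']}")
print(f"stiff={stiff_reward:.3f} compliant={compliant_reward:.3f}")
```

Under this reward, a pose that holds the reference while being pushed scores lower than one that yields by the amount the commanded stiffness implies, which is the incentive the approach describes. In the framework itself, the inverse kinematics solver is what makes the compliant examples feasible for the robot; the spring law above does not capture that.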

Key results

  • Significantly reduces peak collision forces compared to stiff baselines
  • Generalizes a single reference motion to handle varying object sizes and misalignments
  • Enables real-time modulation of interaction stiffness at deployment
  • Demonstrates safe, compliant physical interaction on a real Unitree G1 humanoid

Why it matters

It enables safe, adaptable physical interaction for humanoids, accelerating their deployment in unstructured, human-populated environments.

Abstract

We introduce SoftMimic, a framework for learning compliant whole-body control policies for humanoid robots from example motions. Imitating human motions with reinforcement learning allows humanoids to quickly learn new skills, but existing methods incentivize stiff control that aggressively corrects deviations from a reference motion, leading to brittle and unsafe behavior when the robot encounters unexpected contacts. In contrast, SoftMimic enables robots to respond compliantly to external forces while maintaining balance and posture. Our approach leverages an inverse kinematics solver to generate an augmented dataset of feasible compliant motions, which we use to train a reinforcement learning policy. By rewarding the policy for matching compliant responses rather than rigidly tracking the reference motion, SoftMimic learns to absorb disturbances and generalize to varied tasks from a single motion clip. We validate our method through simulations and real-world experiments, demonstrating safe and effective interaction with the environment.

Index terms

Human and Humanoid Motion Analysis and Synthesis; Whole-Body Motion Planning and Control; Reinforcement Learning
