A Lightweight Physics-Informed Neural Network for Sim-To-Real of Biped Robot
Yan Liu, XiZhe Zang, Xuehe Zhang, Chao Song, Boyang Chen, Jie Zhao
AI summary
Problem
Discrepancies between simulated and physical robot dynamics create a sim-to-real gap that hinders policy transfer, while existing solutions require extensive real-world data or complex system identification, or suffer from high computational costs.
Approach
The authors train a compact physics-informed neural network with an LSTM on simulation trajectories to predict next-step joint states, which are then fed into a feedforward-plus-feedback control loop to correct the physical robot's tracking errors in real time.
Key results
- Predicts next-step joint states at 1 kHz on embedded hardware
- Outperforms direct policy transfer in tracking accuracy and behavior reproduction
- Requires only simulation data, eliminating motion capture or real-world measurements
- Delivers superior real-time performance compared to existing learning-based dynamics models
Why it matters
Provides a low-cost, computationally efficient pathway for deploying robust biped locomotion policies on physical hardware, accelerating humanoid robot development.
Abstract
In this paper, we present a low-cost, easy-to-implement sim-to-real framework for biped locomotion that narrows the reality gap using only simulation data, without motion-capture or additional real-world measurements. First, a walking policy for the BRUCE robot is trained in Isaac Gym via reinforcement learning. Next, we develop a compact, physics-informed neural network (PINN) grounded in Euler-Lagrange structure and augmented with an LSTM to predict simulator forward dynamics. Trained solely on simulation trajectories, the PINN forecasts next-step joint angles and velocities of the simulated robot given the physical robot’s current state and control inputs. During hardware deployment, and consistent with a whole-body control architecture, these predicted states serve as reference joint states while the policy outputs provide feedforward torque commands; a feedforward-plus-feedback torque controller then computes the executed joint torques, thereby reducing the sim-to-real gap. Experiments on BRUCE demonstrate that our method better reproduces simulated behavior and attains higher tracking accuracy than direct policy transfer. Furthermore, the dynamics predictor runs at 1 kHz on embedded hardware, showing superior real-time performance relative to existing learning-based models.
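The deployment step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the control law, so the PD form of the feedback term, the gain values, the torque limit, and every name below (`feedforward_feedback_torque`, `kp`, `kd`, `tau_limit`) are assumptions made for illustration, not the authors' implementation. The sketch only shows how predicted reference states and a feedforward torque from the policy could be combined into one executed torque command.

```python
import numpy as np


def feedforward_feedback_torque(tau_ff, q_ref, dq_ref, q, dq, kp, kd, tau_limit=None):
    """Combine a feedforward torque with a PD feedback correction (assumed form).

    tau_ff  : feedforward joint torques, e.g. from the walking policy
    q_ref   : reference joint angles, e.g. the predictor's next-step angles
    dq_ref  : reference joint velocities, e.g. the predictor's next-step velocities
    q, dq   : measured joint angles and velocities of the physical robot
    kp, kd  : hypothetical per-joint proportional and derivative gains
    """
    # Feedback term pulls the measured state toward the reference state.
    tau_fb = kp * (q_ref - q) + kd * (dq_ref - dq)
    tau = tau_ff + tau_fb
    # Optional symmetric saturation (hypothetical actuator limit).
    if tau_limit is not None:
        tau = np.clip(tau, -tau_limit, tau_limit)
    return tau


# Two-joint toy example with made-up numbers.
tau_ff = np.array([1.0, -0.5])
q_ref = np.array([0.10, 0.20])
dq_ref = np.array([0.0, 0.0])
q = np.array([0.05, 0.25])
dq = np.array([0.1, -0.1])
kp = np.array([20.0, 20.0])
kd = np.array([1.0, 1.0])

tau = feedforward_feedback_torque(tau_ff, q_ref, dq_ref, q, dq, kp, kd)
tau_sat = feedforward_feedback_torque(tau_ff, q_ref, dq_ref, q, dq, kp, kd, tau_limit=1.5)
print(tau, tau_sat)
```

In a 1 kHz loop, a function of this kind would be called once per control tick, with `q_ref` and `dq_ref` refreshed from the dynamics predictor and `tau_ff` refreshed from the policy.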