Multi-Modal Locomotion Mode Recognition in the Real World for Robotic Hip Complex Exoskeletons
Hyesoo Shin, Sangdo Kim, Sunwoo Kim, Jongwon Lee, Jinkyu Kim, KangGeon Kim
AI summary
Problem
Single-sensor locomotion mode recognition systems struggle to generalize across users and degrade when assistive torque alters joint kinematics, hindering safe and effective real-world exoskeleton control.
Approach
The authors propose a lightweight multi-modal recognition system that combines proprioceptive (mechanical/IMU) and visual (RGB camera) data using intermediate and late fusion strategies, evaluated on a novel outdoor dataset.
Key results
- 11.7% average accuracy improvement over single-modal baselines
- Visual data enhances cross-user gait generalization
- Mechanical data ensures high intra-class recognition consistency
- Novel synchronized mechanical-visual outdoor dataset released
Why it matters
Enables more robust and adaptive real-time control for lower-limb exoskeletons, advancing wearable robotics for assistive healthcare applications.
Abstract
Lower limb exoskeletons assist users by supporting joint movements. Since joint motion patterns vary depending on how the user moves, accurately recognizing the type of movement (locomotion mode) is crucial for controlling the exoskeleton and ensuring user safety. Inspired by how humans use multiple types of sensory information to control movement, we developed a multi-modal locomotion mode recognition (LMR) system that uses both mechanical and visual sensor data to identify locomotion modes. Our approach utilizes two fusion methods: intermediate fusion, which combines the data in the form of features, and late fusion, which integrates the sensor data by averaging the recognition results from each sensor. By fusing these two different modalities, the prediction accuracy improved by an average of 11.7% on the test data. Through comparisons with uni-modal LMR systems that rely on a single type of sensor data for locomotion mode recognition, we found that the improved performance of the multi-modal LMR system is due to the visual information’s ability to generalize different gait patterns across users and the mechanical sensor data’s consistency within the same classes.
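The difference between the two fusion methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architectures, so the linear classifier, the weights, the feature sizes, and the mode names below are all assumptions chosen only to show where the modalities are combined.

```python
import math

# Hypothetical locomotion modes; the abstract does not list the actual classes.
MODES = ["level_walk", "stair_ascent", "stair_descent"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """One score per class: dot product of each weight row with the features."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def late_fusion(mech_probs, vis_probs):
    """Late fusion: average the per-class outputs of the two uni-modal models."""
    return [(m + v) / 2 for m, v in zip(mech_probs, vis_probs)]


def intermediate_fusion(mech_feat, vis_feat, weights, bias):
    """Intermediate fusion: concatenate features, then classify them jointly.

    The single linear layer stands in for whatever classifier head is used.
    """
    fused_features = list(mech_feat) + list(vis_feat)
    return softmax(linear(fused_features, weights, bias))


def predict(probs):
    """Return the mode name with the highest probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return MODES[best]


# Late fusion: each modality has already produced class probabilities.
mech_probs = [0.5, 0.3, 0.2]   # illustrative mechanical-sensor output
vis_probs = [0.2, 0.7, 0.1]    # illustrative camera output
late_probs = late_fusion(mech_probs, vis_probs)
late_mode = predict(late_probs)

# Intermediate fusion: each modality contributes a feature vector instead.
mech_feat = [1.0, 0.0]         # illustrative 2-D mechanical features
vis_feat = [0.0, 1.0]          # illustrative 2-D visual features
weights = [                    # 3 classes x 4 fused features, made up
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0, 0.0],
]
bias = [0.0, 0.0, 0.0]
inter_probs = intermediate_fusion(mech_feat, vis_feat, weights, bias)
inter_mode = predict(inter_probs)
```

In late fusion the two models never see each other's data and only their final outputs are merged, whereas in intermediate fusion a single classifier operates on the joint feature vector and can therefore learn interactions between the mechanical and visual features.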