ICRA 2026

Multi-Modal Locomotion Mode Recognition in the Real World for Robotic Hip Complex Exoskeletons

Hyesoo Shin, Sangdo Kim, Sunwoo Kim, Jongwon Lee, Jinkyu Kim, KangGeon Kim

AI summary

Fusing mechanical and visual sensor data improves exoskeleton locomotion mode recognition accuracy by an average of 11.7% compared to single-sensor systems.

Wearable Robotics, Sensor Fusion, Locomotion Mode Recognition, Multi-modal Learning, Exoskeleton Control, Real-world Dataset

Problem

Single-sensor locomotion mode recognition systems struggle to generalize across users and degrade when assistive torque alters joint kinematics, hindering safe and effective real-world exoskeleton control.

Approach

The authors propose a lightweight multi-modal recognition system that combines proprioceptive (mechanical/IMU) and visual (RGB camera) data using intermediate and late fusion strategies, evaluated on a novel outdoor dataset.

Key results

11.7% average accuracy improvement over single-modal baselines
Visual data enhances cross-user gait generalization
Mechanical data ensures high intra-class recognition consistency
Novel synchronized mechanical-visual outdoor dataset released

Why it matters

Enables more robust and adaptive real-time control for lower-limb exoskeletons, advancing wearable robotics for assistive healthcare applications.

Abstract

Lower limb exoskeletons assist users by supporting joint movements. Since joint motion patterns vary depending on how the user moves, accurately recognizing the type of movement (locomotion mode) is crucial for controlling the exoskeleton and ensuring user safety. Inspired by how humans use multiple types of sensory information to control movement, we developed a multi-modal locomotion mode recognition (LMR) system that uses both mechanical and visual sensor data to identify locomotion modes. Our approach utilizes two fusion methods: intermediate fusion, which combines the data in the form of features, and late fusion, which integrates the sensor data by averaging the recognition results from each sensor. By fusing these two different modalities, the prediction accuracy improved by an average of 11.7% with the test data. Through comparisons with uni-modal LMR systems that rely on a single type of sensor data for locomotion mode recognition, we found that the improved performance of the multi-modal LMR system is due to the visual information’s ability to generalize different gait patterns across users and the mechanical sensor data’s consistency within the same classes.
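
The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class labels, feature sizes, weights, and function names below are hypothetical, and a single linear layer stands in for whatever network the paper actually uses. Late fusion averages the per-class probabilities produced by each uni-modal model; intermediate fusion concatenates the per-modality feature vectors before one shared classifier.

```python
import math

# Hypothetical label set for illustration; the paper's classes may differ.
MODES = ["level_walk", "stair_ascent", "stair_descent"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(mech_probs, vis_probs):
    """Average the per-class probabilities of the two uni-modal models."""
    return [(m + v) / 2.0 for m, v in zip(mech_probs, vis_probs)]


def intermediate_fusion(mech_feat, vis_feat, weights, bias):
    """Concatenate both feature vectors, then apply one linear classifier.

    `weights` has one row per class and one column per fused feature.
    The linear layer is an assumed stand-in for the real classifier head.
    """
    fused = list(mech_feat) + list(vis_feat)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def predict(probs):
    """Return the label with the highest fused probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return MODES[best]


# Made-up probabilities: the mechanical model favors level walking,
# the visual model favors stair ascent.
fused_late = late_fusion([0.5, 0.3, 0.2], [0.2, 0.7, 0.1])
label_late = predict(fused_late)

# Made-up 2-D features per modality and a hand-picked 3x4 weight matrix.
fused_mid = intermediate_fusion(
    [1.0, 0.0], [0.0, 1.0],
    weights=[[1, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]],
    bias=[0, 0, 0],
)
label_mid = predict(fused_mid)
```

In the late-fusion example the visual model's stronger vote outweighs the mechanical model's weaker one, so the averaged prediction follows the visual model. Intermediate fusion instead lets a single classifier weigh features from both modalities jointly.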

Index terms

Wearable Robotics, Sensor Fusion, Embedded Systems for Robotic and Automation