A Distributed Multi-Modal Sensing Approach for Human Activity Recognition in Real-Time Human-Robot Collaboration
Van Anh Ho, Fulvio Mastrogiovanni
AI summary
Problem
Existing human activity recognition systems struggle to simultaneously capture complex hand kinematics and contact forces in dynamic, real-world human-robot collaboration settings without suffering from occlusion, sensor drift, or latency.
Approach
The authors fuse motion data from a modular IMU data glove with tactile feedback from a vision-based cylindrical sensor using a late-fusion neural network, enabling real-time classification of hand activities during physical interaction.
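As an illustration of the late-fusion idea, the sketch below encodes each modality separately and concatenates the two embeddings just before a shared classification head. It is not the authors' architecture: the class name `LateFusionClassifier`, the feature sizes, the single dense layer per branch, and the random untrained weights are all assumptions chosen for brevity. Only the number of classes (15) comes from the summary.

```python
import numpy as np

NUM_CLASSES = 15  # the summary reports 15 distinct hand actions


def relu_dense(x, w, b):
    """One fully connected layer followed by a ReLU."""
    return np.maximum(x @ w + b, 0.0)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class LateFusionClassifier:
    """Hypothetical two-branch model: one encoder per modality, fused at the end."""

    def __init__(self, imu_dim, tactile_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Separate encoder weights per modality (random here, untrained).
        self.w_imu = rng.normal(0.0, 0.1, (imu_dim, hidden))
        self.b_imu = np.zeros(hidden)
        self.w_tac = rng.normal(0.0, 0.1, (tactile_dim, hidden))
        self.b_tac = np.zeros(hidden)
        # Shared head operating on the concatenated embeddings.
        self.w_head = rng.normal(0.0, 0.1, (2 * hidden, NUM_CLASSES))
        self.b_head = np.zeros(NUM_CLASSES)

    def predict_proba(self, imu, tactile):
        h_imu = relu_dense(imu, self.w_imu, self.b_imu)      # glove branch
        h_tac = relu_dense(tactile, self.w_tac, self.b_tac)  # tactile branch
        fused = np.concatenate([h_imu, h_tac], axis=-1)      # late fusion step
        return softmax(fused @ self.w_head + self.b_head)


# Usage with made-up feature sizes (24 IMU features, 64 tactile features).
model = LateFusionClassifier(imu_dim=24, tactile_dim=64)
probs = model.predict_proba(np.zeros((4, 24)), np.ones((4, 64)))
predicted_actions = probs.argmax(axis=1)
```

Keeping the branches independent until the final layer means each modality can be preprocessed and encoded at its own rate, which is one common motivation for late fusion over early (input-level) fusion.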
Key results
- 94.64% accuracy and 95.60% F1-score in offline classification of 15 distinct hand actions
- Robust real-time online classification validated with event-based error metrics under static conditions
- Successful dynamic validation where the robot adaptively adjusted its trajectory based on recognized human gestures
- Demonstrated viability of late-fusion multi-modal sensing for safe, responsive physical collaboration
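For readers unfamiliar with the two offline metrics quoted above, the following sketch shows how accuracy and an F1-score can be computed from predicted and true labels. The summary does not say how the F1-score was averaged across the 15 classes, so macro averaging is an assumption here, and the function names and toy labels are purely illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three classes; these labels are not data from the paper.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Accuracy and F1 can diverge when classes are imbalanced or when errors concentrate in a few classes, which is why both are typically reported together.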
Why it matters
Provides a scalable, occlusion-resilient sensing framework that enables robots to safely interpret and dynamically respond to human physical intentions during collaborative tasks.
Abstract
Human activity recognition (HAR) is fundamental in human-robot collaboration (HRC), enabling robots to respond to and dynamically adapt to human intentions. This paper introduces a HAR system combining a modular data glove equipped with Inertial Measurement Units and a vision-based tactile sensor to capture hand activities in contact with a robot. We tested our activity recognition approach under different conditions, including offline classification of segmented sequences, real-time classification under static conditions, and a realistic HRC scenario. The experimental results show a high accuracy for all the tasks, suggesting that multiple collaborative settings could benefit from this multi-modal approach.