Voice-Driven Assistance and Resistance Modulation in a Soft Hip Exosuit Using a Transformer-Based Speech Recognition Model
Enrica Tricomi, Daniel Lindner, Xiaohui Zhang, Luka Miskovic, Lorenzo Masia
AI summary
Problem
Current wearable assistive devices lack intuitive, real-time user control for switching between support modes and adjusting intensity, often relying on pre-programmed strategies that limit personalization and user agency.
Approach
The system integrates a locally embedded transformer-based speech recognition model with a gait-phase estimator to decode short verbal commands and dynamically adjust exosuit actuation in sync with the user's walking cycle.
Key results
- High voice command recognition accuracy (95–100%) with ~9 ms latency
- Assistive commands reduced walking metabolic cost by up to 20.9%
- Resistive commands increased metabolic cost by up to 14.9%
- Real-time, user-driven switching between assistance and resistance modes synchronized with gait phase
Why it matters
Enables intuitive, real-time personalization of wearable robotic support, advancing the usability and adaptability of assistive exosuits for diverse user needs.
Abstract
Intuitive human–robot interfaces are essential to increase usability and personalization in wearable robotic assistive technologies. However, most current systems rely on pre-programmed or sensor-driven strategies that offer limited active user control online. To address this limitation, we present a voice-driven control framework for a soft hip exosuit, enabling on-demand modulation of assistance and resistance via short spoken commands. The system combines a fully embedded transformer-based automatic speech recognition model (Whisper) with a gait-phase estimator to synchronize actuation with the user’s motion. Users can switch between assistive and resistive modes and select discrete gain levels (low, medium, high). Experiments with six healthy participants demonstrate high recognition accuracy (95–100%) and low latency (∼9 ms). Metabolic measurements show that assistive commands reduced walking energy cost by 20.9±4.8% (LOW) and 9.7±5.5% (MEDIUM) relative to baseline, while resistive commands increased cost by 13.1±3.5% (MEDIUM) and 14.9±5.1% (HIGH). These results highlight the feasibility of intuitive, voice-driven modulation in wearable robotics.
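The control flow described in the abstract (a recognized command sets a mode and a gain level, and the gait phase shapes the actuation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The keyword vocabulary, the numeric gains, the sign convention, and the sinusoidal phase profile are all assumptions made for illustration, and the transcript is assumed to arrive as plain text from the speech recognizer.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical gain values: the abstract names the levels but not their magnitudes.
GAINS = {"low": 0.3, "medium": 0.6, "high": 1.0}
# Assumed sign convention: +1 supports the hip motion, -1 opposes it.
MODES = {"assist": 1.0, "resist": -1.0}


@dataclass
class ExosuitState:
    mode: str = "assist"
    level: str = "low"


def apply_transcript(state: ExosuitState, transcript: str) -> ExosuitState:
    """Return a new state updated by any known keywords in the transcript.

    Words that match neither a mode nor a level are ignored, so an
    unrelated utterance leaves the current setting unchanged.
    """
    words = re.findall(r"[a-z]+", transcript.lower())
    mode = next((w for w in words if w in MODES), state.mode)
    level = next((w for w in words if w in GAINS), state.level)
    return ExosuitState(mode, level)


def actuation_command(state: ExosuitState, gait_phase: float) -> float:
    """Signed, normalized command for a gait phase in [0, 1).

    The sine profile is a placeholder for a phase-dependent profile;
    the abstract does not specify the actual shape.
    """
    profile = math.sin(2.0 * math.pi * gait_phase)
    return MODES[state.mode] * GAINS[state.level] * profile
```

In this sketch, saying "resist high" and then "medium" would first switch mode and level together and then change only the level, with the phase profile keeping the command aligned to the walking cycle in both cases.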