ICRA 2026

Voice-Driven Assistance and Resistance Modulation in a Soft Hip Exosuit Using a Transformer-Based Speech Recognition Model

Enrica Tricomi, Daniel Lindner, Xiaohui Zhang, Luka Miskovic, Lorenzo Masia

AI summary

Voice commands reliably modulate assistance and resistance in a soft hip exosuit in real time, reducing walking energy cost by up to 20.9% during assistance and increasing it by up to 14.9% during resistance.

Voice-driven control; Soft hip exosuit; Speech recognition; Adaptive assistance; Wearable robotics; Metabolic cost

Problem

Current wearable assistive devices lack intuitive, real-time user control for switching between support modes and adjusting intensity, often relying on pre-programmed strategies that limit personalization and user agency.

Approach

The system integrates a locally embedded transformer-based speech recognition model with a gait-phase estimator to decode short verbal commands and dynamically adjust exosuit actuation in sync with the user's walking cycle.
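As an illustration of how a decoded command might be tied to the walking cycle, the following is a minimal sketch and not the authors' implementation. The vocabulary (`assist`, `resist`, `low`, `medium`, `high`), the numeric gain factors, the `ExosuitState` class, and the choice to apply a new command only when the gait phase wraps to the start of the next stride are all assumptions made for this example. The source states only that commands are decoded and that actuation is synchronized with gait phase.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical vocabulary and scale factors (not taken from the paper).
MODES = {"assist": 1.0, "resist": -1.0}           # sign of the applied effort
GAINS = {"low": 0.3, "medium": 0.6, "high": 1.0}  # placeholder gain levels


@dataclass
class ExosuitState:
    mode: str = "assist"
    gain: str = "low"
    # Command held back until the next stride boundary (assumed policy).
    pending: Optional[Tuple[str, str]] = None


def parse_command(transcript: str) -> Tuple[Optional[str], Optional[str]]:
    """Pick the first known mode word and gain word out of a short transcript."""
    words = transcript.lower().split()
    mode = next((w for w in words if w in MODES), None)
    gain = next((w for w in words if w in GAINS), None)
    return mode, gain


def step(state: ExosuitState, gait_phase: float, prev_phase: float,
         transcript: Optional[str] = None) -> float:
    """Advance one control tick and return a signed effort scale.

    gait_phase and prev_phase lie in [0, 1); a drop in phase is treated
    as the start of a new stride.
    """
    if transcript:
        mode, gain = parse_command(transcript)
        if mode or gain:
            # Keep the current setting for whichever part was not spoken.
            state.pending = (mode or state.mode, gain or state.gain)
    if state.pending is not None and gait_phase < prev_phase:
        state.mode, state.gain = state.pending
        state.pending = None
    return MODES[state.mode] * GAINS[state.gain]
```

In this sketch, a command spoken mid-stride is stored and takes effect only at the next phase wrap, so the actuation profile does not change partway through a step. A real controller could equally apply the change immediately or ramp the gain.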

Key results

High voice command recognition accuracy (95–100%) with ~9 ms latency
Assistive commands reduced walking metabolic cost by up to 20.9%
Resistive commands increased metabolic cost by up to 14.9%
Real-time, user-driven switching between assistance and resistance modes synchronized with gait phase

Why it matters

Enables intuitive, real-time personalization of wearable robotic support, advancing the usability and adaptability of assistive exosuits for diverse user needs.

Abstract

Intuitive human–robot interfaces are essential to increase usability and personalization in wearable robotic assistive technologies. However, most current systems rely on pre-programmed or sensor-driven strategies that offer limited active user control online. To address this limitation, we present a voice-driven control framework for a soft hip exosuit, enabling on-demand modulation of assistance and resistance via short spoken commands. The system combines a fully embedded transformer-based automatic speech recognition model (Whisper) with a gait-phase estimator to synchronize actuation with the user’s motion. Users can switch between assistive and resistive modes and select discrete gain levels (low, medium, high). Experiments with six healthy participants demonstrate high recognition accuracy (95–100%) and low latency (~9 ms). Metabolic measurements show that assistive commands reduced walking energy cost by 20.9±4.8% (LOW) and 9.7±5.5% (MEDIUM) relative to baseline, while resistive commands increased cost by 13.1±3.5% (MEDIUM) and 14.9±5.1% (HIGH). These results highlight the feasibility of intuitive, voice-driven modulation in wearable robotics.
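The percentages above are changes relative to a baseline walking condition. A minimal sketch of that arithmetic follows. It assumes that the ± values denote a standard deviation across participants, which the abstract does not state. The function names are hypothetical, and the sketch contains no data from the study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def percent_change(condition_cost: float, baseline_cost: float) -> float:
    """Relative change of a condition versus baseline, in percent.

    Negative values mean the condition lowered the energy cost.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (condition_cost - baseline_cost) / baseline_cost


def summarize(changes: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-participant changes."""
    return mean(changes), stdev(changes)
```

For example, a participant whose cost falls from 10.0 to 8.0 (arbitrary units) has `percent_change(8.0, 10.0)` equal to -20.0. Passing one such value per participant to `summarize` gives a mean and spread in the same form as the figures reported above.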

Index terms

Wearable Robotics; Physically Assistive Devices; Modeling, Control, and Learning for Soft Robots