ICRA 2026

Extended Force and Velocity Prediction in Human-Robot Collaborative Transportation through Future Environment Representation Estimation

Jose Enrique Dominguez-Vidal

AI summary

Conditioning force and velocity predictions on estimated future environmental states extends accurate prediction horizons to 2 seconds in collaborative object transportation.

Human-Robot Collaboration, Force Prediction, Velocity Prediction, RetNet, Environmental Representation, Dataset

Problem

Current predictors for collaborative object transportation are limited to short time horizons (≤1s) and ignore how surrounding environmental constraints shape human behavior and intent.

Approach

The authors enhance RetNet-based predictors by adding a module that forecasts future environmental states and uses cross-attention to condition the human force and velocity predictions.
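The conditioning step can be illustrated with a minimal sketch of scaled dot-product cross-attention, in which motion features act as queries and predicted future-environment features act as keys and values. This is not the authors' architecture: the function names (`cross_attention`, `condition_on_environment`), the feature shapes, and the residual fusion are all assumptions made for illustration, and the learned projections, multiple heads, and RetNet backbone of a real model are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: one feature vector per motion time step (hypothetical).
    keys, values: one feature vector per predicted environment token
    (hypothetical). Returns one context vector per query and the
    attention weights used to build it.
    """
    scale = math.sqrt(len(keys[0]))
    contexts, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        ctx = [
            sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))
        ]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


def condition_on_environment(motion_feats, env_feats):
    """Fuse motion features with environment context via a residual sum.

    The residual fusion is an assumption of this sketch, not a detail
    taken from the paper.
    """
    contexts, weights = cross_attention(motion_feats, env_feats, env_feats)
    fused = [
        [m + c for m, c in zip(mf, ctx)]
        for mf, ctx in zip(motion_feats, contexts)
    ]
    return fused, weights


# Toy example: 2 motion time steps, 3 predicted environment tokens.
motion_feats = [[1.0, 0.0], [0.0, 1.0]]
env_feats = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
fused, weights = condition_on_environment(motion_feats, env_feats)
```

In this toy example each motion step attends most strongly to the environment token aligned with it, so the fused feature that a downstream force or velocity head would consume carries environment information weighted by relevance.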

Key results

Extended prediction horizon from 1s to 2s without accuracy loss
Achieved up to 90.4% force and 93.0% velocity success rates on test data
Validated in real-world experiments with up to 87.1% force and 91.3% velocity success rates
Publicly released a dataset of 17,400 sub-sequences from 120 volunteers

Why it matters

Enables safer, more responsive human-robot collaboration by allowing robots to anticipate human intent and navigate environmental constraints further into the future.

Abstract

In this work, we address the challenge of predicting human-applied force and velocity during collaborative object transportation over extended distances (5–8 m). We enhance state-of-the-art predictors by refining their input data processing, which significantly improves prediction accuracy. Furthermore, we extend the temporal prediction horizon from 1 s to 2 s without compromising performance, by introducing an extra environmental prediction module that conditions force and velocity estimations based on anticipated sensory input. This integration captures the contextual dependency of human behaviour during joint transport. Experimental evaluations, both on the dataset and in real-world settings, validate the effectiveness of our approach. Specifically, our best model achieves success rates on the test set of up to 90.4% in predicting the human’s exerted force and up to 93.0% in predicting the velocity of the human-robot pair during the next 2 s, and up to 87.1% and 91.3%, respectively, in real experiments.

Index terms

Physical Human-Robot Interaction, Intention Recognition, Deep Learning Methods