ICRA 2026

HHI-Assist: A Dataset and Benchmark of Human-Human Interaction in Physical Assistance Scenario

Saeed Saadatnejad, Reyhaneh Hosseininejad, Jose Barreiros, Katherine Tsui, Alexandre Alahi


AI summary

An interaction-aware diffusion model accurately predicts coupled human poses in physical assistance tasks, outperforming baselines and generalizing to unseen scenarios.
Physical Human-Robot Interaction, Motion Prediction, Diffusion Models, Assistive Robotics, Human-Human Interaction, Pose Estimation

Problem

Safe physical assistance requires robots to anticipate coupled human motions, but predicting these interactions is hindered by a lack of close-contact datasets and models that account for reciprocal dynamics.

Approach

The authors collected a motion-capture dataset of human-human assistive tasks and developed a conditional Transformer-based denoising diffusion model that predicts the future poses of both interacting agents by conditioning jointly on their observed movements.
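
As a rough illustration of what "jointly conditioning on both agents" can mean in a denoising diffusion setup, the sketch below builds a conditioning tensor from the caregiver's and care receiver's observed poses and applies the standard DDPM forward-noising step to their joint future poses. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear noise schedule, the array shapes, the joint count, and the helper names (`linear_beta_schedule`, `build_condition`, `forward_noise`) are all hypothetical, and the Transformer denoiser itself is not shown.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, common in DDPM-style models)."""
    return np.linspace(beta_start, beta_end, num_steps)


def build_condition(obs_caregiver, obs_receiver):
    """Concatenate both agents' observed poses frame by frame.

    Each input has shape (T_obs, J, 3); the output has shape (T_obs, 2 * J * 3).
    """
    t_obs = obs_caregiver.shape[0]
    return np.concatenate(
        [obs_caregiver.reshape(t_obs, -1), obs_receiver.reshape(t_obs, -1)],
        axis=1,
    )


def forward_noise(future, t, alpha_bar, rng):
    """Sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps for the joint future."""
    eps = rng.standard_normal(future.shape)
    x_t = np.sqrt(alpha_bar[t]) * future + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


rng = np.random.default_rng(0)
T_OBS, T_FUT, J = 10, 20, 17  # hypothetical sequence lengths and joint count

# Random stand-ins for motion-capture data (not real HHI-Assist clips).
obs_a = rng.standard_normal((T_OBS, J, 3))       # caregiver, observed frames
obs_b = rng.standard_normal((T_OBS, J, 3))       # care receiver, observed frames
future = rng.standard_normal((T_FUT, 2, J, 3))   # both agents, future frames

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta)

cond = build_condition(obs_a, obs_b)
x_t, eps = forward_noise(future, t=500, alpha_bar=alpha_bar, rng=rng)

# A denoiser (not shown) would be trained to predict eps from (x_t, t, cond).
print(cond.shape, x_t.shape)
```

Because the noised target `x_t` contains both agents' future poses and `cond` contains both agents' observed poses, a denoiser trained on these inputs can in principle model how one person's motion depends on the other's, rather than predicting each person in isolation.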

Key results

  • HHI-Assist dataset of 908 human-human physical assistance clips
  • Interaction-aware denoising diffusion model for coupled pose prediction
  • Improved prediction accuracy over standard baselines
  • Strong generalization to unseen assistive scenarios

Why it matters

Provides essential data and predictive capabilities to advance safe, responsive assistive robots for aging populations and care automation.

Abstract

The increasing labor shortage and aging population underline the need for assistive robots to support human care recipients. To enable safe and responsive assistance, robots require accurate human motion prediction in physical interaction scenarios. However, this remains a challenging task due to the variability of assistive settings and the complexity of coupled dynamics in physical interactions. In this work, we address these challenges through two key contributions: (1) HHI-Assist, a dataset comprising motion capture clips of human-human interactions in assistive tasks; and (2) a conditional Transformer-based denoising diffusion model for predicting the poses of interacting agents. Our model effectively captures the coupled dynamics between caregivers and care receivers, demonstrating improvements over baselines and strong generalization to unseen scenarios. By advancing interaction-aware motion prediction and introducing a new dataset, our work has the potential to significantly enhance robotic assistance policies. The dataset and code are available at: https://sites.google.com/view/hhi-assist/home.

Index terms

Data Sets for Robot Learning, Physical Human-Robot Interaction, Intention Recognition
