ICRA 2026

HRI-DGDM: Dual-Graph Guided Diffusion Model for Uncertain Human Motion Modeling in HRI

Hongquan Gui, Ming Li

AI summary

HRI-DGDM accurately predicts uncertain human motion in human-robot interaction by guiding a diffusion process with dual structural and collaborative graphs.

Human-robot interaction, motion prediction, diffusion models, dual-graph, spatial-temporal modeling, uncertainty quantification

Problem

Deterministic models fail to capture the inherent uncertainty and multi-modal nature of human motion in human-robot interaction, while existing diffusion models prioritize diversity over accuracy and lack mechanisms to model complex human-robot spatial-temporal dependencies.

Approach

The authors propose HRI-DGDM, which integrates a structural graph for kinematic priors and a dynamically learned collaboration graph into a spatial-temporal denoising network, guided by a masking mechanism that anchors observed history during diffusion.
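
The masking idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one scalar per frame instead of a full pose, uses a toy stand-in for the learned denoiser, and re-imposes the clean observed values after every reverse step. The names `anchor_history`, `toy_denoise_step`, and `masked_reverse_process` are hypothetical.

```python
import random


def anchor_history(sample, observed, mask):
    """Overwrite frames where mask == 1 with their known observed values."""
    return [o if m else s for s, o, m in zip(sample, observed, mask)]


def toy_denoise_step(sample, rng, noise_scale):
    """Stand-in for a learned denoiser (assumption): shrink and add noise."""
    return [0.9 * s + noise_scale * rng.gauss(0.0, 1.0) for s in sample]


def masked_reverse_process(observed, mask, num_steps=50, seed=0):
    """Run a toy reverse diffusion loop that keeps the history fixed."""
    rng = random.Random(seed)
    # Start every frame from Gaussian noise, then pin the observed frames.
    sample = [rng.gauss(0.0, 1.0) for _ in observed]
    sample = anchor_history(sample, observed, mask)
    for step in range(num_steps, 0, -1):
        noise_scale = 0.1 * step / num_steps  # less noise near the end
        sample = toy_denoise_step(sample, rng, noise_scale)
        # Re-impose the history so only the future frames are generated.
        sample = anchor_history(sample, observed, mask)
    return sample


# Three observed history frames followed by three future frames to predict.
observed = [0.2, 0.4, 0.6, 0.0, 0.0, 0.0]
mask = [1, 1, 1, 0, 0, 0]
result = masked_reverse_process(observed, mask)
```

Because the mask is applied after every step, the history frames in `result` equal the observed values exactly, and only the unmasked future frames are shaped by the denoiser.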

Key results

Proposes HRI-DGDM, a dual-graph guided diffusion framework for HRI motion prediction
Designs a spatial-temporal denoising network with multi-scale adaptive graph fusion
Introduces a masking-based conditioning mechanism to anchor observed history and prevent drift
Demonstrates superior prediction accuracy over deterministic and diffusion baselines in HRI scenarios

Why it matters

Provides a robust, uncertainty-aware prediction framework that enhances safety and proactive adaptation in human-centered robotic applications.

Abstract

Human motion in human-robot interaction (HRI) is inherently uncertain, even when the same task is performed repeatedly. This variability poses a significant challenge for prediction, as models must capture a distribution of plausible futures rather than a single deterministic trajectory. Traditional models based on graph convolutional networks, while effective at capturing spatial-temporal dependencies, are fundamentally limited by their deterministic nature and struggle to represent this inherent motion uncertainty. To address this, diffusion models have emerged as a powerful framework for modeling uncertainty. However, their direct application to HRI is hindered by two key limitations: they often prioritize motion diversity over prediction accuracy, potentially generating physically implausible results, and they fail to adequately model the complex, multi-scale spatial-temporal coupling between human and robot motions. To overcome these challenges, we propose HRI-DGDM, an HRI motion prediction framework based on a dual-graph guided diffusion model. Our method introduces a dual-graph structure, comprising a structural graph for kinematic priors and a collaboration graph learned from motion dynamics, to guide the denoising process with strong structural priors. A dedicated spatial-temporal denoising network (STDN) fuses multi-scale features from both graphs through adaptive fusion and hierarchical spatial-temporal modeling. Furthermore, a masking-based conditioning mechanism anchors the observed history during denoising, ensuring temporal consistency and preventing drift. Experiments on HRI scenarios demonstrate that HRI-DGDM outperforms baselines in prediction accuracy.
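
As a rough illustration of the dual-graph idea, the sketch below blends a fixed structural adjacency with a data-dependent collaboration adjacency and runs one propagation step. It makes several assumptions not stated in the abstract: the collaboration graph is a row-wise softmax over feature dot products, the adaptive fusion is reduced to a single fixed scalar `gate`, and the three nodes (human wrist, human elbow, robot end-effector) are hypothetical. All function names are illustrative.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1 (rows of zeros stay zero)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def collaboration_graph(features):
    """Assumed form: row-softmax of pairwise feature dot products."""
    return [
        softmax([sum(a * b for a, b in zip(fi, fj)) for fj in features])
        for fi in features
    ]


def fuse_graphs(structural, collaboration, gate):
    """Convex combination of the two adjacencies with a scalar gate."""
    return [
        [gate * s + (1.0 - gate) * c for s, c in zip(s_row, c_row)]
        for s_row, c_row in zip(structural, collaboration)
    ]


def propagate(adjacency, features):
    """One graph propagation step: each node averages its neighbours."""
    dims = range(len(features[0]))
    return [
        [sum(w * f[d] for w, f in zip(adj_row, features)) for d in dims]
        for adj_row in adjacency
    ]


# Nodes: 0 = human wrist, 1 = human elbow, 2 = robot end-effector.
# The structural graph links wrist and elbow only; the robot is isolated.
structural = row_normalize([[1, 1, 0], [1, 1, 0], [0, 0, 1]])
features = [[0.5, 0.1], [0.4, 0.0], [0.5, 0.2]]
collaboration = collaboration_graph(features)
fused = fuse_graphs(structural, collaboration, gate=0.5)
updated = propagate(fused, features)
```

In this toy setup the structural graph has no edge between the wrist and the robot, so any human-robot coupling in `fused` comes from the collaboration graph alone. A learned model would presumably predict the gate, and the collaboration weights, from the motion data rather than fixing them.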

Index terms

Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Human-Centered Robotics