ICRA 2026

COLSON: Controllable Learning-Based Social Navigation Via Diffusion-Based Reinforcement Learning

Kohei Matsumoto, Yuki Tomita, Yuki Hyodo, Ryo Kurazume

AI summary

Diffusion-based reinforcement learning enables highly adaptable, collision-free social navigation for mobile robots in dynamic environments without retraining.

Diffusion models; Reinforcement learning; Social navigation; Robot guidance; Graph neural networks; Zero-shot adaptation

Problem

Traditional deep reinforcement learning methods for robot social navigation rely on fixed Gaussian action distributions, limiting their flexibility and ability to adapt to unseen environments or novel tasks.

Approach

The authors propose COLSON, a diffusion-based reinforcement learning framework integrated with graph neural networks that uses Q-score matching and a novel annealing schedule to generate diverse, controllable actions via gradient-based guidance.
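The idea of steering a sampled action with gradient-based guidance can be illustrated with a toy example. The sketch below is not the authors' implementation: it replaces the learned score network with the analytic score of a hypothetical Gaussian-shaped preference around a goal velocity, uses plain Langevin sampling in place of the paper's denoising procedure, and invents a simple obstacle cost for guidance. All function names and parameters (`toy_score`, `obstacle_cost_grad`, `guidance_weight`, and so on) are assumptions made for illustration.

```python
import numpy as np


def toy_score(actions, goal_action, sigma=0.3):
    """Analytic score of a Gaussian preference centred on goal_action.

    Hypothetical stand-in for a learned score network; in Q-score matching
    the score would be trained to follow the action gradient of a Q-function.
    """
    return -(actions - goal_action) / sigma**2


def obstacle_cost_grad(actions, obstacle_dir):
    """Gradient of the assumed cost max(0, a . d)^2.

    The cost penalises velocity components pointing toward an obstacle
    located in the unit direction obstacle_dir.
    """
    approach = np.clip(actions @ obstacle_dir, 0.0, None)
    return 2.0 * approach[:, None] * obstacle_dir[None, :]


def sample_actions(goal_action, obstacle_dir, guidance_weight,
                   n_samples=512, n_steps=400, step_size=0.005, seed=0):
    """Langevin sampling of 2D velocity actions with optional guidance.

    The guidance term is added only at sampling time, so the score function
    itself is never modified or retrained.
    """
    rng = np.random.default_rng(seed)
    actions = rng.standard_normal((n_samples, 2))
    for _ in range(n_steps):
        drift = (toy_score(actions, goal_action)
                 - guidance_weight * obstacle_cost_grad(actions, obstacle_dir))
        noise = rng.standard_normal(actions.shape)
        actions = actions + step_size * drift + np.sqrt(2.0 * step_size) * noise
    return actions


if __name__ == "__main__":
    goal = np.array([1.0, 0.0])      # preferred velocity: straight ahead
    obstacle = np.array([1.0, 0.0])  # obstacle directly ahead (assumed)
    plain = sample_actions(goal, obstacle, guidance_weight=0.0)
    guided = sample_actions(goal, obstacle, guidance_weight=20.0)
    print("mean forward speed without guidance:", plain[:, 0].mean())
    print("mean forward speed with guidance:   ", guided[:, 0].mean())
```

Setting `guidance_weight` to zero recovers samples from the unmodified action distribution, while a positive weight shifts the samples away from the obstacle direction without changing the score function. This mirrors, in a very reduced form, how sampling-time guidance can adapt behaviour without retraining.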

Key results

First application of diffusion-based RL to social navigation, outperforming Gaussian policy-based baselines
Novel annealing technique for Q-score matching improves training stability and final performance
Zero-shot adaptation to unseen static obstacles and companion tasks via action guidance without retraining
High success rates and low collision rates across varying pedestrian densities, validated in real-world demonstrations

Why it matters

Provides a scalable, flexible navigation framework for autonomous service robots operating in complex, dynamic real-world settings.

Abstract

Navigation of mobile robots in dynamic environments with pedestrian traffic poses a significant challenge in the development of autonomous mobile service robots. Recently, deep reinforcement learning-based methods have been actively studied and have outperformed traditional rule-based approaches, owing to their optimization capabilities. Among these methods, those assuming continuous action spaces typically use Gaussian distributions, limiting the flexibility of action generation. By contrast, the application of diffusion models to reinforcement learning has advanced, allowing more flexible action distributions than Gaussian policy-based approaches. In this study, we applied a diffusion-based reinforcement learning approach to social navigation and validated its effectiveness. Furthermore, using the characteristics of diffusion models, we propose extensions that allow adaptation to previously unseen scenarios without additional training. As concrete examples, we show adaptability to scenarios in which the environment contains static obstacles that were not present during training, as well as scenarios in which the objective differs from that used in training, such as accompanying a target pedestrian to a destination while avoiding other pedestrians.

Index terms

Human-Aware Motion Planning; Reinforcement Learning; Motion and Path Planning