ICRA 2026

Risk-Aware Routing for a Robot in a Shared Dynamic Environment

Elena Stracca, Giorgio Grioli, Lucia Pallottino, Paolo Salaris

AI summary

An MDP-based routing policy that incorporates real-time observations and encounter severity significantly reduces performance degradation compared to reactive and state-of-the-art planners.

risk-aware routing, mobile robot navigation, Markov decision process, stochastic shortest path, human-robot interaction, online policy adaptation

Problem

Mobile robots navigating dynamic, human-shared spaces often incur delays because human presence is uncertain, and existing planners do not adapt routes optimally to real-time observations.

Approach

The problem is modeled as a stochastic shortest path with recourse and cast as a Markov Decision Process. An offline policy accounts for human distribution probabilities and encounter severity, and an online adaptation mechanism prevents routing loops during execution.
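To make the offline step concrete, here is a minimal sketch of value iteration for a stochastic shortest path with recourse on a toy graph. It is not the authors' formulation: the map `GRAPH`, the binary clear/occupied observation model, the independence of edges, and the function names are all assumptions made for illustration. The state is just the current intersection, and the robot is assumed to observe its outgoing edges before choosing one.

```python
from itertools import product

# Hypothetical toy map (assumption, not from the paper):
# node -> {neighbor: (travel_time, p_occupied, delay_if_occupied)}
GRAPH = {
    "A": {"B": (1.0, 0.5, 4.0), "C": (2.0, 0.0, 0.0)},
    "B": {"G": (1.0, 0.0, 0.0)},
    "C": {"G": (1.0, 0.0, 0.0)},
    "G": {},
}
GOAL = "G"


def expected_min_cost(edges, values):
    """Expected cost-to-go when the robot first observes which outgoing
    edges are occupied and then picks the cheapest one (recourse)."""
    neighbors = list(edges)
    total = 0.0
    # Enumerate every joint clear/occupied outcome of the outgoing edges.
    for outcome in product([False, True], repeat=len(neighbors)):
        prob = 1.0
        best = float("inf")
        for nbr, occupied in zip(neighbors, outcome):
            travel, p_occ, delay = edges[nbr]
            prob *= p_occ if occupied else 1.0 - p_occ
            cost = travel + (delay if occupied else 0.0) + values[nbr]
            best = min(best, cost)
        total += prob * best
    return total


def value_iteration(graph, goal, tol=1e-9, max_iters=1000):
    """Synchronous value iteration; every non-goal node needs an outgoing edge."""
    values = {node: 0.0 for node in graph}
    for _ in range(max_iters):
        new_values = {
            node: 0.0 if node == goal else expected_min_cost(graph[node], values)
            for node in graph
        }
        change = max(abs(new_values[n] - values[n]) for n in graph)
        values = new_values
        if change < tol:
            break
    return values


def policy(graph, values, node, occupied):
    """Pick the next node given the observed occupancy of outgoing edges.
    `occupied` maps neighbor -> bool; missing neighbors count as clear."""
    def cost(nbr):
        travel, _, delay = graph[node][nbr]
        return travel + (delay if occupied.get(nbr, False) else 0.0) + values[nbr]
    return min(graph[node], key=cost)


values = value_iteration(GRAPH, GOAL)
```

In this toy map the expected cost from `A` is 2.5: half the time the short corridor through `B` is clear (cost 2), and otherwise the robot detours through `C` (cost 3). Calling `policy(GRAPH, values, "A", {"B": True})` returns `"C"`, which shows how an observation at an intersection changes the chosen route.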

Key results

MDP policy explicitly integrates re-planning options, visibility, and encounter severity
Generated routes outperform reactive and advanced state-of-the-art planners
Compact state representation scales linearly with the number of intersections, supporting large environments
Online adaptation mechanism successfully prevents cyclic routing behaviors

Why it matters

Enables safer and more efficient deployment of autonomous mobile robots in dynamic human-robot shared spaces like warehouses and hospitals.

Abstract

This paper explores the challenge of optimal routing for a mobile robot navigating a dynamic and shared human environment. The primary goal is to minimize the risk of performance degradation during motion, such as delays in completing tasks due to the need for safe or acceptable human-robot encounters. The problem is formulated as a graph whose edge costs become progressively known only as the robot moves through the environment. We model this problem as a Markov Decision Process (MDP), enabling an offline evaluation of the expected cost of alternative routes based on statistical information about human spatial distributions and possible observations at each intersection. This compact state representation scales linearly with the number of intersections in the map. Since the memoryless property of the MDP may induce loops during online execution, we compute an offline policy and introduce an online policy adaptation mechanism to prevent cyclic behaviors. Extensive simulations across environments of different complexity, and using data collected from real-world experiments, demonstrate that our approach outperforms reactive and advanced state-of-the-art planners in terms of either performance or scalability.
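The loop problem mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe the paper's adaptation mechanism, so the rule used here (prefer neighbors not yet visited, and fall back to all neighbors only when none remain) is a stand-in assumption. The names `choose_next` and `run`, and the toy costs in `VALUES` and `OBSERVED`, are hypothetical.

```python
# Hypothetical offline cost-to-go values and per-node observed edge costs.
VALUES = {"A": 3.0, "B": 3.0, "G": 0.0}
OBSERVED = {
    "A": {"B": 1.0, "G": 10.0},  # direct edge to the goal looks congested
    "B": {"A": 1.0, "G": 10.0},
}


def choose_next(observed_costs, values, visited):
    """Greedy step on observed edge cost plus offline cost-to-go,
    restricted to unvisited neighbors whenever at least one exists."""
    fresh = {n: c for n, c in observed_costs.items() if n not in visited}
    candidates = fresh or observed_costs
    return min(candidates, key=lambda n: candidates[n] + values[n])


def run(start, goal, observed, values, avoid_revisits=True, max_steps=6):
    """Execute the policy online and return the sequence of visited nodes."""
    path = [start]
    visited = {start}
    node = start
    while node != goal and len(path) <= max_steps:
        seen = visited if avoid_revisits else set()
        node = choose_next(observed[node], values, seen)
        path.append(node)
        visited.add(node)
    return path


adapted = run("A", "G", OBSERVED, VALUES)
memoryless = run("A", "G", OBSERVED, VALUES, avoid_revisits=False)
```

With these toy numbers the memoryless rule bounces between `A` and `B` until the step limit, because each node sees the other as the cheaper option. The adapted run reaches the goal along `A`, `B`, `G`.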

Index terms

Autonomous Vehicle Navigation, Reactive and Sensor-Based Planning, Motion and Path Planning, Stochastic Shortest Path With Recourse