Research Analyzer
← Back ICRA 2026

Anytime Probabilistically Constrained Provably Convergent Online Belief Space Planning

Andrey Zhitnikov, Vadim Indelman

PDF

AI summary

Key figure (auto-extracted from paper)
Pruning unsafe actions and cleaning search tree statistics in MCTS guarantees anytime safety and better objective performance without waiting for convergence.
MCTS POMDP Belief Space Planning Anytime Safety Probabilistic Constraints Continuous Domains

Problem

Existing online planning methods for partially observable environments either ensure safety only at infinite convergence time, allow dangerous actions to corrupt search statistics, or rely on slow learning components in continuous domains.

Approach

The authors propose an anytime Monte Carlo Tree Search algorithm that directly prunes unsafe actions from the belief tree and revises node statistics to remove their influence, enabling safe planning in continuous spaces with general probabilistic constraints.

Key results

  • Anytime safety guarantees with respect to the expanded belief tree
  • Proof of convergence in probability with an exponential rate
  • Demonstration that cleaning pruned actions prevents objective degradation
  • Superior safety and objective performance over baselines with minimal tree queries

Why it matters

Enables reliable, safe autonomous decision-making for robots under uncertainty without sacrificing computational efficiency or waiting for algorithmic convergence.

Abstract

Taking into account future risk is essential for an autonomously operating robot to find online not only the best but also a safe action to execute. In this paper, we build upon the recently introduced formulation of probabilistic belief-dependent constraints. In our methodology safety can be materialized with any general belief-dependent operator we call payoff. We present an anytime approach employing the Monte Carlo Tree Search (MCTS) method in continuous domains in terms of states, actions and observations and general-belief dependent reward and payoff operators. Unlike previous approaches, our method ensures safety anytime with respect to the currently expanded search tree without relying on the convergence of the search. We prove convergence in probability with an exponential rate of a version of our algorithms and study proposed techniques via extensive simulations. Even with a tiny number of tree queries, the best action found by our approach is much safer than the baseline. Moreover, our approach constantly yields better than the baseline action in terms of objective function. This is because we revise the values and statistics maintained in the search tree and remove from them the contribution of the pruned actions. We rigorously show that our cleaning routine is necessary. Without it, at the limit of convergence of MCTS, an infinite amount of sampled dangerous actions can be detrimental to the objective function.

Index terms

Robot Safety Autonomous Agents SLAM Constrained Anytime Belief Space Planning

Related papers