Vectorized Online POMDP Planning
Marcus Hoerger, Muhammad Rafi Sudrajat, Hanna Kurniawati
AI summary
Problem
Existing parallel POMDP solvers suffer from synchronization bottlenecks and coordination overhead when interleaving optimization and value estimation, limiting their scalability on modern GPUs.
Approach
VOPP reformulates POMDP planning as fully vectorized tensor operations, eliminating inter-process synchronization by analytically solving part of the optimization and representing the belief tree as batched data structures.
Key results
- First fully vectorized online POMDP solver on GPU without synchronization
- At least 20× speedup over parallel solvers, exceeding 100× on some benchmarks
- Outperforms sequential solvers using 1000× less planning budget
- Open-source implementation for large-scale POMDP benchmarks
Why it matters
Enables scalable, real-time decision-making for autonomous robots and other sequential planning tasks by unlocking the full parallel throughput of modern GPUs.
Abstract
Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for planning under partial observability problems, capturing the stochastic effects of actions and the limited information available through noisy observations. POMDP solving could benefit tremendously from massive parallelization on today’s hardware, but parallelizing POMDP solvers has been challenging. Most solvers rely on interleaving numerical optimization over actions with the estimation of their values, which creates dependencies and synchronization bottlenecks between parallel processes that can offset the benefits of parallelization. In this paper, we propose Vectorized Online POMDP Planner (VOPP), a novel parallel online solver that leverages a recent POMDP formulation which analytically solves part of the optimization component, leaving numerical computations to consist of only estimation of expectations. VOPP represents all data structures related to planning as a collection of tensors, and implements all planning steps as fully vectorized computations over this representation. The result is a massively parallel online solver with no dependencies or synchronization bottlenecks between concurrent processes. Experimental results indicate that VOPP is at least 20× more efficient in computing near-optimal solutions compared to an existing state-of-the-art parallel online solver. Moreover, VOPP outperforms state-of-the-art sequential online solvers, while using a planning budget that is 1000× smaller.
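To make the phrase "estimation of expectations as vectorized computations" concrete, the following is a minimal illustrative sketch, not VOPP's algorithm. It assumes a hypothetical 1D corridor problem (the names `MOVES`, `GOAL`, `step`, and `batched_rollout_values` are invented for this example) and uses NumPy on the CPU as a stand-in for GPU tensors. All belief particles and all candidate first actions are stored in one array of shape (actions, particles) and advanced together, so the value estimate for every action is a single mean over a tensor axis, with no per-particle loop and no coordination between workers.

```python
import numpy as np

# Hypothetical toy model: an agent on a line tries to reach GOAL.
MOVES = np.array([-1.0, 0.0, 1.0])  # action 0 = left, 1 = stay, 2 = right
GOAL = 5.0


def step(states, actions, rng):
    """Advance every (action, particle) entry in one vectorized call."""
    noise = rng.normal(0.0, 0.1, size=states.shape)
    next_states = states + MOVES[actions] + noise
    rewards = -np.abs(next_states - GOAL)  # closer to the goal is better
    return next_states, rewards


def batched_rollout_values(particles, horizon, gamma, rng):
    """Monte Carlo estimate of the value of each first action.

    particles: array of shape (N,) sampled from the current belief.
    Returns an array of shape (A,), one estimate per action.
    """
    n_actions = MOVES.shape[0]
    # One copy of the particle set per candidate first action: shape (A, N).
    states = np.tile(particles.astype(float), (n_actions, 1))
    # Row i takes action i at the first step.
    actions = np.broadcast_to(np.arange(n_actions)[:, None], states.shape)
    returns = np.zeros_like(states)
    discount = 1.0
    for _ in range(horizon):  # loop over time only, never over particles
        states, rewards = step(states, actions, rng)
        returns += discount * rewards
        discount *= gamma
        # Assumption: a uniform random rollout policy after the first step.
        actions = rng.integers(0, n_actions, size=states.shape)
    # The expectation is estimated by averaging over the particle axis.
    return returns.mean(axis=1)


rng = np.random.default_rng(0)
belief_particles = rng.normal(0.0, 0.5, size=4096)  # belief centred at 0
q = batched_rollout_values(belief_particles, horizon=5, gamma=0.95, rng=rng)
best_action = int(np.argmax(q))
```

Because the belief is centred well to the left of the goal, the estimate for moving right comes out highest. The same array-level code could in principle be run on a GPU through an array library with a NumPy-like interface, which is the general idea behind expressing planning steps as tensor operations.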