Unified Generation-Refinement Planning: Bridging Guided Flow Matching and Sampling-Based MPC for Social Navigation
Kazuki Mizuta, Karen Leung
AI summary
Problem
Optimization-based planners struggle with multimodal uncertainty and poor initialization, while learning-based planners lack reliable safety constraint enforcement, making robust real-time navigation in dynamic human environments difficult.
Approach
The method generates diverse, reward-guided trajectory priors using conditional flow matching, refines them in parallel with model predictive path integral control to enforce safety and dynamics, and feeds the optimized trajectory back to warm-start the next generation step.
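The generate, refine, warm-start cycle described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `generate_priors` is a hypothetical placeholder that perturbs the warm start with Gaussian noise (a trained reward-guided CFM model would take its place), the robot is a single integrator with a per-axis velocity limit, the pedestrian is a single static point, and every constant (`H`, `DT`, `V_MAX`, `sigma`, `lam`) is an assumed value.

```python
import numpy as np

H, DT, V_MAX = 20, 0.1, 1.5        # horizon, step [s], per-axis velocity limit (assumed)
GOAL = np.array([5.0, 0.0])
HUMAN = np.array([2.5, 0.2])       # static stand-in for one pedestrian (assumed)

def rollout(controls, start):
    """Single-integrator model: integrate velocity commands into positions."""
    return start + DT * np.cumsum(controls, axis=-2)

def cost(controls, start):
    """Terminal goal distance plus a soft penalty within 1 m of the pedestrian."""
    path = rollout(controls, start)
    goal_term = np.linalg.norm(path[..., -1, :] - GOAL, axis=-1)
    clearance = np.linalg.norm(path - HUMAN, axis=-1)
    return goal_term + 10.0 * np.maximum(0.0, 1.0 - clearance).sum(axis=-1)

def generate_priors(warm_start, n_priors, rng):
    """Placeholder for reward-guided CFM sampling (NOT a learned model):
    candidates are noisy copies of the warm start."""
    noise = rng.normal(scale=0.5, size=(n_priors,) + warm_start.shape)
    return np.clip(warm_start + noise, -V_MAX, V_MAX)

def mppi_refine(prior, start, rng, n_samples=64, sigma=0.3, lam=0.1):
    """One MPPI update around a single prior: sample, weight by cost, average."""
    samples = prior + rng.normal(scale=sigma, size=(n_samples,) + prior.shape)
    samples = np.clip(samples, -V_MAX, V_MAX)        # enforce the velocity limit
    costs = cost(samples, start)
    weights = np.exp(-(costs - costs.min()) / lam)   # shifted for numerical stability
    weights /= weights.sum()
    return np.tensordot(weights, samples, axes=1)    # cost-weighted mean, shape (H, 2)

def plan_loop(n_steps=30, n_priors=8, seed=0):
    rng = np.random.default_rng(seed)
    position = np.zeros(2)
    warm_start = np.zeros((H, 2))
    visited = [position.copy()]
    for _ in range(n_steps):
        priors = generate_priors(warm_start, n_priors, rng)      # generation
        refined = np.stack([mppi_refine(p, position, rng)        # refinement, one
                            for p in priors])                    # update per prior
        best = refined[np.argmin(cost(refined, position))]
        position = position + DT * best[0]                       # execute first command
        warm_start = np.vstack([best[1:], best[-1:]])            # shift plan forward
        visited.append(position.copy())
    return np.array(visited), warm_start

path, plan = plan_loop()
print(path.shape, plan.shape)  # (31, 2) (20, 2)
```

Each prior is refined independently, so distinct candidates are never averaged together. The executed plan is shifted by one step and passed back to the generator, which is the feedback half of the loop.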
Key results
- Bidirectional CFM-MPPI loop preserves multimodal behavior while enforcing safety constraints
- Reward-guided CFM adapts to new objectives at test time without retraining
- Mode-selective MPPI refines distinct trajectory modes in parallel without collapsing them
- Improved safety, task performance, and computation time over standalone baselines in social navigation
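To illustrate why refining modes separately matters, the sketch below compares a single pooled MPPI average with one average per mode. It assumes a toy setting that is not taken from the paper: trajectories pass a pedestrian at the origin on either the left or the right, and the mode label is simply the sign of the lateral offset at the midpoint. The function name `mode_selective_update` and the labeling rule are hypothetical.

```python
import numpy as np

def mode_selective_update(samples, costs, labels, lam=1.0):
    """Return one cost-weighted mean trajectory per mode label."""
    refined = {}
    for mode in np.unique(labels):
        mask = labels == mode
        c = costs[mask]
        w = np.exp(-(c - c.min()) / lam)   # MPPI-style weights within this mode only
        w /= w.sum()
        refined[int(mode)] = np.tensordot(w, samples[mask], axes=1)
    return refined

# Toy samples: 50 paths pass on the left (+y) and 50 on the right (-y).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 9)
bump = np.exp(-x**2)                                # lateral offset peaks at x = 0
side = np.repeat([1.0, -1.0], 50)
amplitude = side * (1.0 + 0.1 * rng.random(100))    # |amplitude| in [1.0, 1.1)
samples = np.stack(
    [np.broadcast_to(x, (100, 9)), amplitude[:, None] * bump], axis=-1
)                                                   # shape (100, 9, 2)
costs = np.abs(amplitude)                           # wider detours cost more
labels = (samples[:, 4, 1] > 0).astype(int)         # 1 = left, 0 = right

# Pooled update: a single weighted mean over all samples, ignoring modes.
w_all = np.exp(-(costs - costs.min()))
w_all /= w_all.sum()
pooled = np.tensordot(w_all, samples, axes=1)

per_mode = mode_selective_update(samples, costs, labels)
print("pooled midpoint y:", pooled[4, 1])           # close to 0: cuts through the pedestrian
print("left midpoint y:  ", per_mode[1][4, 1])      # stays on the left side
print("right midpoint y: ", per_mode[0][4, 1])      # stays on the right side
```

The pooled mean lands between the two passing sides, which is exactly the collapse that per-mode weighting avoids.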
Why it matters
Provides a practical, real-time planning framework for autonomous robots navigating unpredictable human crowds without sacrificing safety or adaptability.
Abstract
Robust robot planning in dynamic, human-centric environments remains challenging due to multimodal uncertainty, the need for real-time adaptation, and safety requirements. Optimization-based planners enable explicit constraint handling but can be sensitive to initialization and struggle in dynamic settings. Learning-based planners capture multimodal solution spaces more naturally, but often lack reliable constraint satisfaction. In this paper, we introduce a unified generation-refinement framework that combines reward-guided conditional flow matching (CFM) with model predictive path integral (MPPI) control. Our key idea is a bidirectional information exchange between generation and optimization: reward-guided CFM produces diverse, informed trajectory priors for MPPI refinement, while the optimized MPPI trajectory warm-starts the next CFM generation step. Using autonomous social navigation as a motivating application, we demonstrate that the proposed approach improves the trade-off between safety, task performance, and computation time, while adapting to dynamic environments in real-time. The source code is publicly available at https://cfm-mppi.github.io.