ICRA 2026

Unified Generation-Refinement Planning: Bridging Guided Flow Matching and Sampling-Based MPC for Social Navigation

Kazuki Mizuta, Karen Leung

AI summary

Bidirectional coupling of reward-guided flow matching and sampling-based MPC yields safer, faster, and more adaptable real-time social navigation than either method alone.

Conditional Flow Matching, Model Predictive Path Integral, Social Navigation, Trajectory Planning, Reward-Guided Generation, Multimodal Planning

Problem

Optimization-based planners struggle with multimodal uncertainty and poor initialization, while learning-based planners lack reliable safety constraint enforcement, making robust real-time navigation in dynamic human environments difficult.

Approach

The method generates diverse, reward-guided trajectory priors using conditional flow matching, refines them in parallel with model predictive path integral control to enforce safety and dynamics, and feeds the optimized trajectory back to warm-start the next generation step.
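The generate, refine, and warm-start cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_priors` is a hypothetical stand-in for the CFM generator (it only adds random lateral offsets to the warm start), the robot is an assumed 2D single integrator, the human is a static disc, and all function names, the cost, and the parameters (`DT`, `sigma`, `lam`) are chosen for illustration. Only the MPPI update itself (exponentially weighted averaging of perturbations) follows the standard formulation.

```python
import math
import random

DT = 0.1  # assumed control time step [s]

def rollout(start, controls):
    """Integrate a 2D single integrator: p_{t+1} = p_t + u_t * DT."""
    x, y = start
    path = []
    for ux, uy in controls:
        x, y = x + ux * DT, y + uy * DT
        path.append((x, y))
    return path

def cost(start, controls, goal, human, radius=0.5):
    """Terminal distance to goal plus a penalty per step inside the human's disc."""
    path = rollout(start, controls)
    penalty = sum(10.0 for p in path if math.dist(p, human) < radius)
    return math.dist(path[-1], goal) + penalty

def sample_priors(warm_start, n_priors, rng, spread=1.0):
    """Hypothetical stand-in for the CFM generator: warm start plus lateral offsets."""
    priors = []
    for _ in range(n_priors):
        dy = rng.uniform(-spread, spread)
        priors.append([(ux, uy + dy) for ux, uy in warm_start])
    return priors

def mppi_refine(start, nominal, goal, human, rng, n_samples=32, sigma=0.3, lam=0.1):
    """One MPPI update: perturb the nominal controls, weight each sample by
    exp(-cost / lam), and shift the nominal by the weighted mean perturbation."""
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in nominal]
        sample = [(ux + ex, uy + ey) for (ux, uy), (ex, ey) in zip(nominal, eps)]
        noises.append(eps)
        costs.append(cost(start, sample, goal, human))
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    refined = []
    for t, (ux, uy) in enumerate(nominal):
        dx = sum(w * eps[t][0] for w, eps in zip(weights, noises)) / total
        dy = sum(w * eps[t][1] for w, eps in zip(weights, noises)) / total
        refined.append((ux + dx, uy + dy))
    return refined

def plan_step(start, warm_start, goal, human, rng, n_priors=4):
    """Generate priors, refine each one separately, pick the cheapest, and
    return it together with a time-shifted copy to warm-start the next step."""
    priors = sample_priors(warm_start, n_priors, rng)
    refined = [mppi_refine(start, u, goal, human, rng) for u in priors]
    best = min(refined, key=lambda u: cost(start, u, goal, human))
    next_warm_start = best[1:] + [best[-1]]  # drop the executed control, repeat the last
    return best, next_warm_start

# Receding-horizon usage: execute the first control, then replan from the warm start.
rng = random.Random(0)
start, goal, human = (0.0, 0.0), (2.0, 0.0), (1.0, 0.0)
warm = [(1.0, 0.0)] * 20
for _ in range(3):
    best, warm = plan_step(start, warm, goal, human, rng)
    start = rollout(start, best[:1])[0]
```

Refining each prior on its own, rather than pooling all samples into one weighted average, is what lets the loop compare distinct candidates before committing to one. In this sketch the feedback path is only the shifted control sequence; how the optimized trajectory conditions the real CFM model is not specified in this summary.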

Key results

Bidirectional CFM-MPPI loop preserves multimodal behavior while enforcing safety constraints
Reward-guided CFM adapts to new objectives at test time without retraining
Mode-selective MPPI refines distinct trajectory modes in parallel without collapsing them
Improved safety, task performance, and computation time over standalone baselines in social navigation
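To see why refining modes separately matters, consider a toy case, sketched below under stated assumptions. Four candidate paths pass a human on either side with equal cost. A single MPPI-style weighted average over all of them blends left and right into a path straight through the human, while averaging within each mode keeps both options collision-free. The mode label used here (the side on which the path passes) and all names are hypothetical; the summary does not describe how the paper assigns modes.

```python
import math

def weighted_average(paths, costs, lam=0.1):
    """MPPI-style average of whole paths with weights exp(-cost / lam)."""
    c_min = min(costs)
    w = [math.exp(-(c - c_min) / lam) for c in costs]
    z = sum(w)
    horizon = len(paths[0])
    return [
        (sum(wk * p[t][0] for wk, p in zip(w, paths)) / z,
         sum(wk * p[t][1] for wk, p in zip(w, paths)) / z)
        for t in range(horizon)
    ]

def min_clearance(path, human):
    """Smallest distance between any waypoint and the human."""
    return min(math.dist(p, human) for p in path)

def arc(side, offset, n=11):
    """Path from (0, 0) to (2, 0) bulging to one side (+1 = left, -1 = right)."""
    return [
        (2.0 * t / (n - 1), side * offset * math.sin(math.pi * t / (n - 1)))
        for t in range(n)
    ]

human = (1.0, 0.0)
paths = [arc(+1, 0.9), arc(+1, 1.1), arc(-1, 0.9), arc(-1, 1.1)]
costs = [1.0, 1.0, 1.0, 1.0]  # assumed: both sides are equally good

# Single average over all samples: the two sides cancel out.
blended = weighted_average(paths, costs)

# Per-mode average: group by passing side (hypothetical mode label) first.
modes = {}
for p, c in zip(paths, costs):
    key = 1 if p[len(p) // 2][1] > 0 else -1
    modes.setdefault(key, []).append((p, c))
per_mode = {
    k: weighted_average([p for p, _ in v], [c for _, c in v])
    for k, v in modes.items()
}

print("blended clearance:", min_clearance(blended, human))
for k, path in per_mode.items():
    print("mode", k, "clearance:", min_clearance(path, human))
```

The blended path has essentially zero clearance because its midpoint lands on the human, whereas each per-mode path keeps a clear margin.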

Why it matters

Provides a practical, real-time planning framework for autonomous robots navigating unpredictable human crowds without sacrificing safety or adaptability.

Abstract

Robust robot planning in dynamic, human-centric environments remains challenging due to multimodal uncertainty, the need for real-time adaptation, and safety requirements. Optimization-based planners enable explicit constraint handling but can be sensitive to initialization and struggle in dynamic settings. Learning-based planners capture multimodal solution spaces more naturally, but often lack reliable constraint satisfaction. In this paper, we introduce a unified generation-refinement framework that combines reward-guided conditional flow matching (CFM) with model predictive path integral (MPPI) control. Our key idea is a bidirectional information exchange between generation and optimization: reward-guided CFM produces diverse, informed trajectory priors for MPPI refinement, while the optimized MPPI trajectory warm-starts the next CFM generation step. Using autonomous social navigation as a motivating application, we demonstrate that the proposed approach improves the trade-off between safety, task performance, and computation time, while adapting to dynamic environments in real time. The source code is publicly available at https://cfm-mppi.github.io.

Index terms

Human-Aware Motion Planning, Machine Learning for Robot Control, AI-Enabled Robotics