Crowd-FM: Learned Optimal Selection of Conditional Flow Matching-Generated Trajectories for Crowd Navigation
Antareep Singha, Laksh Nanwani, Mathai Mathew Pulicken, Samkit Jain, Phani Teja Singamaneni, Arun Kumar Singh, Madhava Krishna
AI summary
Problem
Mobile robots struggle to plan safely and efficiently in dense, unstructured crowds while mimicking human-like motion for better social acceptance. Existing classical and learning-based planners often fail to balance computational efficiency, robustness, and human-likeness in highly dynamic environments.
Approach
The method uses a Conditional Flow Matching model to rapidly generate a diverse batch of collision-free trajectory primitives from 2D sensor data, then applies a learned Transformer-based scoring function to select the most human-like option, followed by kinodynamic refinement.
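The sample-then-select step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_velocity_field` and `toy_score` are hypothetical stand-ins for the trained CFM network and the learned Transformer scorer, the conditioning input is reduced to a 2D goal rather than LiDAR data, and the kinodynamic refinement stage is omitted.

```python
import numpy as np


def toy_velocity_field(x, t, goal):
    """Hypothetical stand-in for a trained CFM velocity network.

    x has shape (batch, horizon, 2). The field pulls every waypoint toward
    a straight line from the origin to `goal`. The time argument `t` is
    unused in this toy field; a learned network would normally consume it.
    """
    horizon = x.shape[1]
    ramp = np.linspace(0.0, 1.0, horizon)[None, :, None]
    straight_line = ramp * goal[None, None, :]
    return straight_line - x


def sample_trajectories(velocity_field, goal, batch=16, horizon=20, steps=10, seed=0):
    """Draw Gaussian noise and integrate the velocity field with Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, horizon, 2))
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity_field(x, k * dt, goal)
    return x


def toy_score(trajs):
    """Hypothetical stand-in for the learned scorer: smoother is better.

    Returns the negative sum of squared second differences per trajectory.
    """
    accel = np.diff(trajs, n=2, axis=1)
    return -np.sum(accel ** 2, axis=(1, 2))


def select_best(trajs, scores):
    """Pick the candidate with the highest score."""
    best = int(np.argmax(scores))
    return best, trajs[best]


goal = np.array([5.0, 0.0])
candidates = sample_trajectories(toy_velocity_field, goal)
scores = toy_score(candidates)
best_index, best_traj = select_best(candidates, scores)
print(candidates.shape, best_index, best_traj.shape)
```

Because every candidate is produced in one batched integration and selection is a single argmax, the per-query cost is dominated by the number of integration steps, which is one plausible reading of why the approach is described as rapid.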
Key results
- The CFM policy alone achieves higher collision-free success rates than existing learning-based baselines
- With inference-time refinement, the approach can outperform even expensive optimization-based planners
- Learned scoring function selects trajectories closer to human expert demonstrations than hand-crafted costs
- Efficient 2D LiDAR conditioning enables real-time deployment on resource-constrained platforms
Why it matters
Enables real-world deployment of socially acceptable, safe mobile robots in complex human environments by unifying robust trajectory generation with human-like behavior selection.
Abstract
Safe and computationally efficient local planning for mobile robots in dense, unstructured human crowds remains a fundamental challenge. Moreover, ensuring that robot trajectories resemble how a human moves will increase the acceptance of the robot in human environments. In this paper, we present Crowd-FM, a learning-based approach that addresses both the safety and the human-likeness challenges. Our approach has two novel components. First, we train a Conditional Flow-Matching (CFM) policy over a dataset of optimally controlled trajectories to learn a set of collision-free primitives that a robot can choose from in any given scenario. The chosen optimal control solver can generate multi-modal collision-free trajectories, allowing the CFM policy to learn a diverse set of maneuvers. Second, we learn a score function over a dataset of human demonstration trajectories that provides a human-likeness score for the flow primitives. At inference time, computing the optimal trajectory reduces to selecting the primitive with the highest score. We improve on the state of the art by showing that our CFM policy alone can produce collision-free navigation with a higher success rate than existing learning-based baselines. Furthermore, when augmented with inference-time refinement, our approach can outperform even expensive optimisation-based planning approaches. Finally, we validate that our scoring network can select trajectories closer to the expert data than a manually designed cost function.

* Equal contribution.
1 Robotics Research Center, IIIT Hyderabad, India. {lakshanshul, mathewp8616}@gmail.com, {samkit.jain@students, mkrishna}@iiit.ac.in
2 Nanyang Technological University, Singapore. antareep002@e.ntu.edu.sg
3 University of Tartu, Estonia. aks1812@gmail.com
4 Inria, Université de Lorraine, France. phaniteja.sp@gmail.com
Project Page: https://smart-wheelchair-rrc.github.io/crowdfm-webpage/
We acknowledge IHub-Data (Project: M2-029) for funding this work. It was also co-funded by the European Union and Estonian Research Council via Project: TEM-TA101 and Grant: PSG753 by Estonian Research Council.