Multimodal Fusion-Guided Diffusion Policy for Motion Planning in Rugged and Obstacle-Dense Environments
Haoyu Xi, Wei Li, Yu Hu
AI summary
Problem
Motion planning in unstructured, rugged environments is hindered by fragmented free space, high computational costs of traditional mapping, and the lack of dynamic constraint awareness in existing learning-based methods.
Approach
The framework integrates camera and LiDAR data at the feature level to guide a diffusion policy that generates candidate trajectories, then selects the optimal path using a scoring module that evaluates semantic, geometric, and robot dynamic constraints.
Key results
- Introduced a multimodal early-fusion mechanism for joint semantic and geometric environment understanding.
- Developed a dynamics-aware trajectory determination module to ensure execution safety and reduce tracking failures.
- Deployed the framework on a mobile robot and released a self-collected dataset for rugged environments.
- Demonstrated lower collision rates and higher success rates than baseline methods in real-world experiments.
Why it matters
Enables reliable, real-time autonomous navigation for mobile robots in safety-critical, unstructured terrains like search-and-rescue or planetary exploration.
Abstract
Motion planning in unstructured environments remains challenging, particularly in scenarios that combine dense obstacles with rugged terrain and that demand both safety and real-time performance. To address these challenges, this paper proposes a multimodal fusion-guided diffusion policy framework, abbreviated as M-DP, which is jointly guided by camera images, LiDAR data, and goal targets. An early-fusion mechanism is designed to combine images and LiDAR points, enabling simultaneous semantic and geometric understanding of the environment to guide the diffusion policy in generating obstacle-aware candidate trajectories. Within the diffusion policy module, the Denoising Diffusion Implicit Model (DDIM) is employed to improve real-time performance. Furthermore, a trajectory determination module is proposed that incorporates not only environmental semantics and terrain geometry but also robot dynamic constraints. This ensures that the selected trajectory is dynamically feasible, significantly reducing the risk of tracking failures and enhancing safety in challenging conditions. The selected path effectively balances safety, goal-reaching ability, and bumpiness. Real-world experimental evaluations demonstrate the safety and effectiveness of the framework compared to baseline methods, with ablation studies validating the contributions of key components. The code and our self-collected dataset are available at https://github.com/xhy1599/M-DP.
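To illustrate why DDIM helps with real-time performance, the following is a minimal sketch of the standard deterministic DDIM update (the eta = 0 case), which lets a sampler visit only a short subsequence of the training timesteps. It is not taken from the M-DP code: the names `ddim_step`, `ddim_sample`, and `eps_model`, the noise schedule, the step count, and the trajectory tensor shape are all assumptions made for illustration, and the conditioning on fused camera/LiDAR features is omitted.

```python
import numpy as np


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    alpha_bar_* are cumulative products of (1 - beta) from the noise schedule.
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise that estimate to the earlier (less noisy) timestep.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred


def ddim_sample(eps_model, shape, alpha_bars, num_steps, rng):
    """Sample with num_steps denoising steps instead of len(alpha_bars).

    eps_model(x, t) is a hypothetical noise predictor; in a conditioned
    policy it would also receive the observation features.
    """
    total = len(alpha_bars)
    # Evenly spaced subsequence of timesteps, from most to least noisy.
    timesteps = np.linspace(total - 1, 0, num_steps).round().astype(int)
    x = rng.standard_normal(shape)
    for i, t in enumerate(timesteps):
        a_t = alpha_bars[t]
        # After the last step the sample is treated as noise-free.
        a_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        x = ddim_step(x, eps_model(x, t), a_t, a_prev)
    return x


if __name__ == "__main__":
    # Assumed linear beta schedule with 1000 training steps.
    alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
    # Placeholder predictor that always returns zero noise.
    dummy_model = lambda x, t: np.zeros_like(x)
    # Assumed shape: 4 candidate trajectories, 8 waypoints, (x, y) each.
    candidates = ddim_sample(dummy_model, (4, 8, 2), alpha_bars, 10,
                             np.random.default_rng(0))
    print(candidates.shape)
```

With a schedule of 1000 training steps, this sketch calls the noise predictor only 10 times per sample, which is the kind of saving that makes diffusion sampling practical in a planning loop. The candidate trajectories it produces would then be passed to a scoring stage such as the trajectory determination module described above.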