ICRA 2026

MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control

Basant Sharma, Prajyot Jadhav, Pranjal Paul, Madhava Krishna, Arun Kumar Singh

AI summary

Co-training a probabilistic collision model with a risk metric enables safe, high-speed monocular navigation in cluttered environments where noisy depth estimates typically fail.

monocular navigation, collision avoidance, risk-aware MPC, probabilistic planning, vision foundation models, cluttered environments

Problem

Monocular navigation in unknown, cluttered environments lacks the reliable depth needed for collision checking. Existing methods build collision maps from estimated depth, but depth estimates from vision foundation models are too noisy for zero-shot navigation.

Approach

The method uses noisy depth estimates as context for a learned collision model that predicts obstacle clearance distributions. These predictions guide a risk-aware MPC planner, with the model and risk metric co-trained on safe and unsafe trajectories to ensure calibrated uncertainty.
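
The planning step described above can be illustrated with a minimal sketch. It assumes a sampling-style planner that scores a few candidate control sequences, and it uses a simple hinge-style risk (the mean violation of a safety margin) as a stand-in; the paper's actual risk metric, planner, and learned model are not specified in this summary. All names below (`collision_risk`, `select_controls`, `toy_sample_clearance`, `SAFETY_MARGIN`) are hypothetical, and the toy sampler merely stands in for the learned collision model.

```python
import random

SAFETY_MARGIN = 0.3  # metres; hypothetical value chosen for illustration


def collision_risk(clearance_samples, margin=SAFETY_MARGIN):
    """Stand-in risk: mean amount by which sampled clearance falls below the margin."""
    violations = [max(0.0, margin - d) for d in clearance_samples]
    return sum(violations) / len(violations)


def select_controls(candidates, sample_clearance, goal_cost,
                    risk_weight=10.0, n_samples=64):
    """Return the candidate control sequence with the lowest goal cost plus weighted risk."""
    best, best_cost = None, float("inf")
    for controls in candidates:
        # Draw minimum-clearance samples from the (here: toy) collision model.
        samples = [sample_clearance(controls) for _ in range(n_samples)]
        cost = goal_cost(controls) + risk_weight * collision_risk(samples)
        if cost < best_cost:
            best, best_cost = controls, cost
    return best


_rng = random.Random(0)  # fixed seed so the toy example is repeatable


def toy_sample_clearance(controls):
    """Toy stand-in for the learned model: steering away increases clearance."""
    mean_steer = sum(abs(u) for u in controls) / len(controls)
    return _rng.gauss(0.2 + mean_steer, 0.05)


def toy_goal_cost(controls):
    """Toy goal term: steering away from the straight path costs progress."""
    return 0.1 * sum(abs(u) for u in controls) / len(controls)


# Three constant-steering candidates over a five-step horizon.
candidates = [(0.0,) * 5, (0.5,) * 5, (1.0,) * 5]
best = select_controls(candidates, toy_sample_clearance, toy_goal_cost)
```

In this toy setup, driving straight keeps the predicted clearance below the margin, so the planner prefers the mildest steering sequence that clears it. The sketch only shows how a predicted clearance distribution, rather than a single depth value, can drive the choice of controls.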

Key results

Joint training pipeline ensures well-calibrated uncertainty in the collision model
Significant reduction in collision rates compared to ROSNAV and MonoNav baselines
Improved goal-reaching success and navigation speed in highly cluttered real-world settings
Statistically consistent collision predictions validated against ground-truth obstacle clearance
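
One common way to check the kind of statistical consistency mentioned in the last result is an interval coverage test: if the predicted clearance distributions are well calibrated, roughly 90% of ground-truth clearances should fall inside the predicted 90% intervals. The sketch below assumes Gaussian predictions given as (mean, standard deviation) pairs, which is an assumption made for illustration and not a detail confirmed by this summary. The function name `empirical_coverage` is hypothetical.

```python
from statistics import NormalDist


def empirical_coverage(predictions, ground_truth, level=0.9):
    """Fraction of ground-truth clearances inside the central `level` interval.

    predictions: list of (mean, std) pairs, assumed Gaussian (illustrative assumption).
    ground_truth: list of measured minimum clearances, same length.
    """
    # Half-width of the central interval in units of standard deviation.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    hits = 0
    for (mean, std), clearance in zip(predictions, ground_truth):
        if abs(clearance - mean) <= z * std:
            hits += 1
    return hits / len(ground_truth)


# Toy data: four predictions of 1.0 m +/- 0.1 m against measured clearances.
predictions = [(1.0, 0.1)] * 4
ground_truth = [1.0, 1.05, 0.9, 2.0]
coverage = empirical_coverage(predictions, ground_truth, level=0.9)
```

Coverage far below the nominal level suggests overconfident predictions, and coverage far above it suggests overly wide ones.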

Why it matters

Enables safe, lightweight vision-only navigation for resource-constrained robots in complex environments without requiring LiDAR or depth sensors.

Abstract

Navigating unknown environments with a single RGB camera is challenging, as the lack of depth information prevents reliable collision checking. While some methods use estimated depth to build collision maps, we found that depth estimates from vision foundation models are too noisy for zero-shot navigation in cluttered environments. We propose an alternative approach: instead of using noisy estimated depth for direct collision checking, we use it as a rich context input to a learned collision model. This model predicts the distribution of minimum obstacle clearance that the robot can expect for a given control sequence. At inference, these predictions inform a risk-aware MPC planner that minimizes estimated collision risk. We propose a joint learning pipeline that co-trains the collision model and risk metric using both safe and unsafe trajectories. Crucially, our joint training ensures well-calibrated uncertainty in our collision model, which improves navigation in highly cluttered environments. Consequently, real-world experiments show reductions in collision rate and improvements in goal reaching and speed over several strong baselines.

Index terms

Planning under Uncertainty, Collision Avoidance, Vision-Based Navigation