ProbeMDE: Uncertainty-Guided Active Proprioception for Monocular Depth Estimation in Surgical Robotics
Britton Jordan, Jordan Thompson, Jesse F. d'Almeida, Hao Li, Nithesh Kumar, Susheela Sharma Stern, James Ferguson, Ipek Oguz, Robert James Webster III, Daniel Brown, Alan Kuntz
AI summary
Problem
Monocular depth estimation is often uncertain and inaccurate in surgical scenes, where textureless surfaces, specular reflections, and occlusions are common. Sparse depth measurements gathered by robot touch can help, but each touch is costly, so the probe locations must be chosen carefully.
Approach
The framework trains an ensemble of depth models to quantify predictive uncertainty, then uses Stein Variational Gradient Descent on the uncertainty gradient to optimally select diverse, informative points for the robot to physically probe.
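As a rough illustration of the uncertainty step, the sketch below computes a per-pixel mean and variance from a stack of ensemble predictions. The function name `ensemble_uncertainty`, the array shapes, and the toy depth values are assumptions made for this example; in the actual framework each prediction would come from a depth model conditioned on the RGB image and the sparse touched depths.

```python
import numpy as np

def ensemble_uncertainty(depth_preds):
    """Per-pixel mean and variance across ensemble members.

    depth_preds: array of shape (M, H, W), one dense depth map per member
    (hypothetical layout chosen for this sketch).
    """
    depth_preds = np.asarray(depth_preds, dtype=float)
    mean_depth = depth_preds.mean(axis=0)   # ensemble depth estimate, (H, W)
    variance = depth_preds.var(axis=0)      # disagreement between members, (H, W)
    return mean_depth, variance

# Toy ensemble: five members agree everywhere except at pixel (2, 3).
preds = np.full((5, 4, 6), 50.0)
preds[:, 2, 3] += np.array([-10.0, -5.0, 0.0, 5.0, 10.0])

mean_depth, variance = ensemble_uncertainty(preds)
most_uncertain = np.unravel_index(np.argmax(variance), variance.shape)
print("most uncertain pixel:", most_uncertain)
```

Picking only the single most uncertain pixel, as the last line does, is the greedy baseline; the approach described here instead selects a diverse set of locations.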
Key results
- Outperforms baselines across standard depth metrics in simulation
- Achieves higher accuracy with fewer measurements in physical phantom trials
- First application of proprioception to improve endoscopic monocular depth estimation
- Prevents mode collapse during active sensing by applying SVGD over the uncertainty gradient map
Why it matters
Provides a cost-effective, uncertainty-driven sensing strategy that enhances depth perception reliability for minimally invasive surgical robots navigating complex anatomical structures.
Abstract
Monocular depth estimation (MDE) provides a useful tool for robotic perception, but its predictions are often uncertain and inaccurate in challenging environments such as surgical scenes where textureless surfaces, specular reflections, and occlusions are common. To address this, we propose ProbeMDE, a cost-aware active sensing framework that combines RGB images with sparse proprioceptive measurements for MDE. Our approach utilizes an ensemble of MDE models to predict dense depth maps conditioned on both RGB images and a sparse set of known depth measurements obtained via proprioception, where the robot has touched the environment in a known configuration. We quantify predictive uncertainty via the ensemble’s variance and measure the gradient of the uncertainty with respect to candidate measurement locations. To prevent mode collapse while selecting maximally informative locations to propriocept (touch), we leverage Stein Variational Gradient Descent (SVGD) over this gradient map. We validate our method in both simulated and physical experiments on central airway obstruction surgical phantoms. Our results demonstrate that our approach outperforms baseline methods across standard depth estimation metrics, achieving higher accuracy while minimizing the number of required proprioceptive measurements. The project website is brittonjordan.github.io/probe mde
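To make the role of SVGD concrete, here is a minimal sketch of how a set of candidate touch locations could be moved toward uncertain regions while being kept apart. It is not the authors' implementation. It assumes a toy analytic uncertainty field made of two Gaussian bumps (standing in for the ensemble variance), treats that field as an unnormalized target density, and uses an RBF kernel with a fixed bandwidth. All names (`grad_log_uncertainty`, `svgd_step`, `min_pairwise_distance`) and all constants are hypothetical. Plain gradient ascent on the same field is run alongside for comparison.

```python
import numpy as np

def grad_log_uncertainty(x, centers, sigma):
    """Gradient of log u(x) for a toy uncertainty field u built from Gaussian bumps."""
    diff = x[:, None, :] - centers[None, :, :]              # (n, K, 2)
    logits = -np.sum(diff**2, axis=-1) / (2.0 * sigma**2)   # (n, K)
    logits -= logits.max(axis=1, keepdims=True)             # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)                       # per-bump responsibilities
    return -(w[:, :, None] * diff).sum(axis=1) / sigma**2   # (n, 2)

def svgd_step(x, grad_logp, step, bandwidth):
    """One SVGD update with RBF kernel k(a, b) = exp(-||a - b||^2 / bandwidth)."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]                    # diff[i, j] = x_i - x_j
    k = np.exp(-np.sum(diff**2, axis=-1) / bandwidth)       # (n, n), symmetric
    attract = k @ grad_logp                                 # kernel-smoothed pull toward high uncertainty
    repel = (2.0 / bandwidth) * np.sum(k[:, :, None] * diff, axis=1)  # pushes points apart
    return x + step * (attract + repel) / n

def min_pairwise_distance(x):
    """Smallest distance between any two distinct points."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return d[np.triu_indices(len(x), k=1)].min()

# Toy setup: two uncertain regions in an 80 x 60 pixel image (assumed values).
centers = np.array([[20.0, 20.0], [60.0, 40.0]])
sigma = 5.0
rng = np.random.default_rng(0)
init = rng.uniform([0.0, 0.0], [80.0, 60.0], size=(20, 2))

x_svgd, x_ascent = init.copy(), init.copy()
for _ in range(2000):
    g = grad_log_uncertainty(x_svgd, centers, sigma)
    x_svgd = svgd_step(x_svgd, g, step=2.0, bandwidth=2.0 * sigma**2)
    # Baseline: every point independently climbs the uncertainty gradient.
    x_ascent = x_ascent + 2.0 * grad_log_uncertainty(x_ascent, centers, sigma)

print("min spacing, SVGD:           ", min_pairwise_distance(x_svgd))
print("min spacing, gradient ascent:", min_pairwise_distance(x_ascent))
```

The repulsive kernel term is what separates the two strategies: independent gradient ascent drives every point onto a peak of the field, so several proposed touches coincide, whereas the SVGD particles settle at distinct locations around the uncertain regions.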