Transformation-Domain Gaussian Smoothing for Translational Direct Visual Servoing
Amneh Nasir, Djemaa Kachi, Antoine N. André, Guillaume Caron
AI summary
Problem
Direct visual servoing relies on a highly nonconvex photometric cost function that creates many local minima, severely limiting its convergence domain when the camera starts far from the target pose.
Approach
The authors adapt a Gaussian homotopy framework to smooth the sum-of-squared-differences cost in the transformation parameter space, deriving a spatially varying, motion-adaptive kernel for 3-DoF translation and integrating it into a Gauss-Newton control law.
Key results
- Closed-form derivation of a motion-adaptive transformation kernel and interaction matrix for 3-DoF translation
- Adaptation of Gaussian cost smoothing from cross-correlation to the SSD objective
- Experimental validation on a 3-DoF UR5 arm showing wider convergence basins than Photometric Gaussian Mixtures
- Graduated smoothing suppresses spurious local minima while preserving gradient structure for accurate convergence
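The graduated smoothing idea in the last point can be sketched on a one-dimensional toy cost. This is not the paper's 3-DoF kernel or control law. It assumes a hypothetical cost f(x) = x^2 - A cos(OMEGA x), chosen because its Gaussian-smoothed version has a closed form: smoothing with standard deviation sigma turns x^2 into x^2 + sigma^2 and scales the cosine ripple by exp(-OMEGA^2 sigma^2 / 2). The names A, OMEGA, the step size, and the sigma schedule are all illustrative assumptions.

```python
import math

# Hypothetical ripple amplitude and frequency for the toy cost
# f(x) = x**2 - A * cos(OMEGA * x), whose global minimum is at x = 0.
A, OMEGA = 2.0, 5.0


def smoothed_grad(x, sigma):
    """Gradient of the Gaussian-smoothed toy cost (sigma = 0 gives the raw cost)."""
    # Smoothing damps the cosine ripple by exp(-(OMEGA * sigma)**2 / 2);
    # the quadratic term only gains a constant offset, so its gradient is unchanged.
    damping = math.exp(-0.5 * (OMEGA * sigma) ** 2)
    return 2.0 * x + A * OMEGA * damping * math.sin(OMEGA * x)


def descend(x, sigma, step=0.01, iters=500):
    """Plain gradient descent on the cost smoothed at a fixed sigma."""
    for _ in range(iters):
        x -= step * smoothed_grad(x, sigma)
    return x


x_start = 3.0

# Descent on the raw cost stalls in the nearest ripple.
x_plain = descend(x_start, sigma=0.0, iters=2000)

# Graduated schedule: solve the heavily smoothed problem first, then
# reuse each solution as the starting point for the next, sharper one.
x_graduated = x_start
for sigma in (1.0, 0.5, 0.25, 0.1, 0.0):
    x_graduated = descend(x_graduated, sigma)

print("raw descent ends at      ", x_plain)
print("graduated descent ends at", x_graduated)
```

In this sketch the final stage uses sigma = 0, so the last iterations run on the unsmoothed cost, which is how a graduated scheme can keep the accuracy of the original objective at convergence.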
Why it matters
Enables more robust and reliable robot motion control from larger initial pose errors, reducing the need for precise initial alignment in direct visual servoing applications.
Abstract
Direct visual servoing (DVS) uses raw pixel intensities to control robot motion, yielding high accuracy at convergence. However, the associated photometric cost function is highly nonconvex, which leads to a narrow domain of convergence due to local minima. This work addresses that issue by adapting a Gaussian homotopy framework for cost function smoothing from cross-correlation to the sum of squared differences (SSD) objective used in DVS. The result is a spatially varying, transformation-domain kernel that depends on the motion model, producing smoother cost landscapes and enlarging the convergence basin. We first apply the smoothing to an SSD cost, derive its corresponding transformation kernel for the motion model in the camera domain, and then incorporate it into a DVS control law. The method is compared against uniform image domain blurring via Photometric Gaussian Mixtures. Experiments with an eye-in-hand robotic arm setup over three degrees of freedom translation and with different initial poses show that cost smoothing significantly increases the convergence domain while preserving the accuracy of DVS.
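As a rough illustration of what smoothing an SSD cost in the transformation domain means, the sketch below evaluates the SSD between a 1D signal and its circularly shifted copies, then blurs the resulting cost curve along the shift parameter rather than blurring the signal itself. It is a brute-force stand-in under stated assumptions, not the closed-form kernel derived in the paper: the synthetic signal, the circular shift model, the value of sigma, and the helper names ssd_over_shifts, smooth_circular and count_local_minima are all hypothetical.

```python
import numpy as np


def ssd_over_shifts(signal):
    """SSD between the reference signal and each circularly shifted copy."""
    n = signal.size
    return np.array([np.sum((np.roll(signal, t) - signal) ** 2) for t in range(n)])


def smooth_circular(cost, sigma):
    """Gaussian-weighted average of the cost over neighbouring shifts (circular)."""
    radius = int(4 * sigma)
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights /= weights.sum()
    return sum(w * np.roll(cost, int(k)) for w, k in zip(weights, offsets))


def count_local_minima(cost):
    """Number of samples strictly below both circular neighbours."""
    return int(np.sum((cost < np.roll(cost, 1)) & (cost < np.roll(cost, -1))))


# Synthetic "image row": one coarse structure plus a fine texture that
# makes the SSD cost ripple as a function of the shift.
n = 256
x = np.arange(n)
signal = np.cos(2 * np.pi * x / n) + 0.5 * np.cos(2 * np.pi * 12 * x / n)

raw_cost = ssd_over_shifts(signal)
smoothed_cost = smooth_circular(raw_cost, sigma=10.0)

n_raw = count_local_minima(raw_cost)
n_smooth = count_local_minima(smoothed_cost)
best_shift = int(np.argmin(smoothed_cost))

print("local minima in raw SSD cost:     ", n_raw)
print("local minima in smoothed SSD cost:", n_smooth)
print("minimiser of smoothed cost:       ", best_shift)
```

Blurring along the shift axis removes the ripple minima caused by the fine texture while leaving the minimiser at zero shift. Evaluating the cost at every shift is only feasible in a toy setting like this one, which is one reason a closed-form kernel is attractive.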