ICRA 2026

Transformation-Domain Gaussian Smoothing for Translational Direct Visual Servoing

Amneh Nasir, Djemaa Kachi, Antoine N. André, Guillaume Caron

AI summary

Transformation-domain Gaussian smoothing significantly enlarges the convergence basin of direct visual servoing without sacrificing final positioning accuracy.

Keywords: direct visual servoing, cost function smoothing, Gaussian homotopy, robotic vision, transformation-domain kernel, convergence basin

Problem

Direct visual servoing relies on a highly nonconvex photometric cost function that creates many local minima, severely limiting its convergence domain when the camera starts far from the target pose.

Approach

The authors adapt a Gaussian homotopy framework to smooth the sum-of-squared-differences cost in the transformation parameter space, deriving a spatially varying, motion-adaptive kernel for 3-DoF translation and integrating it into a Gauss-Newton control law.
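The idea of smoothing in the transformation parameter space can be written down generically as follows. This is a sketch in assumed notation, not the paper's own derivation: the symbols I, I^*, t, tau, and g_sigma are introduced here for illustration, and the isotropic Gaussian over the three translation parameters is an assumption.

```latex
% Assumed notation (not taken from the paper):
%   I(x; t)  : image observed at camera translation t in R^3
%   I^*(x)   : desired image
%   g_sigma  : isotropic Gaussian over the translation parameters
\begin{align*}
C(\mathbf{t}) &= \sum_{\mathbf{x}} \bigl( I(\mathbf{x};\mathbf{t}) - I^{*}(\mathbf{x}) \bigr)^{2}
  && \text{(SSD cost)} \\
C_{\sigma}(\mathbf{t}) &= \int_{\mathbb{R}^{3}} g_{\sigma}(\boldsymbol{\tau})\,
  C(\mathbf{t}+\boldsymbol{\tau})\, d\boldsymbol{\tau}
  && \text{(cost smoothed over translations)} \\
g_{\sigma}(\boldsymbol{\tau}) &= \frac{1}{(2\pi\sigma^{2})^{3/2}}
  \exp\!\Bigl( -\frac{\lVert \boldsymbol{\tau} \rVert^{2}}{2\sigma^{2}} \Bigr) \\
\sigma_{0} &> \sigma_{1} > \dots > \sigma_{K}, \qquad
  C_{\sigma}(\mathbf{t}) \to C(\mathbf{t}) \ \text{as}\ \sigma \to 0
  && \text{(homotopy schedule)}
\end{align*}
```

Under this reading, the control law minimizes the smoothed cost at each level of the schedule, and the smoothing vanishes at the last level so that the final pose is determined by the unsmoothed SSD cost.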

Key results

Closed-form derivation of a motion-adaptive transformation kernel and interaction matrix for 3-DoF translation
Adaptation of Gaussian cost smoothing from cross-correlation to the SSD objective
Experimental validation on a UR5 arm restricted to 3-DoF translation, showing wider convergence basins than Photometric Gaussian Mixtures
Graduated smoothing suppresses spurious local minima while preserving gradient structure for accurate convergence

Why it matters

Enables more robust and reliable robot motion control from larger initial pose errors, reducing the need for precise initial alignment in direct visual servoing applications.

Abstract

Direct visual servoing (DVS) uses raw pixel intensities to control robot motion, yielding high accuracy at convergence. However, the associated photometric cost function is highly nonconvex, which leads to a narrow domain of convergence due to local minima. This work addresses that issue by adapting a Gaussian homotopy framework for cost function smoothing from cross-correlation to the sum of squared differences (SSD) objective used in DVS. The result is a spatially varying, transformation-domain kernel that depends on the motion model, producing smoother cost landscapes and enlarging the convergence basin. We first apply the smoothing to an SSD cost, derive its corresponding transformation kernel for the motion model in the camera domain, and then incorporate it into a DVS control law. The method is compared against uniform image domain blurring via Photometric Gaussian Mixtures. Experiments with an eye-in-hand robotic arm setup over three degrees of freedom translation and with different initial poses show that cost smoothing significantly increases the convergence domain while preserving the accuracy of DVS.
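The convergence-basin effect described in the abstract can be illustrated on a toy one-dimensional problem. The sketch below is not the authors' method: it brute-force convolves sampled SSD values with a Gaussian over a single shift parameter, whereas the paper derives a closed-form kernel for 3-DoF translation. The function names (`ssd_cost`, `smooth_cost`, `local_descent`, `graduated_descent`), the synthetic two-frequency signal, the circular shifts, and the sigma schedule are all hypothetical choices made for illustration.

```python
import numpy as np

N = 128
x = np.arange(N)
# Hypothetical "desired image": a coarse component plus a fine texture.
# The fine texture is what creates spurious local minima in the SSD cost.
desired = np.cos(2 * np.pi * x / N) + np.cos(2 * np.pi * 8 * x / N)


def ssd_cost(shift):
    """SSD between the desired signal and a circularly shifted copy of it."""
    current = np.roll(desired, shift)
    return float(np.sum((current - desired) ** 2))


# Sample the raw cost over every integer shift (the transformation domain).
raw_cost = np.array([ssd_cost(s) for s in range(N)])


def smooth_cost(cost, sigma):
    """Circularly convolve the sampled cost with a Gaussian over the shift."""
    if sigma <= 0:
        return cost.copy()
    dist = np.minimum(x, N - x)  # circular distance to shift 0
    kernel = np.exp(-0.5 * (dist / sigma) ** 2)
    kernel /= kernel.sum()
    return np.real(np.fft.ifft(np.fft.fft(cost) * np.fft.fft(kernel)))


def local_descent(cost, start):
    """Step to the lower neighbouring shift until neither neighbour is lower."""
    s = start
    while True:
        left, right = cost[(s - 1) % N], cost[(s + 1) % N]
        if min(left, right) >= cost[s]:
            return s
        s = (s - 1) % N if left < right else (s + 1) % N


def graduated_descent(start, sigmas=(8.0, 4.0, 2.0, 1.0, 0.0)):
    """Descend on progressively less-smoothed costs, ending on the raw cost."""
    s = start
    for sigma in sigmas:
        s = local_descent(smooth_cost(raw_cost, sigma), s)
    return s


stuck = local_descent(raw_cost, 50)   # plain descent on the raw SSD cost
reached = graduated_descent(50)       # coarse-to-fine descent
print("raw descent stops at shift", stuck)
print("graduated descent stops at shift", reached)
```

In this toy setting, descent on the raw cost stalls in a texture-induced local minimum at a nonzero shift, while the graduated schedule reaches the true alignment at shift 0. Because the last level uses sigma = 0, the final answer is determined by the unsmoothed cost, which mirrors the abstract's claim that smoothing enlarges the convergence domain while preserving accuracy.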

Index terms

Visual Servoing; Optimization and Optimal Control