IROS 2024

Depth Completion Using Galerkin Attention

Yinuo Xu, Xuesong Zhang


Abstract

Current depth completion methods usually employ a pair of calibrated RGB and depth sensors to reconstruct a dense depth map. Although the RGB (dense) and depth (sparse) measurements are collected from the same underlying scene, they reflect different physical characteristics, and it remains unclear how a devised RGB guidance scheme can effectively lead to a faithful depth recovery. Unlike existing 3D geometry representations, such as point clouds, voxels, or meshes, we propose to define 3D scenes as vector-valued functions, f : Ω ∋ (u, v) ↦ (r, g, b, d) ∈ ℝ⁴, mapping from the image plane Ω to RGBD vectors. This scene function representation brings two benefits: 1) it allows the Galerkin method to be adapted to explore the nodal basis of the scene function space, and 2) it transforms the irregularly scattered (X, Y, Z) points in Euclidean space into a depth function defined over the regular grid in the image plane. We further leverage these two benefits within a deep neural network, characterized by an efficient Galerkin attention-based RGBD function embedding that explores the interaction of color and depth information, and by the use of equivariant convolution operations on the RGBD feature map as efficient basic blocks. Experiments show that the proposed method achieves a significant performance improvement over state-of-the-art methods. Code at https://github.com/ZXS-Labs/DCGA.
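The second benefit above, moving from scattered (X, Y, Z) points to a depth function on the image grid, can be illustrated with a minimal NumPy sketch. It assumes a pinhole camera with intrinsics fx, fy, cx, cy and points already expressed in the camera frame. The function names, the zero-as-missing convention, and the nearest-point rule for pixel collisions are illustrative assumptions, not taken from the paper or its released code.

```python
import numpy as np


def points_to_depth_map(points, fx, fy, cx, cy, height, width):
    """Project camera-frame (X, Y, Z) points to a sparse depth map d(u, v).

    Assumption: pinhole model, u = fx * X / Z + cx, v = fy * Y / Z + cy.
    Pixels without a measurement are left at 0.
    """
    depth = np.zeros((height, width), dtype=np.float64)

    # Keep only points in front of the camera.
    pts = points[points[:, 2] > 0]
    X, Y, Z = pts[:, 0], pts[:, 1], pts[:, 2]

    # Perspective projection to integer pixel coordinates.
    u = np.round(fx * X / Z + cx).astype(int)
    v = np.round(fy * Y / Z + cy).astype(int)

    # Discard projections that fall outside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, Z = u[inside], v[inside], Z[inside]

    # Write far points first so the nearest point wins on collisions
    # (with repeated indices, the last assigned value is kept).
    order = np.argsort(-Z)
    depth[v[order], u[order]] = Z[order]
    return depth


def sample_scene_function(rgb, depth):
    """Stack color and depth into grid samples of f(u, v) = (r, g, b, d)."""
    return np.concatenate([rgb, depth[..., None]], axis=-1)
```

Calling sample_scene_function on an H x W x 3 image and the H x W map returned by points_to_depth_map gives an H x W x 4 array, i.e. the scene function sampled on the regular pixel grid, with d = 0 wherever no depth was measured.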

Index terms

Vision-Based Navigation; Computer Vision for Transportation; Computer Vision for Automation