IROS 2024

MaskingDepth: Masked Consistency Regularization for Semi-Supervised Monocular Depth Estimation

Jongbeom Baek, Gyeongnyeon Kim, Seonghoon Park, Honggyu An, Matteo Poggi, Seungryong Kim

Abstract

We propose MaskingDepth, a semi-supervised learning framework for monocular depth estimation. MaskingDepth enforces consistency between the depths obtained from strongly-augmented images and the pseudo-depths derived from weakly-augmented images, which reduces the reliance on large quantities of ground-truth depth. In this framework, we leverage uncertainty estimation to retain only high-confidence depth predictions from the weakly-augmented branch as pseudo-depths. We also present a novel data augmentation, dubbed K-way disjoint masking, that uses a naïve token masking strategy as an augmentation while avoiding both its scale ambiguity problem between depths from the weakly- and strongly-augmented branches and its risk of missing small-scale objects. Experiments on the KITTI and NYU-Depth-v2 datasets demonstrate the effectiveness of each component, the framework's robustness to the use of fewer depth-annotated images, and superior performance compared to other state-of-the-art semi-supervised learning methods for monocular depth estimation.
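
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: that "K-way disjoint masking" partitions the image tokens into K non-overlapping subsets that together cover every token, that the consistency term is a mean absolute difference, and that confidence filtering is a simple uncertainty threshold. The function names `k_way_disjoint_masks` and `masked_consistency_loss`, and the parameters `k`, `seed`, and `threshold`, are hypothetical.

```python
import random


def k_way_disjoint_masks(num_tokens, k, seed=None):
    """Split token indices 0..num_tokens-1 into k disjoint subsets.

    Assumption: each subset plays the role of one masked view, and the
    union of all subsets covers every token exactly once.
    """
    if not 1 <= k <= num_tokens:
        raise ValueError("k must be between 1 and num_tokens")
    rng = random.Random(seed)
    indices = list(range(num_tokens))
    rng.shuffle(indices)
    # Strided slices share no index and together contain every index.
    return [sorted(indices[i::k]) for i in range(k)]


def masked_consistency_loss(strong_depth, pseudo_depth, uncertainty, threshold):
    """Mean absolute difference over pixels with low uncertainty.

    Assumption: pixels whose uncertainty is at or above `threshold` are
    dropped, so only confident pseudo-depths supervise the strong branch.
    """
    kept = [
        abs(s - p)
        for s, p, u in zip(strong_depth, pseudo_depth, uncertainty)
        if u < threshold
    ]
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    subsets = k_way_disjoint_masks(num_tokens=16, k=4, seed=0)
    print(subsets)
    print(masked_consistency_loss([1.0, 2.0, 3.0], [1.5, 2.0, 10.0],
                                  [0.1, 0.2, 0.9], threshold=0.5))
```

In this sketch the third pixel is excluded from the loss because its uncertainty exceeds the threshold, so an unreliable pseudo-depth does not influence the strongly-augmented branch. Because the subsets are disjoint and exhaustive, no token is hidden from every view.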

Index terms

Deep Learning for Visual Perception; Visual Learning; Deep Learning Methods