ICRA 2024

Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling

Jean Lahoud, Fahad Khan, Hisham Cholakkal, Rao Anwer, Salman Khan

Abstract

Existing 3D understanding datasets typically provide annotations for a limited number of object classes, with sufficient examples per class. However, real-world object classes are not equally represented in practical settings, leading to poor performance on rarely occurring categories if the class imbalance is neglected. In this work, we address the challenge of 3D semantic segmentation with a long-tail distribution of classes. Common methods to reduce class imbalance during training include data re-sampling, loss re-weighting, and transfer learning. In contrast, our work proposes to effectively utilize network classifier weights in 3D models to balance the training on long-tail class distributions. While previous work in the 2D domain has studied imposing constraints on the classifier weights to regularize the training, that approach is sensitive to hyper-parameter choices and has not yet been explored for the 3D domain. To address these challenges, our work proposes adaptive regularization for frequent classes and sampling-based regularization for rare classes, which alleviate the need to manually select thresholds and can dynamically focus training on the hard classes. Our experiments on the large-scale ScanNet200 benchmark show that our method achieves improved performance, surpassing methods that rely on re-sampling, re-weighting, and pre-training.

Index terms

RGB-D Perception, Recognition, Deep Learning for Visual Perception