ICRA 2023

Self-Supervised Point Cloud Understanding Via Mask Transformer and Contrastive Learning

Di Wang, Zhi-Xin Yang

Abstract

Self-supervised learning allows a point cloud network to be pre-trained on a large dataset, which helps boost fine-tuning performance on smaller datasets in downstream tasks. Motivated to design an efficient self-supervised pre-training strategy that captures useful and discriminative representations of 3D point clouds, we propose ContrastMPCT, a self-reconstruction scheme built on the contrastive learning principle. Specifically, two contrastive loss functions are designed for 3D point clouds to maximize the dependence between the input tokens and output tokens of the encoder and to accelerate the convergence of the model. Extensive experiments show that the ContrastMPCT pre-training strategy effectively improves fine-tuning performance on downstream tasks, including object classification and part segmentation. Moreover, the results surpass those of existing CNN-based and Transformer-based works, indicating the efficacy of the proposed method. The source code will be available at https://github.com/wendydidi/ContrastMPCT.git.
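
The abstract does not give the form of the two contrastive losses. As an illustration only, the sketch below uses a standard InfoNCE objective, which is one common way to encourage dependence between paired token sets. It is an assumed stand-in, not the authors' loss: the function name `info_nce`, the temperature value, and the token shapes are all hypothetical.

```python
import numpy as np

def info_nce(input_tokens, output_tokens, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not the paper's exact loss).

    Row i of input_tokens is treated as the positive for row i of
    output_tokens; every other row serves as a negative.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = input_tokens / np.linalg.norm(input_tokens, axis=1, keepdims=True)
    b = output_tokens / np.linalg.norm(output_tokens, axis=1, keepdims=True)
    logits = a @ b.T / temperature                   # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-likelihood of the matching (diagonal) pairs.
    return float(-np.mean(np.diag(log_prob)))

# Hypothetical usage: 8 tokens with 16 features each.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))
loss_matched = info_nce(tokens, tokens)          # outputs aligned with inputs
loss_shuffled = info_nce(tokens, tokens[::-1])   # outputs misaligned with inputs
```

Under this assumed objective, the loss is small when each output token is most similar to its own input token and grows when the pairing is broken, which is the sense in which minimizing it increases the dependence between the two token sets.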

Index terms

Deep Learning for Visual Perception; Visual Learning; Recognition