ICRA 2024

SeMLaPS: Real-Time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation

Jingwen Wang, Juan Jose Tarrio, Lourdes Agapito, Pablo Fernández Alcantarilla, Alexander Vakhitov

Abstract

The availability of real-time semantics greatly improves the core geometric functionality of SLAM systems, enabling numerous robotic and AR/VR applications. We present a new methodology for real-time semantic mapping from RGB-D sequences that combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. When segmenting a new frame, we perform latent feature re-projection from previous frames based on differentiable rendering. Fusing re-projected feature maps from previous frames with current-frame features greatly improves image segmentation quality, compared to a baseline that processes images independently. For 3D map processing, we propose a novel geometric quasi-planar over-segmentation method that groups 3D map elements likely to belong to the same semantic classes, relying on surface normals. We also describe a novel neural network design for lightweight semantic map post-processing. Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems and matches the performance of 3D convolutional networks on three real indoor datasets, while working in real-time. Moreover, it shows better cross-sensor generalization abilities compared to 3D CNNs, enabling training and inference with different depth sensors. Code and data can be found at https://github.com/slamcore/semlaps.
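
The abstract does not specify how the quasi-planar over-segmentation is computed. As a rough illustration of the general idea only (grouping neighbouring map elements whose surface normals nearly agree), the sketch below uses plain region growing over a voxel grid. It is not the authors' algorithm: the function name `quasi_planar_segments`, the `max_angle_deg` threshold, the 6-neighbourhood, and the choice to compare each voxel against the segment's seed normal are all assumptions made for this example.

```python
import math
from collections import deque


def quasi_planar_segments(normals, max_angle_deg=15.0):
    """Group voxels into near-planar segments by region growing (illustrative).

    normals: dict mapping integer voxel coordinates (i, j, k) to a unit
             surface normal (nx, ny, nz).
    Returns a dict mapping each voxel coordinate to an integer segment id.
    """
    cos_thresh = math.cos(math.radians(max_angle_deg))
    # Assumption: voxels are neighbours if they share a face (6-connectivity).
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    labels = {}
    next_id = 0
    for seed in sorted(normals):  # sorted only to make the output deterministic
        if seed in labels:
            continue
        seed_n = normals[seed]
        labels[seed] = next_id
        queue = deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for di, dj, dk in offsets:
                nb = (i + di, j + dj, k + dk)
                if nb in labels or nb not in normals:
                    continue
                n = normals[nb]
                # Compare against the seed normal so a segment stays close
                # to a single plane instead of drifting along a curved surface.
                dot = seed_n[0] * n[0] + seed_n[1] * n[1] + seed_n[2] * n[2]
                if dot >= cos_thresh:
                    labels[nb] = next_id
                    queue.append(nb)
        next_id += 1
    return labels
```

Called on a toy map with a 3x3 floor patch (normals pointing up) touching a 3x3 wall patch (normals pointing sideways), this sketch puts the floor and the wall into two separate segments, since the normals at their shared edge differ by 90 degrees.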

Index terms

Semantic Scene Understanding, Mapping, SLAM