ICRA 2026

A Contrastive Few-Shot RGB-D Traversability Segmentation Framework for Indoor Robotic Navigation

Qiyuan An, Tuan Dang, Fillia Makedon

AI summary

Leveraging negative contrastive learning and sparse 1D depth alignment significantly boosts few-shot traversability segmentation accuracy for indoor robots.

Few-shot segmentation, RGB-D fusion, Traversability segmentation, Negative contrastive learning, Indoor robotics, Sparse depth

Problem

Pure vision models struggle with thin indoor obstacles such as chair legs, and traditional few-shot segmentation overfits to positive free-space prototypes while ignoring obstacles. Additionally, aligning sparse 1D laser depth with RGB images remains a major practical challenge.

Approach

The framework fuses RGB images with sparse 1D depth via a two-stage attention module for dynamic alignment, and introduces a negative contrastive learning branch that explicitly models obstacle prototypes to refine free-space predictions.
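To make the role of obstacle prototypes concrete, here is a minimal NumPy sketch of prototype matching with both a positive (free-space) and a negative (obstacle) prototype. It is an illustration under stated assumptions, not the paper's NCL branch or training loss: the function names, the masked-average-pooling prototypes, and the cosine-similarity decision rule are all hypothetical choices commonly seen in prototype-based few-shot segmentation.

```python
import numpy as np

def masked_average_pool(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: (H, W, C) array; mask: (H, W) array of 0/1 or booleans.
    """
    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        raise ValueError("mask selects no pixels")
    return features[mask].mean(axis=0)

def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between every pixel feature and one prototype."""
    feat_norm = np.linalg.norm(features, axis=-1) + eps
    proto_norm = np.linalg.norm(prototype) + eps
    return (features @ prototype) / (feat_norm * proto_norm)

def segment_free_space(query_features, support_features, support_mask):
    """Label a query pixel as free space (1) when it is closer to the
    free-space prototype than to the obstacle prototype (assumed rule)."""
    pos_proto = masked_average_pool(support_features, support_mask)
    neg_proto = masked_average_pool(support_features, np.logical_not(support_mask))
    pos_sim = cosine_similarity_map(query_features, pos_proto)
    neg_sim = cosine_similarity_map(query_features, neg_proto)
    return (pos_sim > neg_sim).astype(np.uint8)

# Toy 1-shot example with hand-made 2-channel features (not real data).
support_features = np.array([[[1.0, 0.0], [1.0, 0.0]],
                             [[0.0, 1.0], [0.0, 1.0]]])
support_mask = np.array([[1, 1],
                         [0, 0]])  # top row annotated as free space
query_features = np.array([[[0.9, 0.1], [0.2, 0.8]]])

pred = segment_free_space(query_features, support_features, support_mask)
print(pred)  # [[1 0]]
```

A positive-only variant would threshold `pos_sim` alone; comparing against an explicit obstacle prototype gives each pixel a second reference point, which is the intuition behind using negatives.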

Key results

Up to 9% mIoU improvement over SOTA FSS and RGB-D baselines in 1-shot and 5-shot settings
Novel two-stage attention module dynamically aligns unregistered 1D depth with RGB without explicit calibration
Negative contrastive learning explicitly exploits obstacle prototypes to reduce overfitting and improve generalization
Release of a large-scale custom indoor RGB-D traversability dataset with sparse 1D depth annotations
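For readers unfamiliar with the metric behind the reported gain, the following is a minimal sketch of mean intersection-over-union (mIoU) for a two-class traversability mask. The exact evaluation protocol used in the paper (class set, averaging over episodes or folds) is not stated here, so the function name `mean_iou` and the per-class averaging are assumptions.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Average the per-class IoU over the classes present in pred or target."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both masks, skip it
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return float(np.mean(ious))

# Toy masks: 1 = free space, 0 = obstacle.
pred = np.array([[1, 1],
                 [0, 0]])
target = np.array([[1, 0],
                   [0, 0]])
score = mean_iou(pred, target)  # (2/3 + 1/2) / 2 = 7/12
```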

Why it matters

Provides a practical, low-cost solution for robust indoor robot navigation by enabling accurate traversability detection with minimal labeled data and sparse depth sensors.

Abstract

Indoor traversability segmentation aims to identify safe, navigable free space for autonomous agents, which is critical for robotic navigation. Pure vision-based models often fail to detect thin obstacles, such as chair legs, which can pose serious safety risks. We propose a multi-modal segmentation framework that leverages RGB images and sparse 1D laser depth information to capture geometric interactions and improve the detection of challenging obstacles. To reduce the reliance on large labeled datasets, we adopt the few-shot segmentation (FSS) paradigm, enabling the model to generalize from limited annotated examples. Traditional FSS methods focus solely on positive prototypes, often leading to overfitting to the support set and poor generalization. To address this, we introduce a negative contrastive learning (NCL) branch that leverages negative prototypes (obstacles) to refine free-space predictions. Additionally, we design a two-stage attention depth module to align 1D depth vectors with RGB images both horizontally and vertically. Extensive experiments on our custom-collected indoor RGB-D traversability dataset demonstrate that our method outperforms state-of-the-art FSS and RGB-D segmentation baselines, achieving up to 9% higher mIoU under both 1-shot and 5-shot settings. These results highlight the effectiveness of leveraging negative prototypes and sparse depth for robust and efficient traversability segmentation.
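The abstract does not spell out how the two-stage attention depth module is built. The sketch below is therefore only one hypothetical way to realize horizontal-then-vertical alignment with standard scaled dot-product attention: stage 1 lets each image column attend over the 1D depth tokens, and stage 2 uses a softmax over rows to decide where in each column that depth feature is injected. All names, shapes, the column-mean queries, and the residual fusion are assumptions, and the embedding of raw depth readings into `depth_tokens` is left out.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention; returns attended values and weights."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

def align_depth_to_image(image_features, depth_tokens):
    """image_features: (H, W, C) RGB features; depth_tokens: (N, C) embedded
    1D depth readings. Returns fused features of shape (H, W, C)."""
    H, W, C = image_features.shape
    # Stage 1 (horizontal, assumed design): one query per image column
    # softly selects the depth readings that belong to that column.
    column_queries = image_features.mean(axis=0)                # (W, C)
    column_depth, _ = cross_attention(column_queries, depth_tokens, depth_tokens)
    # Stage 2 (vertical, assumed design): within each column, a softmax over
    # rows decides how strongly each row receives the column's depth feature.
    scores = np.einsum("hwc,wc->hw", image_features, column_depth) / np.sqrt(C)
    row_weights = softmax(scores, axis=0)                       # columns sum to 1
    return image_features + row_weights[..., None] * column_depth[None, :, :]

rng = np.random.default_rng(0)
image_features = rng.normal(size=(4, 6, 8))   # toy H=4, W=6, C=8
depth_tokens = rng.normal(size=(10, 8))       # toy N=10 depth readings
fused = align_depth_to_image(image_features, depth_tokens)
```

Because the correspondence between depth readings and image columns is expressed as attention weights rather than a fixed projection, a design like this would not need the two sensors to be registered in advance; whether the paper's module works this way is not confirmed by the text.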

Index terms

Object Detection, Segmentation and Categorization; Big Data in Robotics and Automation; RGB-D Perception