ICRA 2026

X-MOS: A Heterogeneous Cross-LiDAR Generalization Framework for Moving Object Segmentation

Minjae Lee, Ilhwan Ha, Sang-Min Choi, Gun-Woo Kim, Suwon Lee

AI summary

X-MOS enables a single moving object segmentation model to generalize robustly across diverse LiDAR sensors by using sensor-specific expert teachers and privileged sensor-type information during training.

Moving Object Segmentation; LiDAR Generalization; Knowledge Distillation; Multi-Teacher Learning; Domain Shift; Autonomous Driving

Problem

Deep learning models for moving object segmentation degrade significantly when deployed on different LiDAR sensors due to domain shifts caused by varying hardware specifications. Naively training on combined heterogeneous data biases models toward high-density sensors while failing on sparse ones.

Approach

The framework generates sensor-specific expert teacher models and uses the sensor type as privileged information to selectively activate the appropriate teacher during training. This multi-teacher knowledge distillation strategy guides a single student model to learn unbiased, cross-sensor generalizable features.
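The sensor-gated distillation idea can be illustrated with a minimal sketch. This is not the authors' implementation: the loss form (cross-entropy plus temperature-scaled KL divergence), the weights `alpha` and `temperature`, the function names, and the sensor keys are all assumptions chosen for illustration. It only shows how a sensor-type label, available at training time, could pick which teacher's output the student is pulled toward.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def sensor_aware_kd_loss(student_logits, teacher_logits_by_sensor, sensor_type,
                         label, alpha=0.5, temperature=2.0):
    """Hypothetical per-point loss for one training sample.

    Only the teacher matching `sensor_type` (privileged information that is
    known during training) contributes to the distillation term.
    """
    # Select the expert teacher for this sample's sensor; others are ignored.
    teacher_logits = teacher_logits_by_sensor[sensor_type]

    # Supervised term: cross-entropy against the ground-truth label.
    ce = -math.log(softmax(student_logits)[label] + 1e-12)

    # Distillation term: match the selected teacher's softened distribution.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kd = (temperature ** 2) * kl_divergence(p_teacher, p_student)

    return (1.0 - alpha) * ce + alpha * kd


# Illustrative teacher outputs for one point (class 0 = static, 1 = moving).
# The sensor keys are placeholder names, not identifiers from the paper.
teachers = {"sparse_16ch": [2.0, -1.0], "dense_sensor": [-1.0, 2.0]}
student = [2.0, -1.0]
loss_matched = sensor_aware_kd_loss(student, teachers, "sparse_16ch", label=0)
loss_mismatched = sensor_aware_kd_loss(student, teachers, "dense_sensor", label=0)
```

In this toy case the student agrees with the sparse-sensor teacher, so the distillation term vanishes for `sparse_16ch` and the loss is larger when the wrong teacher is consulted. That is the ambiguity the sensor-type gating is meant to avoid.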

Key results

Mitigates training bias toward dense sensors, achieving balanced performance across diverse LiDAR hardware
Reaches an overall test mIoU of 0.717 on the HeLiMOS dataset, surpassing naive training and individual experts
More than doubles segmentation accuracy on the challenging 16-channel Velodyne sensor
Demonstrates strong zero-shot generalization to unseen datasets with similar sensor types
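
For readers unfamiliar with the metric, the sketch below computes a mean intersection-over-union (mIoU) from per-point predictions. It assumes mIoU is the unweighted mean of per-class IoU; the exact evaluation protocol behind the reported 0.717 (which classes are averaged, and how sensors are aggregated) is not stated in this summary, so the function names and the averaging choice here are illustrative only.

```python
def class_iou(predictions, targets, cls):
    """IoU for one class: TP / (TP + FP + FN), or None if the class never appears."""
    tp = sum(1 for p, t in zip(predictions, targets) if p == cls and t == cls)
    fp = sum(1 for p, t in zip(predictions, targets) if p == cls and t != cls)
    fn = sum(1 for p, t in zip(predictions, targets) if p != cls and t == cls)
    union = tp + fp + fn
    return tp / union if union else None


def mean_iou(predictions, targets, classes):
    """Unweighted mean of per-class IoU, skipping classes absent from both inputs."""
    scores = [class_iou(predictions, targets, c) for c in classes]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores)


# Toy example with four points (0 = static, 1 = moving).
pred = [1, 1, 0, 0]
gt = [1, 0, 0, 0]
score = mean_iou(pred, gt, classes=[0, 1])
```

Here the moving class has one true positive and one false positive (IoU 1/2), and the static class has two true positives and one false negative (IoU 2/3), so `score` is their average, 7/12.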

Why it matters

Enables practical, hardware-agnostic perception systems for autonomous vehicles by eliminating the need for sensor-specific model retraining.

Abstract

Moving object segmentation (MOS) is foundational for autonomous vehicle safety. However, the increasing diversity of LiDAR sensors creates a significant domain shift problem, causing models trained on one sensor to perform poorly when deployed on another. A naive approach of training on combined data from heterogeneous sensors leads to a biased model that favors high-density sensors while failing on sparse, low-resolution sensors. To address this issue, we propose X-MOS, a novel generalization framework based on multi-teacher knowledge distillation. X-MOS generates sensor-specific expert teacher models and employs a sensor-aware knowledge distillation strategy. This strategy uses the sensor type as privileged information to activate the most appropriate teacher at each training step, providing unambiguous learning signals to a single student model. Extensive experiments on the HeLiMOS dataset, which comprises four different LiDAR sensors, demonstrate the effectiveness of our framework. X-MOS mitigates training bias and achieves an overall test mIoU of 0.717, outperforming both naive training and the best individual expert teacher. Notably, it more than doubles the performance on the most challenging low-channel sensor. Furthermore, our model exhibits strong zero-shot generalization to unseen datasets with similar sensor types. This work provides a robust and scalable methodology for achieving cross-sensor generalization, which is foundational for more practical and adaptable perception systems in autonomous driving.

Index terms

Computer Vision for Transportation; Sensor Fusion; Object Detection, Segmentation and Categorization