ICRA 2026

From Simulation to Deployment: Curriculum-Based Domain Adaptation for Semantic Segmentation in Autonomous Forklifts

Christof Schützenhöfer, Patrick Rechberger, Thomas Ulz, Christian Steger

AI summary

A progressive curriculum-based adaptation framework improves semantic segmentation accuracy for autonomous forklifts across differing industrial sites while reducing manual annotation costs.

semantic segmentation, domain adaptation, synthetic data, pseudo-labeling, autonomous forklifts, sim-to-real transfer

Problem

Deploying semantic segmentation models for autonomous forklifts is hindered by visual variations across industrial sites, leading to poor cross-domain generalization and expensive re-annotation efforts.

Approach

The method progressively adapts a model by pretraining on synthetic data, fine-tuning on a labeled real-world source, and transferring to a new target domain using filtered pseudo-labels combined with anchor samples to prevent drift and handle class imbalance.
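The filtering step can be illustrated with a minimal sketch. It assumes that "filtered" means dropping pixels whose top class confidence falls below a per-class threshold, with a lower threshold for rare classes; the summary does not state the actual criterion. The function name, the threshold values, and the ignore label 255 are all hypothetical.

```python
IGNORE_INDEX = 255  # assumed "ignore" label; a common convention, not taken from the paper


def filter_pseudo_labels(probs, thresholds, default_threshold=0.9):
    """Turn per-pixel class probabilities into filtered pseudo-labels.

    probs: list of per-pixel probability lists, one entry per class.
    thresholds: dict mapping class index -> minimum confidence (hypothetical values).
    Pixels whose top confidence is below their class threshold get IGNORE_INDEX.
    """
    labels = []
    for pixel_probs in probs:
        confidence = max(pixel_probs)
        cls = pixel_probs.index(confidence)
        keep = confidence >= thresholds.get(cls, default_threshold)
        labels.append(cls if keep else IGNORE_INDEX)
    return labels


# Toy example: 4 pixels, 3 classes; class 2 is treated as rare and gets a lower threshold.
probs = [
    [0.95, 0.03, 0.02],  # confident class 0: kept
    [0.60, 0.30, 0.10],  # uncertain class 0: ignored
    [0.20, 0.10, 0.70],  # rare class 2 passes its relaxed threshold: kept
    [0.15, 0.80, 0.05],  # class 1 below the default threshold: ignored
]
pseudo = filter_pseudo_labels(probs, thresholds={2: 0.6})
print(pseudo)  # [0, 255, 2, 255]
```

Ignored pixels would simply be excluded from the self-training loss, so only confident predictions steer the adaptation.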

Key results

Progressive sim-to-real-to-real training pipeline reduces annotation overhead
mIoU increases from 67.37 to 71.36 under moderate domain shift
mIoU increases from 49.57 to 57.22 under hard domain shift
Anchor replay and class-aware filtering stabilize adaptation and mitigate class imbalance
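For reference, the reported mIoU values correspond to the following absolute and relative gains. The short calculation below uses only the four numbers listed above; the helper name is illustrative.

```python
def miou_gain(before, after):
    """Return (absolute gain in mIoU points, relative gain in percent)."""
    absolute = round(after - before, 2)
    relative = round(100 * (after - before) / before, 1)
    return absolute, relative


moderate = miou_gain(67.37, 71.36)
hard = miou_gain(49.57, 57.22)
print(moderate)  # (3.99, 5.9)
print(hard)      # (7.65, 15.4)
```

The gain is larger under the hard shift, both in absolute points and relative to the baseline.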

Why it matters

It points toward scalable, cost-effective deployment of robust visual perception for industrial robots across diverse and changing warehouse environments.

Abstract

Deploying semantic segmentation models for autonomous forklifts in industrial environments is challenging because visual conditions vary across sites, leading to poor cross-domain generalization and costly re-annotation efforts. We propose a curriculum-based domain adaptation framework that progressively transfers a segmentation model from simulation to real-world industrial deployment. The model is first pretrained on synthetic datasets with increasing complexity, then fine-tuned on a labeled real source domain to reduce the sim-to-real gap and adapt to camera-specific characteristics. Finally, it is adapted to a new target domain using pseudo-label-based self-training. To reduce drift during target adaptation, pseudo-labeled target samples are combined with labeled samples from the source-real domain, while a replay buffer improves robustness to class imbalance by oversampling rare classes. Preliminary experiments with DDRNet demonstrate improved performance under both moderate and hard domain shifts, with mIoU gains from 67.37 to 71.36 and from 49.57 to 57.22, respectively. The results highlight the potential of progressive multi-domain adaptation for scalable industrial robotic perception.

Keywords: semantic segmentation, synthetic data, pseudo labeling
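The batch composition during target adaptation, as described in the abstract, might look like the sketch below. It assumes each training batch mixes labeled source-real anchor samples with pseudo-labeled target samples, and that anchors are drawn with weights inversely proportional to class frequency. The 50/50 ratio, the weighting rule, the sample dictionaries, and all names are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def rare_class_weights(buffer):
    """Weight each buffered sample by the inverse frequency of its rarest class."""
    freq = Counter(cls for sample in buffer for cls in sample["classes"])
    return [max(1.0 / freq[cls] for cls in sample["classes"]) for sample in buffer]


def build_batch(pseudo_target, anchor_buffer, batch_size=8, anchor_ratio=0.5, rng=random):
    """Mix oversampled labeled anchors with pseudo-labeled target samples."""
    n_anchor = int(batch_size * anchor_ratio)
    weights = rare_class_weights(anchor_buffer)
    # Anchors containing rare classes are drawn more often (sampling with replacement).
    anchors = rng.choices(anchor_buffer, weights=weights, k=n_anchor)
    targets = rng.choices(pseudo_target, k=batch_size - n_anchor)
    return anchors + targets


# Toy data: only the set of classes present in each image matters for the weighting.
anchor_buffer = [
    {"id": "src_a", "classes": ["floor", "pallet"]},
    {"id": "src_b", "classes": ["floor"]},
    {"id": "src_c", "classes": ["floor", "person"]},
]
pseudo_target = [{"id": f"tgt_{i}", "classes": ["floor"]} for i in range(5)]

weights = rare_class_weights(anchor_buffer)
batch = build_batch(pseudo_target, anchor_buffer, rng=random.Random(0))
print(weights)
print([sample["id"] for sample in batch])
```

Here the two anchors containing a class that appears only once receive three times the sampling weight of the floor-only anchor, which is one simple way to oversample rare classes.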

Index terms

Industrial Robots; Object Detection, Segmentation and Categorization; Computer Vision for Transportation