ICRA 2026

Leveraging Geometric Priors for Unaligned Scene Change Detection

Ziling Liu, Ziwei Chen, Mingqi Gao, Jinyu Yang, Feng Zheng

AI summary

Integrating geometric priors from foundation models with a training-free framework enables robust, annotation-free scene change detection under large viewpoint shifts.

Scene Change Detection, Unaligned Viewpoints, Geometric Priors, Foundation Models, Zero-Shot Detection, Occlusion Handling

Problem

Existing unaligned scene change detection methods rely on 2D visual cues that fail under large viewpoint changes, struggling to establish robust correspondences, identify visual overlaps, and handle occlusions without generalizable multi-view knowledge.

Approach

The authors leverage a Geometric Foundation Model to extract 3D depth and camera poses, establishing explicit geometric correspondences and occlusion masks. These priors guide a training-free visual foundation model (SAM) to predict change masks without requiring task-specific training data.
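To make the geometric step concrete, here is a minimal sketch of how depth and a relative camera pose can yield pixel correspondences, an overlap mask, and an occlusion mask. It is not the authors' implementation: the function name `warp_with_depth`, the shared pinhole intrinsics `K`, the 4x4 transform `T_ab`, and the relative depth tolerance `rel_tol` are all assumptions made for illustration, and the depth maps are assumed to come from some upstream geometric model.

```python
import numpy as np

def warp_with_depth(depth_a, depth_b, K, T_ab, rel_tol=0.05):
    """Project every pixel of view A into view B (illustrative sketch).

    depth_a, depth_b: (H, W) depth maps, 0 where depth is unknown
    K: (3, 3) pinhole intrinsics, assumed shared by both views
    T_ab: (4, 4) rigid transform from A's camera frame to B's
    Returns uv_b (H, W, 2), overlap (H, W) bool, occluded (H, W) bool.
    """
    h, w = depth_a.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)

    # Back-project A's pixels to 3D points, then move them into B's frame.
    pts_a = (pix @ np.linalg.inv(K).T) * depth_a.reshape(-1, 1)
    pts_b = pts_a @ T_ab[:3, :3].T + T_ab[:3, 3]
    z_b = pts_b[:, 2]
    valid = (depth_a.reshape(-1) > 0) & (z_b > 1e-6)

    # Perspective projection into B; guard against division by zero.
    safe_z = np.where(valid, z_b, 1.0)
    uv = (pts_b @ K.T)[:, :2] / safe_z[:, None]

    # Overlap: the point lands inside B's image bounds.
    overlap = (valid
               & (uv[:, 0] >= -0.5) & (uv[:, 0] < w - 0.5)
               & (uv[:, 1] >= -0.5) & (uv[:, 1] < h - 0.5))
    ui = np.clip(np.rint(uv[:, 0]), 0, w - 1).astype(int)
    vi = np.clip(np.rint(uv[:, 1]), 0, h - 1).astype(int)

    # Occlusion: B observes a surface clearly closer than the projected point.
    observed = depth_b[vi, ui]
    occluded = overlap & (observed > 0) & (z_b > observed * (1.0 + rel_tol))

    return uv.reshape(h, w, 2), overlap.reshape(h, w), occluded.reshape(h, w)
```

Pixels that are in `overlap` but not `occluded` are the ones where comparing the two images is geometrically meaningful. How such masks are then passed to SAM is not described in this summary, so that step is left out.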

Key results

First application of Geometric Foundation Model priors to unaligned scene change detection
Training-free framework that eliminates reliance on large-scale annotated datasets
Superior and robust F1-scores across PSCD, ChangeSim, and PASLCD datasets
Explicit occlusion detection and geometric correspondence outperform 2D flow-based baselines
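For reference, the F1-score of a predicted binary change mask against a ground-truth mask can be computed as sketched below. The function name `change_f1` and the convention of returning 1.0 when both masks are empty are assumptions. The paper's exact evaluation protocol (per-image versus dataset-level aggregation, for example) is not stated in this summary.

```python
import numpy as np

def change_f1(pred, gt):
    """Pixel-wise F1 between two binary change masks (illustrative sketch)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.count_nonzero(pred & gt)    # changed pixels correctly predicted
    fp = np.count_nonzero(pred & ~gt)   # predicted changed, actually unchanged
    fn = np.count_nonzero(~pred & gt)   # missed changed pixels
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0
```

This uses the identity F1 = 2TP / (2TP + FP + FN), which is equivalent to the harmonic mean of precision and recall.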

Why it matters

Provides a reliable, zero-shot solution for real-world robotic and autonomous driving applications where camera viewpoints and lighting frequently change.

Abstract

Unaligned Scene Change Detection aims to detect scene changes between image pairs captured at different times without assuming viewpoint alignment. To handle viewpoint variations, current methods rely solely on 2D visual cues to establish cross-image correspondence to assist change detection. However, large viewpoint changes can alter visual observations, causing appearance-based matching to drift or fail. Additionally, supervision limited to 2D change masks from small-scale SCD datasets restricts the learning of generalizable multi-view knowledge, making it difficult to reliably identify visual overlaps and handle occlusions. This lack of explicit geometric reasoning represents a critical yet overlooked limitation. In this work, we introduce geometric priors for the first time to address the core challenges of unaligned SCD, for reliable identification of visual overlaps, robust correspondence establishment, and explicit occlusion detection. Building on these priors, we propose a training-free framework that integrates them with the powerful representations of a visual foundation model to enable reliable change detection under viewpoint misalignment. Through extensive evaluation on the PSCD, ChangeSim, and PASLCD datasets, we demonstrate that our approach achieves superior and robust performance. Our code will be released at https://github.com/ZilingLiu/GeoSCD.

Index terms

Computer Vision for Automation