CD-FKD: Cross-Domain Feature Knowledge Distillation for Robust Single-Domain Generalization in Object Detection
Junseok Lee, Sungho Shin, Seongju Lee, Kyoobin Lee
AI summary
Problem
Object detectors trained on a single source domain fail to generalize to unseen target domains due to environmental shifts like weather and lighting, while existing single-domain generalization methods often require impractical multi-source data or degrade source performance.
Approach
A teacher-student knowledge distillation framework where a frozen teacher processes clean source images while a student learns from corrupted and downscaled versions, using global and instance-wise feature alignment losses to bridge the domain gap.
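The two alignment losses can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the mean-squared-error form, the integer box cropping (in place of something like RoI pooling), the loss weights, and all function names are assumptions made for illustration. It also assumes the teacher and student feature maps have already been brought to the same shape.

```python
import numpy as np

def global_feature_loss(f_teacher, f_student):
    """MSE over the whole feature map (shape C x H x W). Assumed loss form."""
    return float(np.mean((f_teacher - f_student) ** 2))

def instance_feature_loss(f_teacher, f_student, boxes):
    """Average per-box MSE.

    boxes: iterable of (x1, y1, x2, y2) in feature-map cells, end-exclusive.
    Cropping by integer slices is a simplification chosen for this sketch.
    """
    losses = []
    for x1, y1, x2, y2 in boxes:
        t = f_teacher[:, y1:y2, x1:x2]
        s = f_student[:, y1:y2, x1:x2]
        losses.append(np.mean((t - s) ** 2))
    return float(np.mean(losses)) if losses else 0.0

def distillation_loss(f_teacher, f_student, boxes, w_global=1.0, w_instance=1.0):
    """Weighted sum of both terms; the weights are hypothetical hyperparameters."""
    return (w_global * global_feature_loss(f_teacher, f_student)
            + w_instance * instance_feature_loss(f_teacher, f_student, boxes))

# Toy example: the student deviates from the teacher only inside one object box.
f_t = np.zeros((2, 4, 4))
f_s = np.zeros((2, 4, 4))
f_s[:, 0:2, 0:2] = 1.0
total = distillation_loss(f_t, f_s, boxes=[(0, 0, 2, 2)])
```

In this toy case the global term averages the error over the whole map, while the instance term measures it only inside the box, so an error concentrated on an object contributes more through the instance-wise term.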
Key results
- Achieves 38.3% average mAP@0.5 across four unseen weather domains on the diverse weather benchmark
- Surpasses the previous state-of-the-art DivAlign method by 2.8% mAP@0.5
- Boosts source domain detection accuracy to 62.7% mAP@0.5 without sacrificing generalization
- Maintains robust feature extraction under severe corruptions like fog, rain, and low-light conditions
Why it matters
Provides a practical, data-efficient solution for deploying reliable object detectors in real-world environments like autonomous driving and surveillance where target domain data is unavailable.
Abstract
Single-domain generalization is essential for object detection, particularly when training models on a single source domain and evaluating them on unseen target domains. Domain shifts, such as changes in weather, lighting, or scene conditions, pose significant challenges to the generalization ability of existing models. To address this, we propose Cross-Domain Feature Knowledge Distillation (CD-FKD), which enhances the generalization capability of the student network by leveraging both global and instance-wise feature distillation. The proposed method uses diversified data through downscaling and corruption to train the student network, whereas the teacher network receives the original source domain data. The student network mimics the features of the teacher through both global and instance-wise distillation, enabling it to extract object-centric features effectively, even for objects that are difficult to detect owing to corruption. Extensive experiments on challenging scenes demonstrate that CD-FKD outperforms state-of-the-art methods in both target domain generalization and source domain performance, validating its effectiveness in improving object detection robustness to domain shifts. This approach is valuable in real-world applications, like autonomous driving and surveillance, where robust object detection in diverse environments is crucial.
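The input asymmetry described in the abstract (original image for the teacher, downscaled and corrupted image for the student) might be sketched as follows. The specific choices here are assumptions, not details from the paper: average pooling for downscaling, darkening plus Gaussian noise as a stand-in corruption, and every function name and parameter value.

```python
import numpy as np

def downscale(image, factor=2):
    """Average-pool an H x W x C image by an integer factor (assumed method)."""
    h, w, c = image.shape
    h2, w2 = h // factor, w // factor
    cropped = image[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor, c).mean(axis=(1, 3))

def corrupt(image, rng, noise_std=0.05, brightness=0.6):
    """Darken and add Gaussian noise to an image with values in [0, 1].

    A stand-in for the corruptions used in the paper, which are not
    specified in this section.
    """
    noisy = image * brightness + rng.normal(0.0, noise_std, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def make_training_pair(image, rng, factor=2):
    """Return (teacher_input, student_input) for one source-domain image."""
    teacher_input = image                              # original source image
    student_input = corrupt(downscale(image, factor), rng)
    return teacher_input, student_input

rng = np.random.default_rng(0)
image = np.full((8, 6, 3), 0.5)
teacher_input, student_input = make_training_pair(image, rng)
```

Because the student input is smaller than the teacher input, the resulting feature maps would differ in size, so some resizing step is needed before features can be compared; how the paper handles this is not stated in this section.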