SCDCE-3D: Soft-Weighted Covariance and Dual-Branch Channel Enhancement for 3D Place Recognition in Complex Orchard Environments
Yuping Tan, Chunjiang Zhao, Qin Zhao, Hei Xinhong, Song Xiaogang
AI summary
Problem
Existing 3D place recognition methods struggle in orchard environments due to unreliable GNSS, sparse and occluded LiDAR scans, and severe structural ambiguity from repetitive tree rows.
Approach
The framework combines a soft-weighted covariance module, which adaptively down-weights noisy or overlapping points, with a dual-branch channel enhancement mechanism that dynamically amplifies discriminative features. The whole model is optimized via multi-level triplet learning.
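The multi-level triplet objective mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the hinge form, the `margin` value, the `aux_weight` factor, and the function names are all assumptions made for illustration. The source states only that a triplet loss is applied to both the final descriptor and intermediate statistical features.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard hinge triplet loss: pull the positive closer than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def multi_level_triplet_loss(final, intermediate, margin=0.5, aux_weight=1.0):
    """Sum a triplet loss on final descriptors and one on intermediate features.

    `final` and `intermediate` are (anchor, positive, negative) tuples.
    `aux_weight` is a hypothetical balancing factor, not taken from the paper.
    """
    loss_final = triplet_loss(*final, margin=margin)
    loss_intermediate = triplet_loss(*intermediate, margin=margin)
    return loss_final + aux_weight * loss_intermediate


# Example: the final descriptors are already well separated (zero loss),
# while the intermediate features still violate the margin.
final = ([0.0, 0.0], [0.0, 1.0], [0.0, 3.0])
intermediate = ([0.0, 0.0], [3.0, 4.0], [1.0, 0.0])
total = multi_level_triplet_loss(final, intermediate, margin=0.5, aux_weight=0.5)
```

In this sketch the auxiliary term keeps contributing to the loss even when the final descriptors are already separated, which is one plausible reading of how supervising intermediate features could reinforce robustness.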
Key results
- Adaptive soft-weighted covariance module suppresses noise and cross-row interference
- Dual-branch channel enhancement dynamically highlights discriminative feature channels
- Multi-level triplet learning jointly optimizes final descriptors and intermediate statistical features
- Achieves 5.6% average improvement in top-1 recall over the second-best baseline on the HORTO-3DLM+ dataset
Why it matters
Enables robust, long-term autonomous navigation for agricultural robots in GNSS-denied orchard environments where traditional localization fails.
Abstract
Recent progress in 3D place recognition has delivered strong results in urban and indoor scenarios, but orchards remain largely unexplored. In these environments, unreliable or absent GNSS signals necessitate LiDAR-based place recognition for robust long-term localization, yet challenges such as ill-defined geometry, semi-transparent foliage, and severe inter-/intra-row overlaps cause high structural ambiguity. To address these challenges, we propose SCDCE-3D, a novel framework that integrates soft-weighted covariance representation with dual-branch channel enhancement. The soft-weighted covariance module adaptively down-weights noisy or overlapping points using a sigmoid-based weighting strategy, enabling robust second-order statistical representation that suppresses cross-row interference. In parallel, a dual-branch backbone extracts complementary global and local features, which drive a dynamic channel enhancement mechanism to emphasize discriminative feature channels while suppressing redundancy. Furthermore, multi-level triplet learning is applied not only to the final descriptor but also to intermediate statistical features, reinforcing robustness against structural ambiguity. Experiments on orchard-based LiDAR datasets demonstrate that SCDCE-3D significantly outperforms state-of-the-art methods in both recall and robustness, offering a reliable solution for long-term 3D place recognition in agricultural robotics. Code is available at https://github.com/typist2001/SCDCE-3D.
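To make the sigmoid-based weighting concrete, the following is a minimal sketch of a soft-weighted covariance computation. It is an illustration under stated assumptions, not the code from the linked repository: the per-point `scores` are hypothetical logits (in a trained network they would presumably be predicted from the point features), and normalizing by the sum of the weights is an assumed design choice.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_weighted_covariance(features, scores):
    """Weighted mean and covariance of N feature vectors of length D.

    features: list of N lists of D floats (per-point features).
    scores:   list of N floats; hypothetical per-point logits, where a low
              score marks a point as noisy or overlapping (assumption).
    """
    if not features or len(features) != len(scores):
        raise ValueError("features and scores must be non-empty and equal in length")

    weights = [sigmoid(s) for s in scores]  # soft weights in (0, 1)
    total = sum(weights)
    dim = len(features[0])

    # Weighted mean of the features.
    mean = [sum(w * x[d] for w, x in zip(weights, features)) / total
            for d in range(dim)]

    # Weighted second-order statistics around that mean.
    cov = [[0.0] * dim for _ in range(dim)]
    for w, x in zip(weights, features):
        centered = [x[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += w * centered[i] * centered[j] / total
    return mean, cov


# Two reliable points plus one far-away outlier with a very low score.
points = [[0.0, 0.0], [2.0, 0.0], [100.0, 100.0]]
mean, cov = soft_weighted_covariance(points, scores=[0.0, 0.0, -50.0])
```

With equal scores the result reduces to the ordinary (population) covariance. In the example, the strongly negative score makes the outlier's weight negligible, so the mean and covariance are determined almost entirely by the two reliable points.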