HBRB-BoW: A Retrained Bag-Of-Words Vocabulary for ORB-SLAM Via Hierarchical BRB-KMeans
Minjae Lee, Sang-Min Choi, Gun-Woo Kim, Suwon Lee
AI summary
Problem
Conventional binary clustering in ORB-SLAM's Bag-of-Words framework loses fine-grained feature information and propagates quantization errors through its hierarchical tree, degrading visual word quality and loop detection accuracy.
Approach
HBRB-BoW converts binary descriptors to real-valued data at the root node, performs standard k-means clustering in the real-valued domain throughout the hierarchy, and only binarizes at the leaf nodes to minimize information loss.
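The flow described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`unpack_descriptors`, `kmeans`, `build_leaves`, `binarize`, `train_vocabulary`), the 0.5 binarization threshold, the random initialization, and the toy branching factor and depth are all hypothetical choices made for the example.

```python
import numpy as np

def unpack_descriptors(desc_u8):
    """Root step: expand packed binary descriptors (N x 32 bytes) to N x 256 floats in {0.0, 1.0}."""
    return np.unpackbits(desc_u8, axis=1).astype(np.float64)

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (squared Euclidean distance)."""
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(points, k, iters=10, rng=None):
    """Plain Lloyd's k-means in the real-valued domain. Returns (centroids, labels)."""
    rng = np.random.default_rng(0) if rng is None else rng
    k = min(k, len(points))
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, assign(points, centroids)

def build_leaves(points, k, depth, rng):
    """Recursively split real-valued points; return the real-valued leaf centroids."""
    if depth == 0 or len(points) <= 1:
        return [points.mean(axis=0)]
    centroids, labels = kmeans(points, k, rng=rng)
    leaves = []
    for j in range(len(centroids)):
        members = points[labels == j]
        if len(members):
            # Children are clustered on the original real-valued members,
            # so no binarization happens at intermediate levels.
            leaves.extend(build_leaves(members, k, depth - 1, rng))
    return leaves

def binarize(centroid, threshold=0.5):
    """Leaf step: threshold a real-valued centroid and pack it back to bytes.
    The 0.5 threshold is an assumption made for this sketch."""
    return np.packbits((centroid >= threshold).astype(np.uint8))

def train_vocabulary(desc_u8, k=3, depth=2, seed=0):
    """Binary -> real at the root, k-means down the tree, real -> binary at the leaves."""
    rng = np.random.default_rng(seed)
    real = unpack_descriptors(desc_u8)
    leaves = build_leaves(real, k, depth, rng)
    return np.stack([binarize(c) for c in leaves])

# Usage with synthetic 256-bit descriptors (random data, for illustration only).
demo = np.random.default_rng(1).integers(0, 256, size=(100, 32), dtype=np.uint8)
words = train_vocabulary(demo, k=3, depth=2)  # at most 3**2 = 9 binary words
```

The point of the sketch is where the two conversions sit: `unpack_descriptors` runs once before any clustering, and `binarize` is called only on leaf centroids, so intermediate nodes never pass through a rounding step.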
Key results
- Translation ATE reduced by 30.8% on the KITTI dataset
- Mean relative pose error improved by 10.3%
- Successful loop closure detection in challenging sequences where the baseline failed
- More discriminative visual vocabulary with preserved descriptor fidelity
Why it matters
Offers a direct, framework-compatible vocabulary upgrade that enhances SLAM robustness and accuracy for autonomous navigation and robotics applications.
Abstract
In visual simultaneous localization and mapping (SLAM), the quality of the visual vocabulary is fundamental to the system’s ability to represent environments and recognize locations. While ORB-SLAM is a widely used framework, its binary vocabulary, trained through the k-majority-based bag-of-words (BoW) approach, suffers from inherent precision loss. The inability of conventional binary clustering to represent subtle feature distributions leads to the degradation of visual words, a problem that is compounded as errors accumulate and propagate through the hierarchical tree structure. To address these structural deficiencies, this paper proposes hierarchical binary-to-real-and-back (HBRB)-BoW, a refined hierarchical binary vocabulary training algorithm. By integrating a global real-valued flow within the hierarchical clustering process, our method preserves high-fidelity descriptor information until the final binarization at the leaf nodes. Experimental results demonstrate that the proposed approach yields a more discriminative and well-structured vocabulary than traditional methods, significantly enhancing the representational integrity of the visual dictionary in complex environments. Furthermore, replacing the default ORB-SLAM vocabulary file with our HBRB-BoW file is expected to improve performance in loop closing and relocalization tasks.
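To make the precision loss of k-majority clustering concrete, the toy example below computes a bitwise majority-vote centroid for two small clusters of 4-bit descriptors. It is an illustrative sketch only: the helper name `k_majority_centroid`, the tie rule (ties round to 1), and the toy data are assumptions, not taken from the paper or from ORB-SLAM's code.

```python
import numpy as np

def k_majority_centroid(bits):
    """Bitwise majority vote over cluster members (rows of 0/1 bits).
    Ties round to 1 here, which is an assumption made for this sketch."""
    return (bits.mean(axis=0) >= 0.5).astype(np.uint8)

# Two toy clusters of 4-bit descriptors with different bit statistics.
cluster_a = np.array([[1, 0, 1, 0],
                      [1, 0, 1, 1],
                      [1, 1, 1, 0]], dtype=np.uint8)
cluster_b = np.array([[1, 0, 1, 0],
                      [1, 0, 0, 0],
                      [0, 0, 1, 0]], dtype=np.uint8)

# Real-valued centroids keep the per-bit frequencies.
mean_a = cluster_a.mean(axis=0)  # [1, 1/3, 1, 1/3]
mean_b = cluster_b.mean(axis=0)  # [2/3, 0, 2/3, 0]

# Majority-vote centroids collapse both clusters to the same word, 1010.
maj_a = k_majority_centroid(cluster_a)
maj_b = k_majority_centroid(cluster_b)
```

Both clusters round to the same binary centroid even though their real-valued means differ in every bit position. In a hierarchical vocabulary, any child node clustered around such a rounded centroid inherits that loss, which is the kind of error accumulation the abstract describes.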