E2O-SLAM: A Hierarchical Visual SLAM Framework Using Edge-based and Object-level Representations
Eunseon Choi, Soohee Han
AI summary
Problem
Visual SLAM systems frequently fail in low-texture or illumination-varying environments: point-feature tracking becomes unreliable there, while purely semantic, object-level approaches supply only coarse geometric constraints.
Approach
The framework unifies point-level keypoints, mid-level organized edge structures, and high-level object semantics into a hierarchical pipeline that guides motion estimation and data association.
Key results
- Competitive relative trajectory error on TUM RGB-D sequences without global optimization
- Outperforms ORB-SLAM3 in relative trajectory error across multiple challenging indoor sequences
- Reliable relative motion estimation through hierarchical point-edge-object integration
- Unified RGB-D tracking pipeline leveraging Wasserstein-distance object association
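To make the idea of Wasserstein-distance object association concrete, here is a minimal sketch. It assumes each object is summarized as an axis-aligned 3D Gaussian (a centroid plus per-axis standard deviations). The object model, matching strategy, and threshold actually used by the system are not given in this summary, so `w2_squared_diag`, `associate`, and `max_dist` are hypothetical names and values chosen for illustration. For Gaussians with diagonal covariances, the squared 2-Wasserstein distance reduces to the squared distance between the means plus the squared distance between the standard-deviation vectors.

```python
import math


def w2_squared_diag(mean_a, std_a, mean_b, std_b):
    """Squared 2-Wasserstein distance between two axis-aligned Gaussians.

    Assumption: covariances are diagonal, so the general trace term
    collapses to the squared difference of the standard deviations.
    """
    d_mean = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    d_std = sum((a - b) ** 2 for a, b in zip(std_a, std_b))
    return d_mean + d_std


def associate(detections, landmarks, max_dist=0.5):
    """Greedy one-to-one matching of detections to map landmarks.

    Each object is a (mean, std) pair of 3-tuples. Returns a dict mapping
    detection index -> landmark index, or None when no landmark lies
    within max_dist (a hypothetical gating threshold, in metres).
    """
    pairs = []
    for i, (d_mean, d_std) in enumerate(detections):
        for j, (l_mean, l_std) in enumerate(landmarks):
            dist = math.sqrt(w2_squared_diag(d_mean, d_std, l_mean, l_std))
            pairs.append((dist, i, j))
    pairs.sort()  # closest candidate pairs are considered first

    matches = {i: None for i in range(len(detections))}
    used = set()
    for dist, i, j in pairs:
        if dist > max_dist:
            break  # all remaining pairs are farther than the gate
        if matches[i] is None and j not in used:
            matches[i] = j
            used.add(j)
    return matches


# Two landmarks already in the map, three detections in the current frame.
landmarks = [((0.0, 0.0, 1.0), (0.2, 0.2, 0.2)),
             ((2.0, 0.0, 1.0), (0.3, 0.3, 0.3))]
detections = [((2.05, 0.0, 1.0), (0.3, 0.3, 0.3)),
              ((0.0, 0.1, 1.0), (0.2, 0.2, 0.2)),
              ((5.0, 5.0, 5.0), (0.1, 0.1, 0.1))]
matches = associate(detections, landmarks)
```

Because the distance compares both position and extent, a detection whose centroid is near a landmark but whose size differs strongly is penalized. A detection left unmatched (value `None`) would be a candidate for a new landmark. A greedy matcher keeps the sketch short; an optimal assignment solver could replace it.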
Why it matters
Enables more robust and reliable robot navigation in texture-poor or illumination-varying environments where traditional SLAM struggles.
Abstract
In this paper, we present a hierarchical simultaneous localization and mapping (SLAM) system that leverages point-level features, mid-level geometric organized edge representations [1], and high-level object semantics within a unified framework. While object-level SLAM provides semantic information and improves long-term data association, it often suffers from coarse geometric constraints and unreliable detections. In contrast, organized edge representations capture rich structural and textural information, offering stable geometric cues in low-texture or challenging environments. By hierarchically integrating these complementary representations, the proposed system achieves robust camera tracking, reliable data association, and consistent mapping.