STEAM-LIVO: Spatio-Temporally Adaptive Manifold Lidar-Inertial-Visual Odometry for Sensor Degradation in Unstructured Natural Aquatic-Terrestrial Scenes
Yubo Guo, Gang Peng, Jialuo Li, Hai-Tao Zhang
AI summary
Problem
Autonomous systems struggle with localization in unstructured natural environments due to cross-modal sensor degradation (e.g., LiDAR sparsity, visual feature loss) and asynchronous sensor data, which cause drift or failure in existing SLAM frameworks.
Approach
The method uses an IMU-centric iterative error-state Kalman filter on Lie group manifolds to jointly optimize LiDAR point-to-plane and visual reprojection residuals, while a historical state-covariance buffer corrects delayed or out-of-sequence measurements without re-integrating IMU data.
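The two residual types named above can be illustrated with a minimal Python sketch. It is not taken from the STEAM-LIVO code: the function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the simple concatenation of residuals are assumptions made for illustration only.

```python
import numpy as np

def point_to_plane_residual(p_world, plane_normal, plane_point):
    """Signed distance from a LiDAR point (world frame) to a locally fitted plane."""
    n = plane_normal / np.linalg.norm(plane_normal)  # normalize so the result is metric
    return float(n @ (p_world - plane_point))

def reprojection_residual(p_cam, pixel_obs, fx, fy, cx, cy):
    """Pixel error between a projected 3-D point (camera frame) and its observation.

    Assumes an undistorted pinhole camera; a real system would also model distortion.
    """
    x, y, z = p_cam
    projected = np.array([fx * x / z + cx, fy * y / z + cy])
    return projected - np.asarray(pixel_obs, dtype=float)

def stack_residuals(lidar_res, visual_res):
    """Concatenate both residual types into one vector for a shared filter update."""
    return np.concatenate([np.atleast_1d(lidar_res), np.ravel(visual_res)])

if __name__ == "__main__":
    r_l = point_to_plane_residual(np.array([0.0, 0.0, 2.0]),
                                  np.array([0.0, 0.0, 2.0]),
                                  np.zeros(3))
    r_v = reprojection_residual(np.array([1.0, 2.0, 4.0]),
                                [74.0, 112.0], 100.0, 100.0, 50.0, 60.0)
    print(stack_residuals(r_l, r_v))
```

In a filter of this kind, each residual would also need its Jacobian with respect to the error state and a measurement noise weight; both are omitted here to keep the sketch short.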
Key results
- 1.77% average relative pose error across terrestrial and aquatic benchmarks
- Sustained trajectory continuity during partial LiDAR and visual sensor failures
- 49.25% error reduction from the out-of-sequence measurement correction mechanism
- Real-time performance maintained without re-integrating IMU sequences for delayed data
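The out-of-sequence results above rely on a buffer of historical states and covariances. The sketch below shows only the buffering and timestamp lookup. The class names, fields, and buffer length are hypothetical, and the actual correction step used by the authors is not described in this summary, so it is not reproduced here.

```python
import bisect
from dataclasses import dataclass
import numpy as np

@dataclass
class StateSnapshot:
    stamp: float       # time of the stored estimate, in seconds
    state: np.ndarray  # state vector (layout is an assumption)
    cov: np.ndarray    # covariance matching the state vector

class StateBuffer:
    """Fixed-length history of filter states, ordered by timestamp."""

    def __init__(self, max_len=200):
        self.max_len = max_len
        self._stamps = []
        self._snaps = []

    def push(self, snap):
        # Assumes snapshots are pushed in increasing time order,
        # e.g. once per IMU propagation step.
        self._stamps.append(snap.stamp)
        self._snaps.append(snap)
        if len(self._snaps) > self.max_len:
            self._stamps.pop(0)
            self._snaps.pop(0)

    def lookup(self, stamp):
        """Return the latest snapshot with stamp <= query, or None if too old."""
        i = bisect.bisect_right(self._stamps, stamp) - 1
        return self._snaps[i] if i >= 0 else None

if __name__ == "__main__":
    buf = StateBuffer(max_len=3)
    for t in (0.0, 0.1, 0.2):
        buf.push(StateSnapshot(t, np.zeros(3), np.eye(3)))
    late = buf.lookup(0.15)  # a measurement that arrived late
    print(late.stamp if late else None)
```

Keeping the covariance alongside each state is what would let a delayed measurement be applied against the estimate that was valid at its timestamp, rather than replaying the IMU sequence from that point.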
Why it matters
Enables reliable autonomous navigation for unmanned aerial and surface vehicles in harsh, feature-scarce wilderness and aquatic domains where conventional SLAM fails.
Abstract
Sensor degradation in unstructured natural environments, manifesting as LiDAR point cloud sparsity or visual feature dropout, and out-of-sequence measurement challenges critically undermine localization robustness in autonomous systems. To address these limitations, we present STEAM-LIVO, a Spatio-Temporally Adaptive Manifold LiDAR-Inertial-Visual Odometry framework that enables tightly coupled multi-sensor fusion via a spatio-temporal manifold-driven iterative Kalman filter. The proposed method formulates an error-state iterative update mechanism on Lie group manifolds, executes IMU-centric real-time estimation, and ensures resilience under sensor degradation through an incremental observation model integrating LiDAR point-to-plane geometric residuals with visual feature reprojection errors within a shared filtering framework. Comprehensive evaluations in vegetated terrestrial landscapes and dynamic aquatic surfaces demonstrate an average relative pose error of 1.77%, with sustained robustness during partial sensor failures. Rigorous ablation studies further corroborate the efficacy of our spatio-temporal adaptive manifold architecture. Our implementation is publicly available and can be accessed at https://github.com/STEAM-LIVO/STEAM-LIVO.git.
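An error-state update on a Lie group manifold applies a small tangent-space correction to the current estimate through the exponential map. The sketch below shows this for the rotation part only, using the standard Rodrigues formula on SO(3). The function names and the right-multiplication convention are assumptions, not details confirmed by the abstract.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Exponential map from a rotation vector to a rotation matrix (Rodrigues)."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        # First-order approximation avoids division by a near-zero angle.
        return np.eye(3) + hat(w)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def boxplus(R, delta):
    """Apply a tangent-space correction to a rotation (right-multiplication assumed)."""
    return R @ so3_exp(delta)

if __name__ == "__main__":
    R = boxplus(np.eye(3), [0.0, 0.0, np.pi / 2])
    print(np.round(R, 6))
```

Because the correction lives in the three-dimensional tangent space, the filter covariance stays minimal and the updated rotation remains a valid rotation matrix without re-normalization.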