ICRA 2026

Tightly-Coupled Dynamic Object Tracking and RGB-D Inertial Odometry Estimation with Dual Quadrics

Toyozo Shimada, Kenji Koide, Aoki Takanose, Shuji Oishi, Masashi Yokozuka, and Jun Miura

AI summary

Tightly coupling RGB-D registration, IMU data, and dual quadric object tracking in a single factor graph enables robust odometry even when dynamic objects block most of the sensor view.
Dynamic SLAM, RGB-D Odometry, Dual Quadrics, Factor Graph Optimization, Object Tracking, Inertial-Visual Fusion

Problem

Standard RGB-D odometry fails in dynamic environments because moving objects create geometric outliers and reduce the overlap between point clouds. Naively removing dynamic points discards too much data and can make the estimation degenerate.

Approach

The method represents surrounding objects as dual quadrics, continuously tracks their pose and velocity, and removes dynamic points from the depth cloud while jointly optimizing sensor and object states in a sliding-window factor graph.
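As a rough illustration of the dual quadric representation, the sketch below builds the dual quadric of an ellipsoid and projects it to an axis-aligned image bounding box using the standard projective relation C* = P Q* Pᵀ. The function names, the ellipsoid parameterization, and the pinhole setup are assumptions made for illustration; this summary does not give the paper's actual parameterization or bounding box factor.

```python
import numpy as np

def dual_quadric(pose, semi_axes):
    """Dual quadric Q* (4x4) of an ellipsoid.

    pose      : 4x4 homogeneous transform of the ellipsoid in the reference frame
    semi_axes : (a, b, c) semi-axis lengths (hypothetical parameterization)
    """
    a, b, c = semi_axes
    canonical = np.diag([a * a, b * b, c * c, -1.0])
    return pose @ canonical @ pose.T

def project_to_bbox(Q_dual, P):
    """Project a dual quadric with a 3x4 camera matrix P and return
    (u_min, v_min, u_max, v_max) of the resulting conic."""
    C = P @ Q_dual @ P.T  # dual conic, 3x3 symmetric
    # Lines x = u and y = v tangent to the conic satisfy l^T C l = 0,
    # which gives one quadratic in u and one in v.
    disc_u = C[0, 2] ** 2 - C[0, 0] * C[2, 2]
    disc_v = C[1, 2] ** 2 - C[1, 1] * C[2, 2]
    if abs(C[2, 2]) < 1e-12 or disc_u < 0 or disc_v < 0:
        raise ValueError("quadric does not project to a bounded ellipse")
    signs = np.array([-1.0, 1.0])
    u = (C[0, 2] + signs * np.sqrt(disc_u)) / C[2, 2]
    v = (C[1, 2] + signs * np.sqrt(disc_v)) / C[2, 2]
    return u.min(), v.min(), u.max(), v.max()
```

In a bounding box factor of this kind, the residual would typically compare the projected box against the box reported by the object detector; how the paper defines that residual is not stated here.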

Key results

  • Unified dual quadric representation for tracking arbitrary objects
  • Velocity-driven dynamic point removal integrated into factor graph optimization
  • Tight coupling of RGB-D registration, IMU preintegration, and visual bounding box constraints
  • Robust odometry performance in highly dynamic indoor scenes with heavy pedestrian traffic
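
The velocity-driven point removal listed above can be sketched as follows, assuming each tracked object is stored as an ellipsoid pose, semi-axes, and an estimated velocity. The dictionary layout, the `speed_threshold` and `margin` parameters, and their default values are hypothetical; the summary does not say how the paper decides that an object is dynamic.

```python
import numpy as np

def remove_dynamic_points(points, objects, speed_threshold=0.1, margin=1.1):
    """Drop points that fall inside ellipsoids of objects judged to be moving.

    points  : (N, 3) array of points in the reference frame
    objects : list of dicts with keys "pose" (4x4), "semi_axes" (3,),
              and "velocity" (3,)  (hypothetical layout)
    """
    keep = np.ones(len(points), dtype=bool)
    for obj in objects:
        # Treat slow objects as static and leave their points in the cloud.
        if np.linalg.norm(obj["velocity"]) < speed_threshold:
            continue
        # Express the points in the object's own frame.
        T_inv = np.linalg.inv(obj["pose"])
        local = points @ T_inv[:3, :3].T + T_inv[:3, 3]
        # Inside test against the ellipsoid, inflated slightly by `margin`.
        scaled = local / (margin * np.asarray(obj["semi_axes"], dtype=float))
        inside = np.sum(scaled ** 2, axis=1) <= 1.0
        keep &= ~inside
    return points[keep]
```

Keeping points on objects that are currently static is what distinguishes this from removing every detected object, which is the data-discarding behavior described under Problem.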

Why it matters

Provides a reliable navigation foundation for autonomous robots operating in crowded, real-world settings where traditional static assumptions break down.

Abstract

This paper presents a robust RGB-D Inertial-Object odometry estimation framework for dynamic environments. The proposed method represents and tracks the surrounding objects as dual quadrics and estimates sensor pose and object parameters (pose, shape, and velocity) by jointly minimizing the depth point cloud registration factor, IMU preintegration factor, and visual object detection factor (bounding box factor). The use of the dual quadric representation enables handling detection of arbitrary objects (e.g., pedestrians, chairs and boxes) in a unified optimization framework. By continuously tracking objects across multiple frames, we identify points belonging to dynamic objects and remove them from the input point cloud. Although this point removal process inherently drops information in the input point cloud, the tight coupling of the object detection factor and the point cloud registration factor mitigates accuracy degradation. The experimental results showed that the proposed method enables robust and accurate odometry estimation in extremely dynamic situations, cases where a large part of the sensor view is occupied by walking pedestrians.
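
The joint minimization described in the abstract can be written schematically as below. This is only a sketch: the symbols are placeholders chosen here (sensor states X, object parameters O, and one residual per factor type), covariance weighting is omitted, and the paper's actual formulation may include further terms that the abstract does not mention.

```latex
\min_{\mathcal{X},\,\mathcal{O}} \;
  \sum_{(i,j)} \bigl\| e^{\mathrm{reg}}_{ij}(x_i, x_j) \bigr\|^2
  \;+\; \sum_{i} \bigl\| e^{\mathrm{imu}}_{i,i+1}(x_i, x_{i+1}) \bigr\|^2
  \;+\; \sum_{i,k} \bigl\| e^{\mathrm{bbox}}_{ik}(x_i, o_k) \bigr\|^2
```

Here x_i is the sensor state at frame i and o_k collects the pose, shape, and velocity of object k. The bounding box term is the only one that depends on both a sensor state and an object, which is where the coupling between object tracking and odometry enters.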

Index terms

RGB-D Perception, Localization, Mapping