ICRA 2026

DynOPETs: A Versatile Benchmark for Dynamic Object Pose Estimation and Tracking in Moving Camera Scenarios

Xiangting Meng, Jiaqi Yang, Mingshu Chen, Chenxin Yan, Yujiao Shi, Wenchao Ding, Laurent Kneip

AI summary

DynOPETs bridges the critical gap in dynamic object pose estimation by providing a large-scale real-world dataset and an efficient automated annotation pipeline for moving-camera scenarios.

Dynamic object pose estimation, Moving camera scenarios, Automated pose annotation, RGB-D benchmark, Pose graph optimization, Embodied AI

Problem

Existing real-world datasets predominantly assume static cameras or stationary objects, leaving a critical gap for scenarios with simultaneous camera ego-motion and object dynamics due to the inherent difficulty of accurate pose annotation.

Approach

The authors developed a novel data acquisition and annotation pipeline that fuses absolute pose estimation, relative point tracking, and pose graph optimization to automatically generate high-quality 6-DoF pose annotations for dynamic objects captured by moving RGB-D cameras.
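The fusion idea can be illustrated with a deliberately small toy. This is a sketch only, not the authors' pipeline: it replaces each 6-DoF pose with a single scalar state, treats per-frame absolute estimates as unary constraints and frame-to-frame tracking results as binary constraints, and solves the resulting linear least-squares problem on a chain. The function name `fuse_absolute_and_relative` and the weights `w_abs` and `w_rel` are hypothetical choices made for this illustration.

```python
def fuse_absolute_and_relative(absolute, relative, w_abs=1.0, w_rel=10.0):
    """Least-squares fusion on a chain of scalar states (toy stand-in for poses).

    Minimises  sum_i w_abs * (x[i] - absolute[i])**2
             + sum_i w_rel * (x[i+1] - x[i] - relative[i])**2
    """
    n = len(absolute)
    if len(relative) != n - 1:
        raise ValueError("need one relative measurement per consecutive pair")

    # Assemble the tridiagonal normal equations A x = rhs.
    diag = [w_abs] * n
    rhs = [w_abs * a for a in absolute]
    for i, r in enumerate(relative):
        diag[i] += w_rel
        diag[i + 1] += w_rel
        rhs[i] -= w_rel * r
        rhs[i + 1] += w_rel * r
    off = -w_rel  # every off-diagonal entry of A

    # Thomas algorithm: forward elimination, then back substitution.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - off * c[i - 1]
        c[i] = off / m
        d[i] = (rhs[i] - off * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Frame 2 has a poor absolute estimate (2.9 instead of 2.0);
# the relative measurements say consecutive frames differ by 1.0.
absolute = [0.0, 1.0, 2.9, 3.0, 4.0]
relative = [1.0, 1.0, 1.0, 1.0]
fused = fuse_absolute_and_relative(absolute, relative)
print(fused)
```

In this toy the relative constraints pull the outlying frame back toward its neighbours, at the cost of spreading a small part of the error over the other frames. A real pose graph works on SE(3) with nonlinear residuals and is typically solved iteratively, so the closed-form tridiagonal solve here applies only to the scalar simplification.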

Key results

A novel automated annotation pipeline fusing absolute pose estimation, relative tracking, and graph optimization
DynOPETs dataset comprising 175 RGB-D sequences of dynamic objects with synchronized 6-DoF camera and object poses
Comprehensive benchmark of 19 state-of-the-art instance-level, unseen-object, and category-level pose estimation methods
Open-source release of the dataset, code, and annotations to accelerate research in embodied AI and AR/MR
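When a sequence provides both a camera pose and an object pose per frame, the object pose relative to the camera follows by composing the two. The sketch below assumes both poses are 4x4 homogeneous matrices that map camera and object coordinates into a shared world frame. That convention, and the names `T_world_cam` and `T_world_obj`, are assumptions for illustration; the dataset's actual format is not described here.

```python
def invert_rigid(T):
    """Invert a 4x4 rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    R_t = [[T[c][r] for c in range(3)] for r in range(3)]  # transpose of R
    t = [T[r][3] for r in range(3)]
    t_inv = [-sum(R_t[r][k] * t[k] for k in range(3)) for r in range(3)]
    return [R_t[r] + [t_inv[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def compose(A, B):
    """Matrix product of two 4x4 transforms stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def object_in_camera(T_world_cam, T_world_obj):
    """Object pose in the camera frame: inverse(T_world_cam) * T_world_obj."""
    return compose(invert_rigid(T_world_cam), T_world_obj)


# Camera at (1, 0, 0), rotated 90 degrees about the world z-axis.
T_world_cam = [[0.0, -1.0, 0.0, 1.0],
               [1.0,  0.0, 0.0, 0.0],
               [0.0,  0.0, 1.0, 0.0],
               [0.0,  0.0, 0.0, 1.0]]
# Object at (1, 2, 0), axis-aligned with the world frame.
T_world_obj = [[1.0, 0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0, 2.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]
T_cam_obj = object_in_camera(T_world_cam, T_world_obj)
```

Because both the camera and the object move, this relative pose changes from frame to frame even if one of the two world-frame poses happens to stay fixed.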

Why it matters

Provides the first large-scale real-world benchmark for dynamic pose estimation with moving cameras, enabling researchers and developers in robotics, AR/MR, and embodied AI to train and evaluate robust models.

Abstract

In the realm of object pose estimation, scenarios involving both dynamic objects and moving cameras are prevalent. However, the scarcity of corresponding real-world datasets significantly hinders the development and evaluation of robust pose estimation models. This is largely attributed to the inherent challenges in accurately annotating object poses in dynamic scenes captured by moving cameras. To bridge this gap, this paper presents a novel dataset DynOPETs and a dedicated data acquisition and annotation pipeline tailored for object pose estimation and tracking in such unconstrained environments. Our efficient annotation method innovatively integrates pose estimation and pose tracking techniques to generate pseudo-labels, which are subsequently refined through pose graph optimization. The resulting dataset offers accurate pose annotations for dynamic objects observed from moving cameras. To validate the effectiveness and value of our dataset, we perform comprehensive evaluations using 19 state-of-the-art methods, demonstrating its potential to accelerate research in this challenging domain. The dataset will be made publicly available to facilitate further exploration and advancement in the field.

Index terms

Data Sets for Robotic Vision; Perception for Grasping and Manipulation; Object Detection, Segmentation and Categorization