ICRA 2026

Category-Level Object Shape and Pose Estimation in Less Than a Millisecond

Lorenzo Shaikewitz, Tim Nguyen, Luca Carlone

AI summary

A quaternion-based self-consistent field solver estimates category-level object shape and pose, with each solver iteration taking about 100 microseconds, and provides a fast certificate of global optimality.
category-level pose estimation, self-consistent field iteration, quaternion optimization, global optimality certificate, real-time robotics, shape and pose estimation

Problem

Robotic applications require fast and reliable object shape and pose estimation, but existing category-level methods struggle to balance computational speed with rigorous optimality guarantees.

Approach

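The authors write the first-order optimality conditions of the joint position, orientation, and shape estimation problem in unit quaternions, which yields an eigenvalue problem with eigenvector nonlinearities, and solve it efficiently with self-consistent field iteration. They complement this with a lightweight global optimality certificate obtained by solving a linear system for the corresponding Lagrange multipliers.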
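As a rough illustration of the self-consistent field idea, the Python sketch below repeatedly replaces q with the minimum eigenvector of a 4 × 4 matrix M(q) until q stops changing. The names scf_min_eigvec and toy_matrix and the constants A, B, and ALPHA are assumptions made for this example only; the matrix the paper builds from keypoint measurements and the shape model is not reproduced here.

```python
import numpy as np

# Toy data (hypothetical): a fixed symmetric part A and a q-dependent part.
A = np.diag([1.0, 2.0, 3.0, 4.0])
B = np.eye(4) + 0.5 * (np.eye(4, k=1) + np.eye(4, k=-1))
ALPHA = 0.1


def toy_matrix(q):
    """Symmetric 4x4 matrix that depends on q: M(q) = A + ALPHA * (q^T B q) * B."""
    return A + ALPHA * (q @ B @ q) * B


def scf_min_eigvec(build_matrix, q0, tol=1e-10, max_iter=100):
    """Fixed-point iteration for M(q) q = lambda q with ||q|| = 1.

    Each pass builds M at the current q and takes the eigenvector of the
    smallest eigenvalue as the next q.
    """
    q = q0 / np.linalg.norm(q0)
    for it in range(1, max_iter + 1):
        M = build_matrix(q)
        _, eigvecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
        q_new = eigvecs[:, 0]
        if q_new @ q < 0:  # q and -q are the same solution; keep signs aligned
            q_new = -q_new
        step = np.linalg.norm(q_new - q)
        q = q_new
        if step < tol:
            break
    M = build_matrix(q)
    lam = q @ M @ q
    residual = np.linalg.norm(M @ q - lam * q)
    return q, lam, residual, it


q_star, lam_star, residual, iters = scf_min_eigvec(toy_matrix, np.ones(4))
print(f"iterations: {iters}, residual: {residual:.2e}")
```

The toy coupling is deliberately weak so that the iteration settles quickly; convergence of self-consistent field iteration in general depends on the problem and is not guaranteed by this sketch.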

Key results

  • Each self-consistent field iteration runs in about 100 microseconds, requiring only a 4 × 4 matrix and its minimum eigenvalue-eigenvector pair
  • Fast global optimality certificate obtained by solving a linear system for the Lagrange multipliers
  • Tested on synthetic data, two public real-world datasets, and a drone tracking scenario
  • Outperforms Gauss-Newton, Levenberg-Marquardt, and SDP baselines in speed

Why it matters

Enables real-time, certifiably optimal object understanding for latency-sensitive robotics tasks like autonomous driving and drone tracking.

Abstract

Object shape and pose estimation is a foundational robotics problem, supporting tasks from manipulation to scene understanding and navigation. We present a fast local solver for shape and pose estimation which requires only category-level object priors and admits an efficient certificate of global optimality. Given an RGB-D image of an object, we use a learned front-end to detect sparse, category-level semantic keypoints on the target object. We represent the target object’s unknown shape using a linear active shape model and pose a maximum a posteriori optimization problem to solve for position, orientation, and shape simultaneously. Expressed in unit quaternions, this problem admits first-order optimality conditions in the form of an eigenvalue problem with eigenvector nonlinearities. Our primary contribution is to solve this problem efficiently with self-consistent field iteration, which only requires computing a 4 × 4 matrix and finding its minimum eigenvalue-vector pair at each iterate. Solving a linear system for the corresponding Lagrange multipliers gives a simple global optimality certificate. One iteration of our solver runs in about 100 microseconds, enabling fast outlier rejection. We test our method on synthetic data and a variety of real-world settings, including two public datasets and a drone tracking scenario.
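To make the "4 × 4 matrix and minimum eigenvalue-vector pair" step more concrete, the Python sketch below covers only the simpler special case in which the object shape is already known. In that case the matrix does not depend on the quaternion, so a single eigendecomposition suffices (Horn's classical closed-form alignment). This is not the paper's solver, and the function names and synthetic keypoints are assumptions made for illustration.

```python
import numpy as np


def quat_to_rot(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def align_fixed_shape(src, dst):
    """Find R, t minimizing sum_i ||dst_i - (R src_i + t)||^2 for known shape src."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets: S[a, b] = sum_i src_i[a] * dst_i[b]
    S = (src - src_mean).T @ (dst - dst_mean)
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ])
    # The best quaternion maximizes q^T N q, i.e. it is the minimum eigenvector of -N.
    _, eigvecs = np.linalg.eigh(-N)
    q = eigvecs[:, 0]
    R = quat_to_rot(q)
    t = dst_mean - R @ src_mean
    return R, t, q


# Synthetic, noise-free keypoints (hypothetical data for the example).
rng = np.random.default_rng(0)
src = rng.normal(size=(8, 3))
R_true = quat_to_rot(np.array([0.5, 0.5, -0.5, 0.5]))
t_true = np.array([0.1, -0.2, 0.3])
dst = src @ R_true.T + t_true

R_est, t_est, q_est = align_fixed_shape(src, dst)
print("rotation error:", np.linalg.norm(R_est - R_true))
```

When the shape is unknown, as in the abstract, the matrix depends on the current estimate, which is where the eigenvector nonlinearity and the need for an iterative scheme come from.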

Index terms

Optimization and Optimal Control, Perception for Grasping and Manipulation, RGB-D Perception

Related papers