Class-Agnostic Robotic Gaze Control Via Fast Normalized Cut
Andrej Lucny, Branislav Zigo, Igor Farkaš
AI summary
Problem
Robotic gaze control often depends on computationally intensive or class-specific detection models, hindering real-time deployment and generalization to arbitrary objects.
Approach
The method recursively splits deep feature maps using a linear-scaling Fast Normalized Cut algorithm, halting splits when feature similarity exceeds a threshold to produce object masks for gaze tracking.
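The recursive splitting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it substitutes a standard dense spectral normalized cut (the second eigenvector of the normalized Laplacian) for the Fast Normalized Cut, so it does not scale linearly. The cosine affinity, the mean-pairwise-similarity stopping test, the 0.9 threshold, and the names `bipartition` and `segment` are all assumptions made for illustration.

```python
import numpy as np


def bipartition(features):
    """Split the rows of `features` (n, d) into two groups.

    Assumption: this is a textbook spectral normalized cut, used here as a
    stand-in for the paper's Fast Normalized Cut. Rows must be non-zero.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = np.clip(f @ f.T, 0.0, None)            # non-negative cosine affinity
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))  # degrees are > 0 (self-affinity is 1)
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)              # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt          # second generalized eigenvector
    return fiedler > 0                         # boolean side label for each row


def segment(features, indices=None, threshold=0.9, min_size=2):
    """Recursively bipartition until each segment is internally similar.

    Returns a list of index arrays, one per segment (a flattened mask).
    The stopping test (mean pairwise cosine similarity >= threshold) is a
    hypothetical choice; the source only states that a threshold is used.
    """
    if indices is None:
        indices = np.arange(len(features))
    f = features[indices]
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    similarity = float((f @ f.T).mean())
    if similarity >= threshold or len(indices) <= min_size:
        return [indices]
    side = bipartition(features[indices])
    if side.all() or not side.any():           # degenerate cut: stop splitting
        return [indices]
    return (segment(features, indices[side], threshold, min_size)
            + segment(features, indices[~side], threshold, min_size))


# Toy usage: 20 synthetic 2-D "patch features" drawn around two directions.
rng = np.random.default_rng(0)
group_a = np.array([1.0, 0.1]) + 0.01 * rng.standard_normal((10, 2))
group_b = np.array([0.1, 1.0]) + 0.01 * rng.standard_normal((10, 2))
masks = segment(np.vstack([group_a, group_b]))
print(len(masks), [m.tolist() for m in masks])
```

In a real pipeline the rows would be the patch embeddings of a foundation model's feature map, and each returned index array would be reshaped back onto the patch grid to form an object mask.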
Key results
- Achieves 6 fps gaze tracking on a standard gaming notebook
- Generates fine-grained, class-agnostic object masks in real time
- Provides stable object descriptors for distinguishing multiple items
- Demonstrates effective robot head following without predefined object classes
Why it matters
Enables efficient robotic perception on modest hardware that is not tied to predefined object classes, which may benefit real-time autonomous navigation and human-robot interaction.
Abstract
We present an application of a new algorithm for estimating the minimal normalized cut to the control of robotic gaze. We recursively apply the bipartition of the feature map provided by a foundation model, measuring when to stop and return object masks. We find this approach useful, stable, and capable of running in real time.