Toward Multimodal Liquid-Level Estimation for Closed-Loop Robotic Pouring
Hongyu Deng, He Chen
AI summary
Problem
Reliable closed-loop robotic pouring is hindered by two gaps: RGB-D cameras fail on transparent liquids, and existing vision or acoustic methods either incur high latency or infer the liquid level only indirectly.
Approach
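The authors propose a fast-slow architecture in which a Vision-Language Model handles high-level task reasoning and a sensor-driven fast system provides low-latency feedback. RadarEye, the first instantiation of the fast system, is a mmWave radar pipeline that localizes the liquid surface with angle-of-arrival/time-of-flight (AoA–ToF) beamforming and uses a physics-informed temporal tracker to suppress multipath interference, tracking the liquid level in real time.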
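To make the AoA–ToF geometry concrete, here is a minimal sketch of how one surface echo could be converted into a liquid level. It is an illustration, not the authors' pipeline: it assumes the radar looks straight down into the container from a known height above the container bottom, and the function name `level_from_aoa_tof` and the parameter `sensor_height_m` are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def level_from_aoa_tof(tof_s: float, aoa_rad: float, sensor_height_m: float) -> float:
    """Convert one echo (round-trip ToF, AoA from boresight) into a liquid level.

    Assumption (not from the paper): the radar boresight points straight down
    and sensor_height_m is measured from the container bottom.
    """
    slant_range_m = C * tof_s / 2.0               # round trip -> one-way range
    depth_m = slant_range_m * math.cos(aoa_rad)   # project onto the vertical axis
    return sensor_height_m - depth_m              # height of the surface above the bottom


# Example: an echo from 0.20 m straight below a radar mounted 0.30 m above the bottom.
tof = 2 * 0.20 / C
level_m = level_from_aoa_tof(tof, aoa_rad=0.0, sensor_height_m=0.30)
```

In this example the surface sits roughly 0.10 m above the container bottom. A real pipeline would first pick the surface echo out of the beamformed AoA–ToF spectrum, which this sketch does not attempt.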
Key results
- 0.35 cm median error during dynamic pouring
- 0.62 ms per-update latency
- 6× and 12× lower error than the RGB-D and ultrasonic baselines, respectively
- Training-free multipath suppression via physics-informed tracking
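As an illustration of what training-free, physics-informed tracking can look like, the sketch below gates candidate surface heights with two physical assumptions: the level does not fall during pouring, and it cannot rise faster than some bound per update. This is an assumed stand-in, not the tracker described by the authors; the class name and the values of `max_rise_cm` and `tol_cm` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MonotonicLevelTracker:
    """Hypothetical gate-based tracker; not the authors' implementation."""

    level_cm: float            # current estimate of the liquid level
    max_rise_cm: float = 0.5   # assumed maximum rise per update
    tol_cm: float = 0.1        # tolerance for noise just below the current level

    def update(self, candidates_cm: Iterable[float]) -> float:
        """Keep only physically plausible candidates, then take the nearest one."""
        lo = self.level_cm - self.tol_cm
        hi = self.level_cm + self.max_rise_cm
        gated = [c for c in candidates_cm if lo <= c <= hi]
        if gated:
            nearest = min(gated, key=lambda c: abs(c - self.level_cm))
            # Enforce the no-falling assumption on the stored estimate.
            self.level_cm = max(self.level_cm, nearest)
        # With no plausible candidate, hold the previous estimate.
        return self.level_cm


tracker = MonotonicLevelTracker(level_cm=2.0)
first = tracker.update([2.3, 7.9])    # 7.9 is rejected as a multipath-like outlier
second = tracker.update([0.4, 9.0])   # nothing plausible, so the estimate is held
```

Candidates that jump far from the current level, as multipath ghosts tend to, fall outside the gate and are ignored, so no learned model is needed for this step.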
Why it matters
Provides a robust, low-latency sensing solution for autonomous robotic manipulation of transparent liquids, critical for kitchen automation and industrial fluid handling.
Abstract
We consider the problem of real-time liquid-level estimation for closed-loop robotic pouring. To this end, we propose a fast-slow architecture where a Vision-Language Model handles high-level task reasoning and a sensor-driven fast system provides low-latency feedback. As a first instantiation of the fast system, we present RadarEye, a mmWave radar signal processing pipeline that tracks liquid level during pouring. RadarEye combines (i) AoA–ToF beamforming for liquid surface localization with (ii) a physics-informed tracker that suppresses multipath interference. In real-robot experiments, RadarEye achieves 0.35 cm median error at 0.62 ms per-update latency, outperforming vision and ultrasound baselines.
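The division of labor in the fast-slow architecture can be sketched as a simple control loop: a slow planner supplies a target level once, and a fast loop polls the level sensor and stops the pour when the target is reached. The code below is a toy simulation under that assumption; `pour_until`, `SimulatedCup`, and the fixed rise per step are hypothetical and stand in for the robot, the radar, and the Vision-Language Model.

```python
from typing import Callable


class SimulatedCup:
    """Toy stand-in for the container plus level sensor (hypothetical)."""

    def __init__(self, rise_per_step_cm: float = 0.05) -> None:
        self.level_cm = 0.0
        self.flow = 0.0
        self.rise_per_step_cm = rise_per_step_cm

    def set_flow(self, flow: float) -> None:
        self.flow = flow

    def read_level_cm(self) -> float:
        # Each sensor read advances the simulation by one step.
        self.level_cm += self.flow * self.rise_per_step_cm
        return self.level_cm


def pour_until(
    target_cm: float,
    read_level_cm: Callable[[], float],
    set_flow: Callable[[float], None],
    max_steps: int = 10_000,
) -> float:
    """Fast loop: pour until the measured level reaches the target."""
    for _ in range(max_steps):
        level = read_level_cm()
        if level >= target_cm:
            set_flow(0.0)   # stop pouring as soon as the target is reached
            return level
        set_flow(1.0)       # otherwise keep pouring
    set_flow(0.0)
    raise TimeoutError("target level not reached")


cup = SimulatedCup()
target_from_planner_cm = 3.0  # in the paper's framing, the slow system would supply this
final_level_cm = pour_until(target_from_planner_cm, cup.read_level_cm, cup.set_flow)
```

The overshoot in such a loop is bounded by how much the level rises between two sensor updates, which is why per-update latency matters for closed-loop pouring.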