Low-Latency VR Telepresence for Remote Inspection in Fence-Free Collaborative Manufacturing
Stanislav Svediroh
AI summary
Problem
Existing robotic telepresence systems are proprietary, hardware-locked, and lack visibility into their latency pipelines, making it difficult to diagnose performance issues or integrate with new platforms in collaborative manufacturing.
Approach
The authors developed a robot-agnostic framework that streams hardware-accelerated stereoscopic video from a mobile robot to a Meta Quest headset, using head pose to control the robot camera and embedding per-frame timestamps to continuously measure and display pipeline latency.
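One way to picture the head-pose-to-camera mapping is the minimal sketch below. It is an illustration, not the authors' implementation: the function name `head_pose_to_pan_tilt`, the angle limits, and the coordinate convention (unit quaternion in `(w, x, y, z)` order, +Y up, -Z forward, as in OpenXR) are all assumptions.

```python
import math


def head_pose_to_pan_tilt(q, pan_limit=math.radians(90), tilt_limit=math.radians(45)):
    """Convert a head orientation quaternion into clamped pan/tilt angles (radians).

    Assumed convention (not taken from the paper): q = (w, x, y, z),
    +Y is up, -Z is forward, positive pan turns left, positive tilt looks up.
    """
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    w, x, y, z = (c / norm for c in (w, x, y, z))

    # Forward vector: the rotated (0, 0, -1), i.e. minus the third
    # column of the rotation matrix built from the quaternion.
    fx = -2.0 * (x * z + w * y)
    fy = -2.0 * (y * z - w * x)
    fz = -(1.0 - 2.0 * (x * x + y * y))

    pan = math.atan2(-fx, -fz)
    tilt = math.asin(max(-1.0, min(1.0, fy)))

    # Clamp to hypothetical mechanical limits of the camera mount.
    pan = max(-pan_limit, min(pan_limit, pan))
    tilt = max(-tilt_limit, min(tilt_limit, tilt))
    return pan, tilt
```

Working from the forward vector rather than from Euler angles discards head roll, which a pan/tilt mount could not reproduce anyway.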
Key results
- Robot-agnostic architecture requiring only a single adapter class for new platforms
- Built-in per-frame end-to-end latency instrumentation via RTP header extensions
- Glass-to-glass latency of approximately 120 ms over 5 GHz WiFi
- Runtime-reconfigurable streaming parameters accessible via an in-VR GUI
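The single-adapter-class idea from the first item can be sketched as an abstract interface that each robot platform implements. Every name here (`RobotAdapter`, `set_camera_orientation`, `stop`, `RecordingAdapter`) is hypothetical and chosen for illustration; the framework's real interface may look different.

```python
from abc import ABC, abstractmethod


class RobotAdapter(ABC):
    """Hypothetical platform interface: the only robot-specific code."""

    @abstractmethod
    def set_camera_orientation(self, pan_rad: float, tilt_rad: float) -> None:
        """Point the robot's camera at the requested pan/tilt angles."""

    @abstractmethod
    def stop(self) -> None:
        """Bring the robot to a safe state when the session ends."""


class RecordingAdapter(RobotAdapter):
    """Stand-in adapter that records commands instead of driving hardware."""

    def __init__(self) -> None:
        self.commands = []

    def set_camera_orientation(self, pan_rad: float, tilt_rad: float) -> None:
        self.commands.append(("orient", pan_rad, tilt_rad))

    def stop(self) -> None:
        self.commands.append(("stop",))


def run_session(adapter: RobotAdapter, poses) -> None:
    """Platform-independent loop: forwards each (pan, tilt) pair, then stops."""
    try:
        for pan_rad, tilt_rad in poses:
            adapter.set_camera_orientation(pan_rad, tilt_rad)
    finally:
        adapter.stop()
```

Under this design, the streaming and control code depends only on `RobotAdapter`, so supporting a new platform means writing one subclass.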
Why it matters
The framework provides transparent, low-latency remote inspection for safe human-robot collaboration and can be deployed quickly across diverse mobile robot platforms.
Abstract
Fence-free collaborative manufacturing, where humans and industrial machines share workspace without physical barriers, requires reliable safety monitoring. Mobile inspection robots can patrol autonomously, but when anomalies are detected (such as unauthorized personnel or safety zone violations), a human operator must rapidly assess the situation through immersive remote control. We present a fully open-source VR framework enabling low-latency stereoscopic video streaming from a mobile robot to a standalone Meta Quest headset. The system supports stereo and mono video modes with hardware-accelerated encoding (H.264/H.265) on NVIDIA Jetson and hardware-accelerated decoding on the headset. Head-coupled camera control maps the operator's gaze to the robot's camera, providing intuitive situational awareness during remote inspection. A key contribution is built-in end-to-end latency instrumentation: per-frame timestamps embedded in RTP header extensions enable continuous monitoring of each pipeline stage from camera capture to photon emission. Measured glass-to-glass latency is approximately 120 ms over 5 GHz WiFi. The robot-agnostic architecture requires only a thin adapter layer for integration with any platform. The framework, validated on Boston Dynamics Spot, is publicly available as open source.
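To make the per-frame timestamp mechanism concrete, the sketch below packs a capture timestamp into an RTP header extension block using the one-byte header format of RFC 8285 and reads it back on the receiver. The payload layout (a single 64-bit microsecond timestamp under extension ID 1) and all function names are assumptions for illustration, since the abstract does not specify the framework's wire format. The latency calculation also assumes that the sender and receiver clocks are synchronized.

```python
import struct

ONE_BYTE_PROFILE = 0xBEDE  # RFC 8285 marker for one-byte extension headers


def pack_timestamp_extension(capture_us: int, ext_id: int = 1) -> bytes:
    """Build an extension block carrying one 64-bit capture timestamp."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs must be in 1..14")
    # Element header: 4-bit ID, then 4-bit (payload length - 1).
    element = bytes([(ext_id << 4) | (8 - 1)]) + struct.pack("!Q", capture_us)
    element += b"\x00" * ((-len(element)) % 4)  # pad to a 32-bit boundary
    return struct.pack("!HH", ONE_BYTE_PROFILE, len(element) // 4) + element


def unpack_timestamp_extension(block: bytes, ext_id: int = 1):
    """Return the timestamp stored under ext_id, or None if it is absent."""
    profile, words = struct.unpack_from("!HH", block, 0)
    if profile != ONE_BYTE_PROFILE:
        raise ValueError("not a one-byte-header extension block")
    data = block[4:4 + words * 4]
    i = 0
    while i < len(data):
        if data[i] == 0:  # padding byte
            i += 1
            continue
        found_id = data[i] >> 4
        length = (data[i] & 0x0F) + 1
        if found_id == 15:  # reserved ID: stop parsing
            break
        if found_id == ext_id and length == 8:
            return struct.unpack_from("!Q", data, i + 1)[0]
        i += 1 + length
    return None


def latency_ms(capture_us: int, display_us: int) -> float:
    """Capture-to-display latency, assuming synchronized clocks."""
    return (display_us - capture_us) / 1000.0
```

In a real pipeline, one such timestamp per stage (capture, encode, send, receive, decode, display) would let the receiver report a per-stage breakdown rather than a single end-to-end figure.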