ICRA 2026

Low-Latency VR Telepresence for Remote Inspection in Fence-Free Collaborative Manufacturing

Stanislav Svediroh


AI summary

An open-source VR framework delivers ~120 ms glass-to-glass stereoscopic telepresence from mobile robots to standalone headsets with continuous, per-frame latency monitoring.
VR telepresence, low-latency streaming, remote robot inspection, latency instrumentation, fence-free manufacturing, open-source robotics

Problem

Existing robotic telepresence systems are proprietary, hardware-locked, and lack visibility into their latency pipelines, making it difficult to diagnose performance issues or integrate with new platforms in collaborative manufacturing.

Approach

The authors developed a robot-agnostic framework that streams hardware-accelerated stereoscopic video from a mobile robot to a Meta Quest headset, using head pose to control the robot camera and embedding per-frame timestamps to continuously measure and display pipeline latency.
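
The head-coupled camera control described above can be illustrated with a minimal sketch. It assumes the headset reports orientation as a quaternion (w, x, y, z) in a Z-up, X-forward frame and that the robot camera accepts pan and tilt angles in degrees. The function name, the PanTiltLimits type, the sign conventions, and the limit values are hypothetical and are not taken from the framework.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PanTiltLimits:
    """Hypothetical mechanical limits of the robot camera, in degrees."""
    pan_deg: float = 170.0
    tilt_deg: float = 60.0


def head_pose_to_pan_tilt(w: float, x: float, y: float, z: float,
                          limits: PanTiltLimits = PanTiltLimits()):
    """Map a head orientation quaternion to clamped (pan, tilt) in degrees.

    Assumes a Z-up, X-forward frame: pan is rotation about Z (yaw),
    tilt is rotation about Y (pitch). Head roll is ignored.
    """
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalize so small tracking errors do not distort the angles.
    w, x, y, z = (c / norm for c in (w, x, y, z))

    # Standard ZYX Euler extraction (yaw, then pitch).
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)

    # Clamp to the camera's range so the command is always reachable.
    pan = max(-limits.pan_deg, min(limits.pan_deg, math.degrees(yaw)))
    tilt = max(-limits.tilt_deg, min(limits.tilt_deg, math.degrees(pitch)))
    return pan, tilt
```

Clamping keeps the command inside the camera's range when the operator turns further than the camera mount can follow; a real integration would likely also rate-limit or filter the command.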

Key results

  • Robot-agnostic architecture requiring only a single adapter class for new platforms
  • Built-in per-frame end-to-end latency instrumentation via RTP header extensions
  • Glass-to-glass latency of approximately 120 ms over 5 GHz WiFi
  • Runtime-reconfigurable streaming parameters accessible via an in-VR GUI
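
The per-frame instrumentation listed above relies on RTP header extensions. As a rough illustration, the sketch below packs a 64-bit capture timestamp into a one-byte-header extension block in the general RFC 8285 layout (0xBEDE profile marker, 4-bit ID, 4-bit length minus one, zero padding to a 32-bit boundary). The extension ID, the microsecond unit, and the function names are assumptions; this summary does not specify the framework's actual wire format.

```python
import struct
from typing import Optional

ONE_BYTE_PROFILE = 0xBEDE   # marker for RFC 8285 one-byte header extensions
CAPTURE_TS_EXT_ID = 1       # hypothetical local ID for the capture timestamp


def pack_timestamp_extension(capture_us: int,
                             ext_id: int = CAPTURE_TS_EXT_ID) -> bytes:
    """Build an RTP header extension block carrying one 64-bit timestamp."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs must be in 1..14")
    payload = struct.pack("!Q", capture_us)            # 8 bytes, network order
    element = bytes([(ext_id << 4) | (len(payload) - 1)]) + payload
    element += b"\x00" * ((-len(element)) % 4)         # pad to 32-bit boundary
    header = struct.pack("!HH", ONE_BYTE_PROFILE, len(element) // 4)
    return header + element


def unpack_timestamp_extension(data: bytes,
                               ext_id: int = CAPTURE_TS_EXT_ID) -> Optional[int]:
    """Return the timestamp stored under ext_id, or None if it is absent."""
    if len(data) < 4:
        raise ValueError("extension block too short")
    profile, words = struct.unpack_from("!HH", data, 0)
    if profile != ONE_BYTE_PROFILE:
        raise ValueError("not a one-byte header extension block")
    body = data[4:4 + words * 4]
    i = 0
    while i < len(body):
        eid = body[i] >> 4
        length = (body[i] & 0x0F) + 1
        if eid == 0:        # padding byte
            i += 1
            continue
        if eid == 15:       # reserved ID: stop parsing
            break
        if eid == ext_id and length == 8:
            return struct.unpack_from("!Q", body, i + 1)[0]
        i += 1 + length
    return None
```

On the receiving side, subtracting the unpacked capture time from a local clock reading gives a per-frame latency estimate, provided both clocks share a time base.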

Why it matters

Provides transparent, low-latency remote inspection capabilities for safe human-robot collaboration, enabling rapid deployment across diverse mobile robot platforms.

Abstract

Fence-free collaborative manufacturing, where humans and industrial machines share workspace without physical barriers, requires reliable safety monitoring. Mobile inspection robots can patrol autonomously, but when anomalies are detected, such as unauthorized personnel or safety zone violations, a human operator must rapidly assess the situation through immersive remote control. We present a fully open-source VR framework enabling low-latency stereoscopic video streaming from a mobile robot to a standalone Meta Quest headset. The system supports stereo and mono video modes with hardware-accelerated encoding (H.264/H.265) on NVIDIA Jetson and hardware-accelerated decoding on the headset. Head-coupled camera control maps the operator’s gaze to the robot’s camera, providing intuitive situational awareness during remote inspection. A key contribution is built-in end-to-end latency instrumentation: per-frame timestamps embedded in RTP header extensions enable continuous monitoring of each pipeline stage from camera capture to photon emission. Measured glass-to-glass latency is approximately 120 ms over 5 GHz WiFi. The robot-agnostic architecture requires only a thin adapter layer for integration with any platform. The framework, validated on Boston Dynamics Spot, is publicly available as open source.
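
To make the per-stage monitoring concrete, the following sketch turns one frame's timestamps into stage durations and an end-to-end total. The stage names are hypothetical, the example values are made up and are not measurements from the paper, and the sketch assumes that robot and headset clocks are already synchronized to a common time base, which the abstract does not detail.

```python
from typing import Dict, List, Tuple

# Hypothetical stage boundaries, ordered from camera capture to display.
STAGES: List[str] = ["capture", "encoded", "sent",
                     "received", "decoded", "displayed"]


def stage_latencies_ms(timestamps_us: Dict[str, int]) -> List[Tuple[str, float]]:
    """Return (transition, duration in ms) pairs followed by a ("total", ms) row."""
    missing = [s for s in STAGES if s not in timestamps_us]
    if missing:
        raise KeyError(f"missing timestamps for stages: {missing}")

    rows: List[Tuple[str, float]] = []
    for start, end in zip(STAGES, STAGES[1:]):
        delta_us = timestamps_us[end] - timestamps_us[start]
        if delta_us < 0:
            # A negative duration usually points to unsynchronized clocks.
            raise ValueError(f"{end} precedes {start}; check clock sync")
        rows.append((f"{start}->{end}", delta_us / 1000.0))

    total_us = timestamps_us[STAGES[-1]] - timestamps_us[STAGES[0]]
    rows.append(("total", total_us / 1000.0))
    return rows


# Made-up timestamps in microseconds, for illustration only.
example_frame = {"capture": 0, "encoded": 10_000, "sent": 12_000,
                 "received": 30_000, "decoded": 40_000, "displayed": 70_000}
for name, ms in stage_latencies_ms(example_frame):
    print(f"{name:>20}: {ms:6.1f} ms")
```

Reporting each transition separately, rather than only the total, is what lets an operator see which stage (encoding, network, decoding, display) dominates when latency rises.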

Index terms

Telerobotics and Teleoperation; Virtual Reality and Interfaces; Human-Robot Collaboration
