Human-In-The-Loop Gaussian Splatting for Robotic Teleoperation
Yongseok Lee, Hyunsu Kim, Harim Ji, Jinuk Heo, Youngseon Lee, Jiseock Kang, Jeongseob Lee, Dongjun Lee
AI summary
Problem
Conventional teleoperation relies on narrow-field live camera feeds that lack depth cues and spatial context, while fully autonomous 3D reconstruction struggles to safely and efficiently collect multi-view data in dynamic, cluttered spaces.
Approach
HIL-GS creates a closed loop where a human operator inspects a live 3D Gaussian map in VR and uses finger gestures to safely guide the robot to informative viewpoints, while proprioceptive sensors stabilize the reconstruction during aggressive motions.
Key results
- Drift-free 3DGS mapping via RGB-D and proprioceptive sensor fusion
- Real-time VR display with collision warnings and unobserved-region overlays
- Intuitive finger-based VR interface for predictive robot motion selection
- Better reconstruction quality, usability, and efficiency than traditional approaches in simulation and real-world experiments
Why it matters
HIL-GS gives teleoperators rich 3D spatial context and intuitive control, targeting safer and more precise remote manipulation in hazardous or cluttered environments.
Abstract
Safe, precise teleoperation demands a third-person 3D view that reveals collision clearances and task-critical geometry in full detail. Yet most systems still rely on live camera streams that offer tunnel-vision perspectives and weak depth cues, hiding hazards and denying operators the spatial context for precise manipulation. 3D Gaussian Splatting (GS) renders photorealistic views in real time, yet safe, efficient multi-view acquisition in cluttered teleoperation settings remains a bottleneck. We propose Human-in-the-Loop Gaussian Splatting (HIL-GS), which delivers safe, robust, and efficient 3D scene reconstruction for challenging teleoperation environments. HIL-GS combines three modules in a tightly coupled loop: (1) motion-aware GS reconstruction that fuses RGB-D and proprioceptive sensors for drift-free and robust mapping under aggressive motions; (2) a VR-based informative display that renders the GS map with contextual overlays and feedback in real time to ensure situational awareness and reconstruction completeness; and (3) a finger-based control interface that guides the robot toward informative viewpoints through safe, non-redundant motions. Through simulation and real-world experiments, we demonstrate that HIL-GS outperforms traditional approaches in reconstruction quality, usability, and efficiency.