ICRA 2026

RoboMatch: A Unified Mobile-Manipulation Teleoperation Platform with Auto-Matching Network Architecture for Long-Horizon Tasks

Hanyu Liu, Yunsheng Ma, Jiaxin Huang, Keqiang Ren, Jiayi Wen, Yilin Zheng, Haoru Luan, Baishu Wan, Pan Li, Jiejun Hou, Zhihua Wang, Zhigong Song

AI summary

RoboMatch unifies mobile-manipulation teleoperation in a cockpit-style platform. Its Proprioceptive-Visual Enhanced Diffusion Policy raises task success rates by 20–30%, and its Auto-Matching Network improves long-horizon inference performance by approximately 40%.

Mobile manipulation, Teleoperation, Diffusion policy, Long-horizon tasks, Auto-matching network, Imitation learning

Problem

Current teleoperation platforms lack synchronized mobile-manipulation control and sufficient sensory feedback, while end-to-end models struggle with error accumulation and limited reasoning in long-horizon tasks.

Approach

The authors introduce a unified cockpit-style teleoperation platform, a Proprioceptive-Visual Enhanced Diffusion Policy (PVE-DP) for precise manipulation, and an Auto-Matching Network (AMN) that decomposes long-horizon tasks into subtasks and routes each one to a specialized lightweight model.
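
The routing idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lookup-table planner, the `Subtask` type, the skill names, and the string "actions" are all hypothetical stand-ins, chosen only to show a long-horizon task being split into steps that are each dispatched to a separate small policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: an observation is a plain dict, and a policy maps
# an observation to an action label. Real policies would be learned models.
Observation = Dict[str, float]
Policy = Callable[[Observation], str]


@dataclass
class Subtask:
    name: str   # human-readable step, e.g. "grasp cup"
    skill: str  # key used to look up a specialized policy


def decompose(task: str) -> List[Subtask]:
    """Toy decomposition: a fixed table stands in for a learned planner."""
    plans = {
        "fetch cup": [
            Subtask("drive to table", "navigate"),
            Subtask("grasp cup", "grasp"),
            Subtask("drive to user", "navigate"),
            Subtask("hand over cup", "place"),
        ]
    }
    return plans[task]


def run(task: str, registry: Dict[str, Policy], obs: Observation) -> List[str]:
    """Route each subtask to the policy registered for its skill."""
    log = []
    for sub in decompose(task):
        policy = registry[sub.skill]  # dispatch to the specialized model
        log.append(f"{sub.name}: {policy(obs)}")
    return log


# Assumed skill registry; each lambda is a placeholder for a lightweight model.
registry: Dict[str, Policy] = {
    "navigate": lambda obs: "base_velocity",
    "grasp": lambda obs: "close_gripper",
    "place": lambda obs: "open_gripper",
}

trace = run("fetch cup", registry, {"battery": 0.9})
for line in trace:
    print(line)
```

Keeping the registry separate from the planner means a single skill can be retrained or swapped without touching the others, which is one plausible motivation for routing to specialized models rather than using one end-to-end policy.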

Key results

Over 20% increase in data collection efficiency via the unified cockpit interface
PVE-DP improves task success rates by 20–30% through spatio-frequency visual fusion and IMU-enhanced proprioception
AMN boosts long-horizon inference performance by ~40% via dynamic subtask routing to specialized policies

Why it matters

Offers a scalable, high-precision framework for complex mobile manipulation and long-horizon task execution, advancing real-world deployment of imitation learning and teleoperation systems.

Abstract

This paper presents RoboMatch, a novel unified teleoperation platform for mobile manipulation with an auto-matching network architecture, designed to tackle long-horizon tasks in dynamic environments. Our system enhances teleoperation performance, data collection efficiency, task accuracy, and operational stability. The core of RoboMatch is a cockpit-style control interface that enables synchronous operation of the mobile base and dual arms, significantly improving control precision and data collection. Moreover, we introduce the Proprioceptive-Visual Enhanced Diffusion Policy (PVE-DP), which leverages Discrete Wavelet Transform (DWT) for multi-scale visual feature extraction and integrates high-precision IMUs at the end-effector to enrich proprioceptive feedback, substantially boosting fine manipulation performance. Furthermore, we propose an Auto-Matching Network (AMN) architecture that decomposes long-horizon tasks into logical sequences and dynamically assigns lightweight pre-trained models for distributed inference. Experimental results demonstrate that our approach improves data collection efficiency by over 20%, increases task success rates by 20–30% with PVE-DP, and enhances long-horizon inference performance by approximately 40% with AMN, offering a robust solution for complex manipulation tasks. Project website: https://robomatch.github.io
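
To make the multi-scale DWT idea concrete, here is a minimal NumPy sketch of a multi-level 2D Haar wavelet decomposition. It is an illustration under assumptions, not the PVE-DP feature extractor: the abstract does not say which wavelet or how many levels are used, so the Haar wavelet, the function names `haar_dwt2` and `multiscale_features`, and the toy 8×8 input are all chosen here for simplicity.

```python
import numpy as np


def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT.

    Expects a 2D array with even height and width. Returns the
    half-resolution approximation and three detail sub-bands.
    """
    a = image[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0    # local average (low-pass)
    row_diff = (a + b - c - d) / 2.0  # change between rows
    col_diff = (a - b + c - d) / 2.0  # change between columns
    diag = (a - b - c + d) / 2.0      # diagonal change
    return approx, row_diff, col_diff, diag


def multiscale_features(image, levels):
    """Apply the transform repeatedly to the approximation band."""
    approx = np.asarray(image, dtype=np.float64)
    pyramid = []
    for _ in range(levels):
        approx, row_diff, col_diff, diag = haar_dwt2(approx)
        pyramid.append((row_diff, col_diff, diag))
    return approx, pyramid


# Toy input standing in for a grayscale camera frame.
image = np.arange(64, dtype=np.float64).reshape(8, 8)
approx, pyramid = multiscale_features(image, levels=2)
print(approx.shape, [bands[0].shape for bands in pyramid])
```

Each level halves the resolution, so the coarse approximation captures global layout while the detail bands at each level keep edges and texture at that scale. Because this Haar transform is orthonormal, the total squared magnitude of the coefficients equals that of the input image.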

Index terms

AI-Enabled Robotics, Imitation Learning, Engineering for Robotic Systems