Beyond the Majority: Long-Tail Imitation Learning for Robotic Manipulation
Junhong Zhu, Ji Zhang, Jingkuan Song, Lianli Gao, Heng Tao Shen
AI summary
Problem
Generalist robot policies trained on imbalanced, long-tail demonstration datasets suffer severe performance drops on data-scarce tail tasks, and conventional re-sampling and augmentation techniques fail to close this gap because the data they produce lacks variational diversity and physical plausibility.
Approach
The method isolates the target-approaching phase from data-rich head tasks, grafts objects from data-scarce tail tasks onto these trajectories, and co-trains the policy on the augmented dataset to restore spatial reasoning without external demonstrations.
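The three steps above (isolate, graft, co-train) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the authors' code: the `Step` and `Demo` containers, the rule that the approaching phase ends at the first gripper closure, and the purely symbolic "grafting" that relabels the target object. The sketch also ignores the visual side of placing a tail object into a head-task scene.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    ee_pos: Tuple[float, float, float]  # end-effector position (hypothetical field)
    gripper_closed: bool                # True once the gripper has closed


@dataclass
class Demo:
    task: str            # language instruction or task id
    target_object: str   # object the arm moves toward
    steps: List[Step]


def approaching_phase(demo: Demo) -> List[Step]:
    """Return the steps before the first gripper closure.

    Assumption: the approach ends when the gripper first closes.
    """
    prefix = []
    for step in demo.steps:
        if step.gripper_closed:
            break
        prefix.append(step)
    return prefix


def graft_tail_object(head: Demo, tail: Demo) -> Demo:
    """Reuse a head demo's approach motion, relabelled with a tail task's object."""
    return Demo(task=tail.task,
                target_object=tail.target_object,
                steps=approaching_phase(head))


def build_cotraining_set(head_demos: List[Demo], tail_demos: List[Demo]) -> List[Demo]:
    """Original demos plus one grafted approach clip per (head, tail) pair."""
    augmented = [graft_tail_object(h, t) for h in head_demos for t in tail_demos]
    return list(head_demos) + list(tail_demos) + augmented
```

In this toy form, the policy would simply be trained on the list returned by `build_cotraining_set`, so that tail-task objects appear in many more approach trajectories than the tail demonstrations alone provide.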
Key results
- Long-tail data scarcity directly impairs spatial reasoning during target approaching
- Diagnosis of phase-wise failure modes using a new LIBERO-based benchmark
- Significant tail-task performance gains in simulation and real-world experiments
- Conventional re-sampling strategies prove ineffective for robotic policy learning
Why it matters
Enables reliable generalist robot policies for diverse real-world manipulation by solving a fundamental data imbalance problem without requiring costly new demonstrations.
Abstract
While generalist robot policies hold significant promise for learning diverse manipulation skills through imitation, their performance is often hindered by the long-tail distribution of training demonstrations. Policies learned on such data, which is heavily skewed towards a few data-rich head tasks, frequently exhibit poor generalization when confronted with the vast number of data-scarce tail tasks. In this work, we conduct a comprehensive analysis of the pervasive long-tail challenge inherent in policy learning. Our analysis begins by demonstrating the inefficacy of conventional long-tail learning strategies (e.g., re-sampling) for improving the policy’s performance on tail tasks. We then uncover the underlying mechanism for this failure, revealing that data scarcity on tail tasks directly impairs the policy’s spatial reasoning capability. To overcome this, we introduce Approaching-Phase Augmentation (APA), a simple yet effective scheme that transfers knowledge from data-rich head tasks to data-scarce tail tasks without requiring external demonstrations. Extensive experiments in both simulation and real-world manipulation tasks demonstrate the effectiveness of APA. Our code and demos are publicly available at: https://mldxy.github.io/Project-VLA-long-tail/.
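For readers unfamiliar with the re-sampling baseline mentioned in the abstract, the following is a minimal sketch of one common variant, inverse-frequency sampling. It is a generic illustration, not the specific baseline configuration evaluated in the paper; the function names and the toy 9-to-1 task split are assumptions.

```python
import random
from collections import Counter
from typing import List


def inverse_frequency_weights(task_labels: List[str]) -> List[float]:
    """Weight each demonstration by 1 / (number of demos of its task)."""
    counts = Counter(task_labels)
    return [1.0 / counts[label] for label in task_labels]


def resample(task_labels: List[str], k: int, seed: int = 0) -> List[int]:
    """Draw k demonstration indices with replacement so tasks are roughly balanced."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(task_labels)
    return rng.choices(range(len(task_labels)), weights=weights, k=k)


# Toy dataset: 9 head-task demos and 1 tail-task demo.
labels = ["head"] * 9 + ["tail"]
indices = resample(labels, k=1000)
```

Balancing tasks this way only repeats the same few tail demonstrations more often; it adds no new variation, which is consistent with the limitation the summary attributes to re-sampling.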