EMMA: Scaling Mobile Manipulation via Egocentric Human Data
Lawrence Y. Zhu, Pranav Kuppili, Ryan Punamiya, Patcharapong Aphiwetsa, Dhruv Patel, Simar Kareer, Sehoon Ha, Danfei Xu
AI summary
Problem
Scaling mobile manipulation imitation learning is bottlenecked by the high cost and scarcity of teleoperated mobile robot data, limiting diversity and deployment in unpredictable real-world settings.
Approach
EMMA bridges the human-robot embodiment gap through optimization-based navigation retargeting and coordinate-space alignment, then co-trains a unified Transformer policy on heterogeneous egocentric human mobile data and static robot manipulation data.
Key results
- Matches or exceeds the full-task success of baselines trained on teleoperated mobile robot data (Mobile ALOHA) across four real-world tasks
- Shows positive performance scaling as the hours of human data increase
- Generalizes to novel spatial configurations and unseen scenes
- Introduces unsupervised phase identification for navigation-manipulation switching
Why it matters
Provides a scalable, low-cost paradigm for training real-world mobile manipulation robots by leveraging abundant egocentric human video data instead of expensive teleoperation.
Abstract
Scaling mobile manipulation imitation learning is bottlenecked by expensive mobile robot teleoperation. We present Egocentric Mobile MAnipulation (EMMA), an end-to-end framework training mobile manipulation policies from human mobile manipulation data with static robot data, sidestepping mobile teleoperation. To accomplish this, we co-train human full-body motion data with static robot data. In our experiments across four real-world tasks, EMMA demonstrates comparable performance to baselines trained on teleoperated mobile robot data (Mobile ALOHA), achieving higher or equivalent task performance in full task success. We find that EMMA is able to generalize to new spatial configurations and scenes, and we observe positive performance scaling as we increase the hours of human data, opening new avenues for scalable robotic learning in real-world environments. Details of this project can be found at: https://ego-moma.github.io