
Collecting robot teleoperation data for mobile manipulation is incredibly time-consuming, even more so than collecting teleoperation data for a stationary manipulator. Fortunately, Lawrence and Pranav have a solution: EMMA, or Egocentric Mobile MAnipulation.

In short, they find they can skip mobile teleoperation entirely: static robot arms provide the manipulation data, and the policy is co-trained with egocentric human video. This is enough to generalize to more complex scenes and tasks.

To learn more, watch Episode #50 of RoboPapers now, hosted by Michael Cho and Chris Paxton!

Abstract:

Scaling mobile manipulation imitation learning is bottlenecked by expensive mobile robot teleoperation. We present Egocentric Mobile MAnipulation (EMMA), an end-to-end framework training mobile manipulation policies from human mobile manipulation data with static robot data, sidestepping mobile teleoperation. To accomplish this, we co-train human full-body motion data with static robot data. In our experiments across three real-world tasks, EMMA demonstrates comparable performance to baselines trained on teleoperated mobile robot data (Mobile ALOHA), achieving higher or equivalent task performance in full task success. We find that EMMA is able to generalize to new spatial configurations and scenes, and we observe positive performance scaling as we increase the hours of human data, opening new avenues for scalable robotic learning in real-world environments. Details of this project can be found at https://ego-moma.github.io/.
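The key mechanism here is co-training: each gradient step mixes egocentric human data with static robot data. As a rough illustration only, here is a minimal behavior-cloning sketch of such a mixed-batch loop; the datasets, shapes, and model are hypothetical stand-ins and not EMMA's actual architecture.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the two data sources (names illustrative):
# egocentric human full-body motion data and static-arm robot teleop data,
# assumed to be retargeted into a shared observation/action space.
obs_dim, act_dim = 64, 7
human_data = TensorDataset(torch.randn(512, obs_dim), torch.randn(512, act_dim))
robot_data = TensorDataset(torch.randn(512, obs_dim), torch.randn(512, act_dim))
human_loader = DataLoader(human_data, batch_size=32, shuffle=True)
robot_loader = DataLoader(robot_data, batch_size=32, shuffle=True)

# A simple behavior-cloning policy; the paper's actual model will differ.
policy = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

for (h_obs, h_act), (r_obs, r_act) in zip(human_loader, robot_loader):
    # Co-training: every gradient step mixes both data sources.
    obs = torch.cat([h_obs, r_obs])
    act = torch.cat([h_act, r_act])
    loss = nn.functional.mse_loss(policy(obs), act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```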

Project Page: https://ego-moma.github.io/

arXiv: https://arxiv.org/abs/2509.04443

Original Thread on X



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit robopapers.substack.com