ICRA 2026

M3Bench: Benchmarking Whole-Body Motion Generation for Mobile Manipulation in 3D Scenes

Zeyu Zhang, Sixu Yan, Muzhi Han, Zaijin Wang, Xinggang Wang, Song-Chun Zhu, Hangxin Liu

AI summary

State-of-the-art models fail to coordinate mobile base and arm motion under environmental constraints, revealing a critical gap for embodied mobile manipulation.

Mobile Manipulation, Whole-body Motion Generation, Embodied AI, Motion Planning, Benchmark, Object Rearrangement

Problem

Existing methods and benchmarks treat navigation and manipulation in isolation, neglecting the need for coordinated whole-body motion and realistic physical constraints in complex 3D scenes. This gap is compounded by a severe lack of high-quality datasets for training and evaluating mobile manipulation models.

Approach

The authors introduce M3Bench, a benchmark featuring 30,000 object rearrangement tasks across 119 diverse 3D household scenes, alongside M3BenchMaker, an automated tool that generates expert whole-body motion trajectories from high-level instructions using physics simulation and kinematic optimization.

Key results

Released M3Bench benchmark with 30,000 tasks across 119 diverse 3D scenes
Developed M3BenchMaker, an automated tool for generating feasible whole-body motion data from URDFs and task instructions
Evaluated state-of-the-art planning and learning-based models, revealing their inability to effectively coordinate base-arm motion under physical constraints
Demonstrated that learning-based methods outperform modular planning in efficiency but still lag in motion accuracy for complex mobile manipulation

Why it matters

M3Bench and M3BenchMaker provide a standardized, physically realistic benchmark and a data generation tool to advance research on adaptive, whole-body mobile manipulation for embodied AI and robotics.

Abstract

We propose M3Bench, a new benchmark for whole-body motion generation in mobile manipulation tasks. Given a 3D scene context, M3Bench requires an embodied agent to reason about its configuration, environmental constraints, and task objectives to generate coordinated whole-body motion trajectories for object rearrangement. M3Bench features 30,000 object rearrangement tasks across 119 diverse scenes, providing expert demonstrations generated by our newly developed M3BenchMaker, an automatic data generation tool that produces whole-body motion trajectories from high-level task instructions using only basic scene and robot information. Our benchmark includes various task splits to evaluate generalization across different dimensions and leverages realistic physics simulation for trajectory assessment. Extensive evaluation analysis reveals that state-of-the-art models struggle with coordinating base-arm motion while adhering to environmental and task-specific constraints, underscoring the need for new models to bridge this gap. By releasing M3Bench and M3BenchMaker at https://zeyuzhang.com/papers/m3bench, we aim to advance robotics research toward more adaptive and capable mobile manipulation in diverse, real-world environments.

Index terms

Data Sets for Robot Learning; Simulation and Animation; AI-Based Methods