ICRA 2026

M3CAD: Towards Generic Cooperative Autonomous Driving Benchmark

Morui Zhu, Yongqi Zhu, Yihao Zhu, Qi Chen, Deyuan Qu, Song Fu, Qing Yang

AI summary

M3CAD introduces a comprehensive multi-vehicle cooperative driving benchmark and a bandwidth-adaptive fusion method whose sparsest mode, reference point fusion, cuts communication bandwidth by over 99% while keeping tracking and planning accuracy near-optimal.

Cooperative autonomous driving, Multi-vehicle benchmark, Multi-level fusion, Sim-to-real transfer, Communication-efficient perception, Autonomous driving tasks

Problem

Existing cooperative driving datasets are limited in scale, task diversity, and real-world applicability, while current perception methods rely on dense feature fusion that incurs prohibitive communication costs for real-world deployment.

Approach

The authors release M3CAD, a large-scale simulated benchmark supporting multi-vehicle, multi-task, and multi-modality cooperative driving research, alongside a multi-level fusion framework that dynamically selects between dense BEV features, compact queries, and sparse reference points based on available network bandwidth.
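The bandwidth-based selection described above can be illustrated with a minimal sketch. This is not the authors' implementation: the payload sizes, the 10 Hz frame rate, the fallback rule, and the names `FusionLevel`, `PAYLOAD_BYTES`, and `select_fusion_level` are all hypothetical, chosen only to show how a sender might pick the richest representation that fits the current link budget.

```python
from enum import Enum


class FusionLevel(Enum):
    """The three representations named in the summary, from richest to sparsest."""
    BEV_FEATURES = "dense BEV features"
    QUERIES = "compact queries"
    REFERENCE_POINTS = "sparse reference points"


# Hypothetical per-frame payload sizes in bytes (illustrative, not from the paper).
PAYLOAD_BYTES = {
    FusionLevel.BEV_FEATURES: 8_000_000,
    FusionLevel.QUERIES: 200_000,
    FusionLevel.REFERENCE_POINTS: 10_000,
}

# Candidates ordered from most to least informative.
PREFERENCE_ORDER = (
    FusionLevel.BEV_FEATURES,
    FusionLevel.QUERIES,
    FusionLevel.REFERENCE_POINTS,
)


def select_fusion_level(bandwidth_bps: float, frame_rate_hz: float = 10.0) -> FusionLevel:
    """Return the richest level whose payload fits the per-frame byte budget.

    Assumption: if nothing fits, fall back to the sparsest level rather than
    sending nothing.
    """
    if bandwidth_bps < 0 or frame_rate_hz <= 0:
        raise ValueError("bandwidth must be non-negative and frame rate positive")
    # Convert bits per second into bytes available for each frame.
    budget_bytes = bandwidth_bps / 8.0 / frame_rate_hz
    for level in PREFERENCE_ORDER:
        if PAYLOAD_BYTES[level] <= budget_bytes:
            return level
    return FusionLevel.REFERENCE_POINTS


if __name__ == "__main__":
    for bps in (1e9, 100e6, 1e6):
        print(f"{bps:.0e} bit/s -> {select_fusion_level(bps).value}")
```

A real system would presumably measure bandwidth online and might weigh latency or task accuracy as well; the fixed lookup table here only stands in for that decision.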

Key results

M3CAD benchmark with 204 sequences, 30k frames, and annotations for six core autonomous driving tasks
Multi-level fusion framework adaptively balancing communication efficiency and perception accuracy
Reference point fusion reduces bandwidth by over 99% while maintaining near-optimal tracking and planning accuracy
Sim-to-real transfer validation showing M3CAD pre-training boosts real-world performance with only 10% of nuScenes data

Why it matters

Offers the research community a scalable, realistic platform to develop and evaluate bandwidth-efficient cooperative autonomous driving systems that bridge the sim-to-real gap.

Abstract

We introduce M3CAD, a comprehensive benchmark designed to advance research in generic cooperative autonomous driving. M3CAD comprises 204 sequences with 30,000 frames. Each sequence includes data from multiple vehicles and different types of sensors, e.g., LiDAR point clouds, RGB images, and GPS/IMU, supporting a variety of autonomous driving tasks, including object detection and tracking, mapping, motion forecasting, occupancy prediction, and path planning. This rich multimodal setup enables M3CAD to support both single-vehicle and multi-vehicle cooperative autonomous driving research. To the best of our knowledge, M3CAD is the most complete benchmark specifically designed for cooperative, multi-task autonomous driving research. To test its effectiveness, we use M3CAD to evaluate both state-of-the-art single-vehicle and cooperative driving solutions, setting baseline performance results. Since most existing cooperative perception methods focus on merging features but often ignore network bandwidth requirements, we propose a new multi-level fusion approach which adaptively balances communication efficiency and perception accuracy based on the current network conditions. We release M3CAD, along with the baseline models and evaluation results, to support the development of robust cooperative autonomous driving systems. All resources will be made publicly available on our project webpage.

Index terms

Intelligent Transportation Systems; Cooperating Robots; Data Sets for Robot Learning