WaveComm: Lightweight Communication for Collaborative Perception Via Wavelet Feature Distillation
Erdemt Bao, Jin Yang
AI summary
Problem
Collaborative perception systems face prohibitive communication overhead and bandwidth constraints that limit scalability and real-time performance. Existing spatial-domain compression methods often discard task-critical information or fail to balance transmission efficiency against perception accuracy.
Approach
WaveComm decomposes Bird’s-Eye-View feature maps with a Discrete Wavelet Transform (DWT) and transmits only the compact low-frequency components. At the receiver, a lightweight generator reconstructs the full features from these components; it is trained with a multi-scale distillation loss that targets pixel, structural, semantic, and distributional fidelity.
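To make the decomposition step concrete, here is a minimal sketch of a single-level 2D Haar DWT applied to a feature map. The Haar wavelet, the single decomposition level, the (C, H, W) layout, and the function names `haar_dwt2` and `haar_idwt2` are assumptions for illustration; the summary does not state which wavelet or how many levels WaveComm uses. Sub-band naming (LH versus HL) also varies between libraries.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT over the last two axes.

    x has shape (..., H, W) with even H and W (assumed layout).
    Returns (LL, LH, HL, HH), each of shape (..., H/2, W/2).
    """
    a = x[..., 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # detail: difference between rows
    hl = (a - b + c - d) / 2.0  # detail: difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the (..., H, W) array."""
    h2, w2 = ll.shape[-2], ll.shape[-1]
    out = np.empty(ll.shape[:-2] + (2 * h2, 2 * w2), dtype=ll.dtype)
    out[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bev = rng.normal(size=(8, 16, 16))  # hypothetical (C, H, W) BEV feature map
    ll, lh, hl, hh = haar_dwt2(bev)

    # A sender following this sketch would transmit only `ll`.
    print("transmitted shape:", ll.shape)
    print("fraction of elements sent:", ll.size / bev.size)

    # Zero-filling the detail bands gives a crude baseline reconstruction;
    # a learned generator would replace this step.
    zeros = np.zeros_like(ll)
    crude = haar_idwt2(ll, zeros, zeros, zeros)
    print("baseline reconstruction error:", float(np.mean(np.abs(crude - bev))))
```

In this sketch the low-frequency band holds one quarter of the elements of the input, which is a property of a single-level 2D transform and not a statement about the communication volumes reported for WaveComm.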
Key results
- Reduces communication volume to 86.3% and 87.0% of the original for LiDAR-based and camera-based tasks, respectively
- Maintains state-of-the-art detection accuracy on OPV2V and DAIR-V2X datasets
- Introduces a Multi-Scale Distillation loss optimizing reconstruction across pixel, structural, semantic, and distributional levels
- Validates framework compatibility with existing spatial compression and feature selection strategies
Why it matters
Enables scalable, real-time multi-agent perception for connected and automated vehicles operating under strict bandwidth constraints, directly advancing autonomous driving safety and infrastructure efficiency.
Abstract
In multi-agent collaborative sensing systems, substantial communication overhead from information exchange significantly limits scalability and real-time performance, especially in bandwidth-constrained environments. This often results in degraded performance and reduced reliability. To address this challenge, we propose WaveComm, a wavelet-based communication framework that drastically reduces transmission loads while preserving sensing performance in low-bandwidth scenarios. The core innovation of WaveComm lies in decomposing feature maps using Discrete Wavelet Transform (DWT), transmitting only compact low-frequency components to minimize communication overhead. High-frequency details are omitted, and their effects are reconstructed at the receiver side using a lightweight generator. A Multi-Scale Distillation (MSD) Loss is employed to optimize the reconstruction quality across pixel, structural, semantic, and distributional levels. Experiments on the OPV2V and DAIR-V2X datasets for LiDAR-based and camera-based perception tasks demonstrate that WaveComm maintains state-of-the-art performance even when the communication volume is reduced to 86.3% and 87.0% of the original, respectively. Compared to existing approaches, WaveComm achieves competitive improvements in both communication efficiency and perception accuracy. Ablation studies further validate the effectiveness of its key components.
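The abstract names four levels for the MSD Loss but does not give their formulas. The sketch below shows one plausible way a four-term reconstruction loss of this kind could be assembled. Every term is an assumption chosen for illustration (L1 for the pixel level, finite-difference gradients for structure, cosine distance between pooled channel descriptors for semantics, and a KL divergence between softmax-normalized maps for the distributional level), as are the equal weights and all function names. It should not be read as the authors' definition.

```python
import numpy as np

def pixel_term(rec, ref):
    # Assumed pixel-level term: mean absolute error.
    return float(np.mean(np.abs(rec - ref)))

def structural_term(rec, ref):
    # Assumed structural term: mismatch between spatial finite differences.
    dy = np.diff(rec, axis=-2) - np.diff(ref, axis=-2)
    dx = np.diff(rec, axis=-1) - np.diff(ref, axis=-1)
    return float(np.mean(np.abs(dy)) + np.mean(np.abs(dx)))

def semantic_term(rec, ref, eps=1e-8):
    # Assumed semantic term: cosine distance between globally pooled
    # per-channel descriptors of the two feature maps.
    u = rec.mean(axis=(-2, -1))
    v = ref.mean(axis=(-2, -1))
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return float(1.0 - cos)

def _spatial_softmax(z):
    # Turn each channel into a probability distribution over positions.
    z = z.reshape(z.shape[0], -1)
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distribution_term(rec, ref, eps=1e-8):
    # Assumed distributional term: KL(p_ref || p_rec), averaged over channels.
    p = _spatial_softmax(ref)
    q = _spatial_softmax(rec)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)
    return float(np.mean(kl))

def msd_loss_sketch(rec, ref, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four illustrative terms for (C, H, W) arrays."""
    terms = (
        pixel_term(rec, ref),
        structural_term(rec, ref),
        semantic_term(rec, ref),
        distribution_term(rec, ref),
    )
    return sum(w * t for w, t in zip(weights, terms))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((4, 8, 8)) + 1.0          # hypothetical full feature map
    rec = ref + 0.1 * rng.normal(size=ref.shape)  # hypothetical reconstruction
    print("loss (perfect):", msd_loss_sketch(ref, ref))
    print("loss (noisy):  ", msd_loss_sketch(rec, ref))
```

In a training setup, `ref` would be the full feature map available at the sender and `rec` the generator output at the receiver; a real implementation would use a differentiable framework in place of NumPy.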