ICRA 2026

MSDNet: Efficient 4D Radar Super-Resolution Via Multi-Stage Distillation

Minqing Huang, Shouyi Lu, Boyuan Zheng, Ziyao Li, Xiao Tang, Guirong Zhuo

AI summary

MSDNet efficiently transforms sparse, noisy 4D radar point clouds into dense, high-fidelity representations using a two-stage knowledge distillation framework with low inference latency.

4D radar super-resolution, knowledge distillation, diffusion models, autonomous driving, point cloud enhancement

Problem

4D radar point clouds are inherently sparse and noisy, limiting their use in fine-grained autonomous perception, while existing super-resolution methods suffer from high training costs, complex diffusion sampling, or poor generalization.

Approach

The method transfers dense LiDAR geometric priors to 4D radar features through reconstruction-guided distillation, then refines them with a lightweight diffusion network and an adaptive noise alignment module.
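The reconstruction-guided step can be pictured as a feature-matching loss: the student's radar features pass through a reconstruction module and are compared with the teacher's LiDAR features. The following is a minimal sketch under that reading, not the paper's implementation. The names `mse`, `distillation_loss`, and `reconstruct` are hypothetical, and the choice of mean squared error is an assumption.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_feat: Vector,
    teacher_feat: Vector,
    reconstruct: Callable[[Vector], Vector],
) -> float:
    """Compare reconstructed student features with teacher features.

    `reconstruct` stands in for a learned reconstruction module
    (hypothetical); MSE as the distance is an assumption.
    """
    return mse(reconstruct(student_feat), teacher_feat)


# Toy usage: a "reconstruction" that doubles each value matches the teacher exactly.
loss = distillation_loss([1.0, 2.0], [2.0, 4.0], lambda f: [2.0 * x for x in f])
```

In practice the features would be dense tensors and the reconstruction module a trained network; the sketch only shows which quantities are compared.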

Key results

First knowledge distillation framework applied to 4D radar super-resolution
Achieves high-fidelity reconstruction with significantly reduced inference latency
Outperforms existing methods on reconstruction metrics across VoD and in-house datasets
Delivers substantial performance gains on downstream autonomous driving tasks

Why it matters

Enables reliable, real-time 4D radar perception for autonomous driving in adverse weather without the computational overhead of traditional diffusion models.

Abstract

4D radar super-resolution, which aims to reconstruct sparse and noisy point clouds into dense and geometrically consistent representations, is a foundational problem in autonomous perception. However, existing methods often suffer from high training cost or rely on complex diffusion-based sampling, resulting in high inference latency and poor generalization, making it difficult to balance accuracy and efficiency. To address these limitations, we propose MSDNet, a multi-stage distillation framework that efficiently transfers dense LiDAR priors to 4D radar features to achieve both high reconstruction quality and computational efficiency. The first stage performs reconstruction-guided feature distillation (RGFD), aligning and densifying the student’s features through feature reconstruction. In the second stage, we propose diffusion-guided feature distillation (DGFD), which treats the stage-one distilled features as a noisy version of the teacher’s representations and refines them via a lightweight diffusion network. Furthermore, we introduce a noise adapter that adaptively aligns the noise level of the features with a predefined diffusion timestep, enabling more precise denoising. Extensive experiments on the VoD and in-house datasets demonstrate that MSDNet achieves both high-fidelity reconstruction and low-latency inference in the task of 4D radar point cloud super-resolution, and consistently improves performance on downstream tasks.
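To illustrate what aligning a feature's noise level with a predefined diffusion timestep could mean, the sketch below uses the standard DDPM forward relation x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. It assumes the stage-one feature can be modelled as the teacher feature plus Gaussian noise of known variance, and it uses a hand-derived rescaling rather than the learned noise adapter described in the abstract. The linear beta schedule, its constants, and all function names are assumptions for illustration only.

```python
import math
import random
from typing import List, Sequence


def alpha_bar(t: int, num_steps: int = 1000,
              beta_start: float = 1e-4, beta_end: float = 0.02) -> float:
    """Cumulative product of (1 - beta_s) over the first t steps (linear schedule, assumed)."""
    if not 1 <= t <= num_steps:
        raise ValueError("t must lie in [1, num_steps]")
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / max(num_steps - 1, 1)
        prod *= 1.0 - beta
    return prod


def target_noise_variance(t: int) -> float:
    """Noise variance relative to a unit-scaled signal at timestep t: (1 - a) / a."""
    a = alpha_bar(t)
    return (1.0 - a) / a


def extra_noise_variance(feature_noise_var: float, t: int) -> float:
    """Variance that must be added so the feature matches timestep t.

    Clamped at zero: noise cannot be removed by this rescaling, so a feature
    that is already noisier than the target is left as-is.
    """
    return max(0.0, target_noise_variance(t) - feature_noise_var)


def align_to_timestep(feature: Sequence[float], feature_noise_var: float,
                      t: int, rng: random.Random) -> List[float]:
    """Add the missing noise, then scale by sqrt(alpha_bar_t)."""
    scale = math.sqrt(alpha_bar(t))
    extra_std = math.sqrt(extra_noise_variance(feature_noise_var, t))
    return [scale * (x + extra_std * rng.gauss(0.0, 1.0)) for x in feature]


# Toy usage: align a two-element feature with assumed noise variance 0.01 to t = 50.
aligned = align_to_timestep([0.3, -1.2], 0.01, 50, random.Random(0))
```

After this step the feature has the same signal-to-noise ratio as a sample at timestep t (when it started out cleaner than the target), so a denoiser conditioned on t sees the noise level it expects. A learned adapter would estimate this adjustment from the features themselves rather than from a known variance.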

Index terms

Computer Vision for Transportation, Sensor Fusion, Localization