Abstract
Millimeter-wave radar offers a promising sensing modality for autonomous systems thanks to its robustness in adverse conditions and low cost. However, its utility is significantly limited by the sparsity and low resolution of radar point clouds, which poses challenges for tasks requiring dense and accurate 3D perception. Although recent efforts have shown great potential by exploring generative approaches to address this issue, they often rely on dense voxel representations that are inefficient and struggle to preserve structural detail. We make the key observation that latent diffusion models (LDMs), though successful in other modalities, have not been effectively leveraged for radar-based 3D generation due to a lack of compatible representations and conditioning strategies. To fill this gap, we introduce RaLD, a framework that integrates scene-level frustum-based LiDAR autoencoding, order-invariant latent representations, and direct radar spectrum conditioning. These insights lead to a more compact and expressive generation process. Experiments show that RaLD produces dense and accurate 3D point clouds from raw radar spectrums, offering a promising solution for robust perception in challenging environments.
Method Overview
Overview of the RaLD framework. Given an input radar spectrum $\mathbf{S}$, RaLD aims to generate a dense and accurate 3D point cloud $\mathbf{P}\in \mathbb{R}^{N \times 3}$ that reconstructs the scene with LiDAR-like fidelity. We adopt a conditional diffusion framework that learns to synthesize point clouds conditioned on radar observations.
To achieve this, RaLD operates in a compact latent space, where a diffusion model is trained to generate point cloud embeddings guided by the radar spectrum. The overall pipeline, as illustrated below, begins with an autoencoder that compresses LiDAR point clouds into structured latent codes. A radar-conditioned latent diffusion model then samples from this space, and a decoder reconstructs the final 3D point cloud from the sampled latent.
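The pipeline above can be sketched as a standard DDPM-style reverse process run in the latent space, conditioned on a radar embedding. The sketch below is illustrative only: the network stand-ins, dimensions, step count, and noise schedule are assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- illustrative, not the paper's actual dimensions.
LATENT_DIM = 64   # size of the point-cloud latent code
COND_DIM = 128    # size of the radar-spectrum embedding
T = 50            # number of diffusion steps

# Standard DDPM linear noise schedule.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser(z_t, t, radar_cond):
    """Stand-in for the radar-conditioned noise-prediction network.
    A real model would be a trained neural network taking (z_t, t, cond);
    here it returns zeros so the loop is runnable."""
    return np.zeros_like(z_t)

def decode(z):
    """Stand-in for the point-cloud decoder: latent code -> N x 3 points.
    A fixed random linear map, purely for shape bookkeeping."""
    n_points = 256
    W = rng.standard_normal((LATENT_DIM, n_points * 3)) * 0.01
    return (z @ W).reshape(n_points, 3)

def sample(radar_cond):
    """DDPM reverse process in the latent space, conditioned on radar."""
    z = rng.standard_normal(LATENT_DIM)  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps = denoiser(z, t, radar_cond)
        # Posterior mean: (z - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        z = (z - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add noise on all but the final step
            z = z + np.sqrt(betas[t]) * rng.standard_normal(LATENT_DIM)
    return decode(z)

points = sample(rng.standard_normal(COND_DIM))
print(points.shape)  # (256, 3)
```

Operating on a compact latent code rather than a dense voxel grid keeps both the sampling loop and the denoiser small; the heavy lifting of producing geometry is delegated to the decoder.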
Frustum-Based Point Cloud Auto-Encoder.
Order-Invariant Latent Coding.
Visualizations
Reconstructed 3D point clouds from the auto-encoder.
End-to-end generated 3D radar point clouds from radar spectrums.
Quantitative Results
We evaluate our framework on the ColoRadar and SDDiff datasets, both featuring synchronized radar spectrums and LiDAR point clouds. Here we primarily highlight performance on the ColoRadar dataset.
Poster
BibTeX
@article{zhang2025rald,
  title={RaLD: Generating High-Resolution 3D Radar Point Clouds with Latent Diffusion},
  author={Zhang, Ruijie and Zeng, Bixin and Wang, Shengpeng and Zhou, Fuhui and Wang, Wei},
  journal={arXiv preprint arXiv:2511.07067},
  year={2025}
}