Hallucinating 360°: Panoramic Street-View Generation Via Local Scenes Diffusion and Probabilistic Prompting
Fei Teng, Kai Luo, Sheng Wu, Siyu Li, PuJun Guo, Jiale Wei, Jiaming Zhang, Kunyu Peng, Kailun Yang
AI summary
Problem
Autonomous driving lacks large-scale, high-quality panoramic datasets due to expensive data collection, while existing generative models inherit stitching errors and information loss from pinhole cameras, failing to produce coherent panoramic views.
Approach
Percep360 uses a Local Scenes Diffusion Method to spatially bridge stitching gaps and a Probabilistic Prompting Method to dynamically select control cues, enabling coherent and controllable panoramic generation from imperfect inputs.
Key results
- Presented by the authors as the first panoramic generation method for autonomous driving
- LSDM compensates for pinhole sampling information loss and stitching misalignments
- PPM maintains strong controllability across diverse spatial and semantic prompts
- Synthetic data improves downstream panoramic BEV segmentation mIoU by 2.5%
Why it matters
Provides a cost-effective data synthesis paradigm that accelerates the development of robust autonomous driving perception systems by overcoming panoramic data scarcity.
Abstract
Panoramic perception holds significant potential for autonomous driving, enabling vehicles to acquire a comprehensive 360° surround view in a single shot. However, autonomous driving is a data-driven task. Complete panoramic data acquisition requires complex sampling systems and annotation pipelines, which are time-consuming and labor-intensive. Although existing street-view generation models have demonstrated strong data regeneration capabilities, they can only learn from the fixed data distribution of existing datasets and cannot leverage noisy ground truth as a supervisory signal. In this paper, we propose Percep360, the first panoramic generation method for autonomous driving. Percep360 enables coherent generation of panoramic data with control signals based on the stitched panoramic data. Percep360 focuses on two key aspects: coherence and controllability. Specifically, to overcome the inherent information loss caused by the pinhole sampling process, we propose the Local Scenes Diffusion Method (LSDM). LSDM reformulates panorama generation as a spatially continuous diffusion process, bridging the gaps between different data distributions. Additionally, to achieve controllable generation of panoramic images, we propose a Probabilistic Prompting Method (PPM). PPM dynamically selects the most relevant control cues, enabling controllable panoramic image generation. We evaluate the effectiveness of the generated images from three perspectives: image quality assessment (i.e., no-reference and with-reference), controllability, and their utility in real-world Bird’s Eye View (BEV) segmentation. Notably, the generated data consistently outperforms the original stitched images in no-reference quality metrics and enhances downstream perception models, leading to an improvement of 2.5% in mIoU for panoramic BEV segmentation. The source code will be publicly available at https://github.com/FeiT-FeiTeng/Percep360.
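For readers unfamiliar with the mIoU metric reported for BEV segmentation, the following is a minimal sketch of how mean intersection-over-union is commonly computed. It is not taken from the Percep360 code. The function name mean_iou, the flat lists of per-cell class labels, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flattened label maps (one class index per cell).
    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Cell counted once in both intersection and union of its class.
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    # Assumption: classes absent from both maps are excluded from the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    prediction = [0, 0, 1, 1]
    ground_truth = [0, 1, 1, 1]
    print(round(mean_iou(prediction, ground_truth, num_classes=2), 4))
```

In this toy case class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12. Evaluation protocols differ in how they treat absent classes and ignore labels, so the exact convention behind the reported figure should be checked against the released code.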