ICRA 2026

Diffusion-Based Low-Light Image Enhancement with Color and Luminance Priors

Xuanshuo Fu, Lei Kang, Javier Vazquez-Corral


AI summary


Conditioning a diffusion model on physically motivated illumination, shadow, and color priors yields state-of-the-art low-light enhancement with strong cross-dataset generalization.

Low-light image enhancement, Diffusion models, Illumination decomposition, Color fidelity, Structured priors, Image restoration

Problem

Low-light images suffer from noise, low contrast, and color distortion that degrade visual quality and downstream vision tasks. Existing methods lack explicit spatial control over lighting and color, often causing artifacts and poor generalization.

Approach

A Structured Control Embedding Module (SCEM) decomposes each input into illumination, illumination-invariant, shadow, and color-invariant maps. These maps condition a U-Net diffusion model, guiding denoising with physically motivated signals.
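To make the decomposition idea concrete, the following is a minimal NumPy sketch of one way four such maps could be computed and stacked into a control tensor. It is not the paper's SCEM: the specific estimators (smoothed channel maximum for illumination, division by illumination for the invariant map, a mean-relative soft mask for shadows, chromaticity for the color cue) are classic hand-crafted stand-ins chosen for illustration, and the names `box_blur`, `decompose`, and `control_tensor` are hypothetical.

```python
import numpy as np

def box_blur(x, k=5):
    """Mean filter with edge padding; k must be odd."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def decompose(img, eps=1e-4):
    """img: float array of shape (H, W, 3) with values in [0, 1].

    All four estimators are illustrative assumptions, not the paper's.
    """
    img = np.asarray(img, dtype=np.float64)
    # Illumination: smoothed per-pixel channel maximum (Retinex-style estimate).
    illum = box_blur(img.max(axis=2))
    # Illumination-invariant map: image divided by illumination (reflectance-like).
    invariant = np.clip(img / (illum[..., None] + eps), 0.0, 1.0)
    # Shadow prior: soft mask that grows where illumination falls below its mean.
    shadow = np.clip(1.0 - illum / (illum.mean() + eps), 0.0, 1.0)
    # Color-invariant cue: chromaticity, each channel over the channel sum.
    chroma = img / (img.sum(axis=2, keepdims=True) + eps)
    return illum, invariant, shadow, chroma

def control_tensor(img):
    """Stack the four maps channel-wise: 1 + 3 + 1 + 3 = 8 channels."""
    illum, invariant, shadow, chroma = decompose(img)
    return np.concatenate(
        [illum[..., None], invariant, shadow[..., None], chroma], axis=2
    )
```

Calling `control_tensor(img)` on an `(H, W, 3)` image returns an `(H, W, 8)` array that a conditional denoiser could consume alongside the noisy image. A learned module would presumably replace these fixed estimators.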

Key results

Achieves state-of-the-art quantitative and perceptual scores across six benchmarks
Generalizes effectively without fine-tuning when trained solely on LOLv1
Preserves texture and chromatic fidelity while adaptively boosting brightness
Integrates Retinex-inspired decomposition with diffusion sampling for controllable enhancement

Why it matters

Provides a physically grounded, highly generalizable framework for low-light enhancement that benefits nighttime photography, surveillance, and autonomous vision systems.

Abstract

Low-light images often suffer from low contrast, noise, and color distortion, degrading visual quality and impairing downstream vision tasks. We propose a novel conditional diffusion framework for low-light image enhancement that incorporates a Structured Control Embedding Module (SCEM). SCEM decomposes a low-light image into four informative components: illumination, illumination-invariant features, shadow priors, and color-invariant cues. These components serve as control signals that condition a U-Net–based diffusion model trained with a simplified noise-prediction loss. Thus, the proposed SCEM-equipped diffusion method enforces structured enhancement guided by physical priors. In experiments, our model is trained only on the LOLv1 dataset and evaluated without fine-tuning on LOLv2-real, LSRW, DICM, MEF, and LIME. The method achieves state-of-the-art performance in quantitative and perceptual metrics, demonstrating strong generalization across benchmarks. https://casted.github.io/scem/.
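The simplified noise-prediction loss mentioned in the abstract is presumably the standard DDPM objective, in which a network predicts the noise added to a clean image and is penalized by mean squared error. The sketch below shows one training-loss sample under that assumption. The linear beta schedule uses the common DDPM defaults rather than values from the paper, conditioning by channel-wise concatenation is an assumption, and `zero_model` is a hypothetical placeholder for the U-Net.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, noise, alpha_bar):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def simple_loss(eps_model, x0, control, alpha_bar, rng):
    """One Monte Carlo sample of the noise-prediction loss for a single image."""
    t = int(rng.integers(0, len(alpha_bar)))
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, noise, alpha_bar)
    # Assumed conditioning: noisy image concatenated with the control maps.
    net_in = np.concatenate([x_t, control], axis=2)
    pred = eps_model(net_in, t)
    return float(np.mean((noise - pred) ** 2))

def zero_model(net_in, t):
    """Placeholder denoiser that always predicts zero noise for 3 channels."""
    return np.zeros(net_in.shape[:2] + (3,))
```

Here `x0` stands for the clean (normal-light) target of shape `(H, W, 3)` and `control` for a stack of control maps with the same spatial size. In actual training, `eps_model` would be a trainable network and the loss would be averaged over a batch and minimized by gradient descent.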

Index terms

Computer Vision for Transportation, Deep Learning for Visual Perception, Visual Learning