ICRA 2026

SVR-GS: Spatially Variant Regularization for Probabilistic Masks in 3D Gaussian Splatting

Ashkan Taghipour, Vahid Naghshin, Benjamin John Southwell, Farid Boussaid, Hamid Laga, Mohammed Bennamoun

AI summary

SVR-GS cuts the Gaussian count of 3D Gaussian Splatting models by 5.63× on average relative to 3DGS, at a cost of about 0.40 dB PSNR, by rendering a per-pixel spatial mask that concentrates sparsity pressure on low-importance Gaussians.
3D Gaussian Splatting, Probabilistic Masking, Spatial Regularization, Novel View Synthesis, Model Pruning, Real-time Rendering

Problem

Existing mask-based pruning methods apply uniform global regularization, failing to distinguish between high-impact foreground Gaussians and low-importance or occluded ones, which misaligns pruning signals with actual image quality.

Approach

The method renders a per-pixel spatial mask that weights each Gaussian’s existence probability by its visibility-driven contribution along camera rays, applying sparsity pressure precisely where it matters.
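
As a rough illustration of the idea above, the sketch below computes standard front-to-back alpha-compositing weights for the Gaussians along one camera ray and uses them to weight each Gaussian's mask probability, next to a global-mean penalty of the kind described under Problem. The function names, the weighted-sum aggregation, and the toy numbers are assumptions made for illustration; this summary does not specify the paper's exact aggregation or loss, and the actual renderer is implemented in CUDA.

```python
from typing import List, Sequence


def ray_weights(alphas: Sequence[float]) -> List[float]:
    """Front-to-back compositing weights w_i = alpha_i * T_i along one ray,
    where T_i = prod_{j<i} (1 - alpha_j) is the transmittance reaching Gaussian i."""
    weights = []
    transmittance = 1.0
    for alpha in alphas:
        weights.append(alpha * transmittance)
        transmittance *= 1.0 - alpha
    return weights


def global_mean_penalty(mask_probs: Sequence[float]) -> float:
    """Uniform regularizer: mean of all mask probabilities, blind to visibility."""
    return sum(mask_probs) / len(mask_probs)


def spatial_mask_pixel(alphas: Sequence[float], mask_probs: Sequence[float]) -> float:
    """Hypothetical per-pixel mask value: each Gaussian's mask probability
    weighted by its visibility-driven contribution along the ray (assumed form)."""
    return sum(w * m for w, m in zip(ray_weights(alphas), mask_probs))


# Toy ray with three Gaussians sorted front to back (illustrative values only).
alphas = [0.5, 0.5, 0.5]
front_heavy = [1.0, 0.5, 0.0]   # high mask probability on the visible Gaussian
back_heavy = [0.0, 0.5, 1.0]    # high mask probability on the occluded Gaussian

print(global_mean_penalty(front_heavy), global_mean_penalty(back_heavy))
print(spatial_mask_pixel(alphas, front_heavy), spatial_mask_pixel(alphas, back_heavy))
```

With these toy values the global mean is the same for both orderings, while the per-pixel value differs depending on whether the high mask probability sits on the visible or the occluded Gaussian. That difference is the kind of signal a spatially variant regularizer can use and a global mean cannot.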

Key results

  • 1.79× and 5.63× Gaussian count reduction vs. MaskGS and 3DGS on average
  • Average PSNR drop of 0.50 dB vs. MaskGS and 0.40 dB vs. 3DGS across three benchmark datasets
  • CUDA-accelerated spatial mask renderer with analytical gradient derivation
  • Smaller, faster models enabling real-time robotics and AR/VR deployment
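
The two reduction factors above can be turned into fractions of Gaussians kept. The short sketch below does that arithmetic; the helper name is hypothetical, and dividing one averaged factor by the other to compare MaskGS with 3DGS is only a rough estimate, since the source reports averages across datasets rather than per-scene counts.

```python
def remaining_fraction(reduction_factor: float) -> float:
    """Fraction of Gaussians kept after an N-times reduction in count."""
    if reduction_factor <= 0:
        raise ValueError("reduction factor must be positive")
    return 1.0 / reduction_factor


# Average reduction factors as reported in the key results.
REDUCTION_VS_MASKGS = 1.79
REDUCTION_VS_3DGS = 5.63

kept_vs_maskgs = remaining_fraction(REDUCTION_VS_MASKGS)
kept_vs_3dgs = remaining_fraction(REDUCTION_VS_3DGS)

# Rough implied reduction of MaskGS relative to 3DGS (assumes the two
# averages can be divided, which is an approximation).
implied_maskgs_vs_3dgs = REDUCTION_VS_3DGS / REDUCTION_VS_MASKGS

print(f"kept vs. MaskGS: {kept_vs_maskgs:.1%}")
print(f"kept vs. 3DGS:   {kept_vs_3dgs:.1%}")
print(f"implied MaskGS vs. 3DGS reduction: {implied_maskgs_vs_3dgs:.2f}x")
```

Under these assumptions, SVR-GS keeps roughly 56% of the Gaussians that MaskGS keeps and roughly 18% of those in 3DGS.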

Why it matters

Drastically reduces memory and compute requirements for 3D Gaussian Splatting, making high-fidelity novel view synthesis viable for real-time, resource-constrained applications like robotics and mobile AR/VR.

Abstract

3D Gaussian Splatting (3DGS) enables fast, high-quality novel view synthesis but relies on densification followed by pruning to optimize the number of Gaussians. Existing mask-based pruning, such as MaskGS, regularizes the global mean of the mask, which is misaligned with the local per-pixel (per-ray) reconstruction loss that determines image quality along individual camera rays. This paper introduces SVR-GS, a spatially variant regularizer that renders a per-pixel spatial mask from each Gaussian’s effective contribution along the ray, thereby applying sparsity pressure where it matters: on low-importance Gaussians. We explore three spatial-mask aggregation strategies, implement them in CUDA, and conduct a gradient analysis to motivate our final design. Extensive experiments on Tanks&Temples, Deep Blending, and Mip-NeRF360 datasets demonstrate that, on average across the three datasets, the proposed SVR-GS reduces the number of Gaussians by 1.79× compared to MaskGS and 5.63× compared to 3DGS, while incurring only 0.50 dB and 0.40 dB PSNR drops, respectively. These gains translate into significantly smaller, faster, and more memory-efficient models, making them well-suited for real-time applications such as robotics, AR/VR, and mobile perception. Additional materials are available on our project page: https://ashkantaghipour.github.io/svrgs/.

Index terms

Deep Learning Methods; Deep Learning for Visual Perception; RGB-D Perception
