Proxy-GS: Unified Occlusion Priors for Training and Inference in Structured 3D Gaussian Splatting

1 Shanghai Artificial Intelligence Laboratory, 2 Northwestern Polytechnical University, 3 Sichuan University, 4 Hong Kong University of Science and Technology, 5 Shanghai Jiao Tong University

* Denotes Equal Contribution Corresponding author

TL;DR: We introduce Proxy-GS, an occlusion-aware training and inference framework built upon lightweight proxies.

Teaser

Our method achieves continuous real-time rendering while delivering better visual quality in large, heavily occluded scenes.



Abstract

3D Gaussian Splatting (3DGS) has emerged as an efficient approach for achieving photorealistic rendering. Recent MLP-based variants further improve visual fidelity but introduce substantial decoding overhead during rendering. To reduce this computational cost, several pruning strategies and level-of-detail (LOD) techniques have been introduced to cut the number of Gaussian primitives in large-scale scenes. However, our analysis reveals that significant redundancy remains because these methods lack occlusion awareness. In this work, we propose Proxy-GS, a novel pipeline that exploits a proxy to make Gaussians occlusion-aware from any view. At the core of our approach is a fast proxy system capable of producing precise occlusion depth maps at 1000 × 1000 resolution in under 1 ms. This proxy serves two roles. First, it guides the culling of anchors and Gaussians to accelerate rendering. Second, it guides densification towards surfaces during training, avoiding inconsistencies in occluded regions and thus improving rendering quality. In heavily occluded scenarios such as the MatrixCity Streets dataset, Proxy-GS achieves more than a 2.5× speedup over Octree-GS while also improving rendering quality.



Method Overview

Illustration of our proposed Proxy-GS: We first construct a lightweight proxy mesh. During rendering, hardware rasterization produces a depth map in under 1 ms, which is then used to efficiently cull anchors that are occluded. During training, in addition to the same rendering pipeline, we further introduce structure-aware anchor densification, encouraging anchors to grow adaptively along the proxy mesh geometry.
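The depth-based anchor culling described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera, anchors already transformed into camera coordinates, and a proxy depth map stored as a nested list. The function name `cull_occluded_anchors`, the intrinsics `fx, fy, cx, cy`, and the `margin` tolerance are all hypothetical names introduced for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cull_occluded_anchors(
    anchors_cam: Sequence[Tuple[float, float, float]],
    depth_map: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    margin: float = 0.05,
) -> List[bool]:
    """Return one visibility flag per anchor (True = keep for decoding).

    anchors_cam: anchor positions in camera coordinates (z points forward).
    depth_map:   proxy depth per pixel, indexed as depth_map[row][col].
    margin:      assumed depth tolerance; not a value taken from the paper.
    """
    height = len(depth_map)
    width = len(depth_map[0])
    visible = []
    for x, y, z in anchors_cam:
        # Anchors behind the camera can never be seen.
        if z <= 0.0:
            visible.append(False)
            continue
        # Pinhole projection to pixel coordinates.
        u = math.floor(fx * x / z + cx)
        v = math.floor(fy * y / z + cy)
        # Anchors that project outside the image are dropped.
        if not (0 <= u < width and 0 <= v < height):
            visible.append(False)
            continue
        # Keep the anchor only if it is not behind the proxy surface.
        visible.append(z <= depth_map[v][u] + margin)
    return visible


# Toy scene: a flat wall at depth 5 that fills a 4x4 depth map.
wall = [[5.0] * 4 for _ in range(4)]
anchors = [
    (0.0, 0.0, 3.0),    # in front of the wall
    (0.0, 0.0, 8.0),    # hidden behind the wall
    (0.0, 0.0, 5.02),   # just behind the wall, within the margin
    (10.0, 0.0, 1.0),   # projects outside the image
    (0.0, 0.0, -1.0),   # behind the camera
]
flags = cull_occluded_anchors(anchors, wall, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(flags)  # [True, False, True, False, False]
```

Only anchors flagged `True` would be passed to the MLP decoder, which is where the saving comes from in this sketch. A real pipeline would run this test on the GPU against the hardware-rasterized depth map rather than in a Python loop.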



RGB Rendering and the Corresponding Decoded Anchors

In the City Street scenes, Proxy-GS achieves stable real-time rendering while preserving fine-grained visual details. Our approach substantially reduces the number of anchors that need to be decoded, leading to significant improvements in both memory efficiency and rendering speed. The top-right inset shows a top-down visualization of all anchors, where the anchors highlighted in red indicate those used by the decoder for the current frame.

[Figure: side-by-side comparison panels for each scene, labeled Scaffold-GS, Octree-GS, and Proxy-GS.]

Rendering Time Distribution Across Pipeline Stages

Compared to existing baselines, Proxy-GS enables virtually cost-free depth-based occlusion culling, which in turn significantly reduces the overall rendering time in large scenes with heavy occlusions, while introducing no additional burden in smaller or less-occluded environments.


Dependency on Proxy Quality and Robustness to Proxy Imperfections

In MLP-based 3DGS, the Gaussians generated from an anchor are typically displaced from it by spatial offsets, so anchors do not have to lie exactly on the scene surface. As a result, the proxy mesh used by Proxy-GS does not need to be perfectly accurate.
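One way to see why small proxy errors can be tolerated is a toy depth test with a tolerance. The sketch below is an assumption made for illustration, not a mechanism stated by the authors: `surviving_fraction` and its `margin` parameter are hypothetical, and the depth values are made-up numbers chosen so that the proxy is off by a few centimetres from a surface at depth 10.

```python
from typing import Sequence


def surviving_fraction(
    anchor_depths: Sequence[float],
    proxy_depths: Sequence[float],
    margin: float,
) -> float:
    """Fraction of anchors that pass the test anchor_depth <= proxy_depth + margin."""
    kept = sum(
        1 for a, p in zip(anchor_depths, proxy_depths) if a <= p + margin
    )
    return kept / len(anchor_depths)


# Anchors sit slightly off a true surface at depth 10 (illustrative values).
anchor_depths = [10.02, 9.98, 10.03, 9.97]
# An imperfect proxy reports depths that are off by a few centimetres.
proxy_depths = [9.95, 10.05, 9.96, 10.04]

strict = surviving_fraction(anchor_depths, proxy_depths, margin=0.0)
tolerant = surviving_fraction(anchor_depths, proxy_depths, margin=0.1)
print(strict, tolerant)  # 0.5 1.0
```

With no tolerance, half of the surface anchors in this toy case are wrongly culled. With a margin larger than the proxy error, all of them survive, at the cost of keeping some anchors that lie just behind the surface.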