1 of 23

2D Gaussian Splatting for Geometrically Accurate Radiance Fields

ShanghaiTech University and Tübingen AI Center

2 of 23

Gaussian Splatting

  • Models a 3D scene using a large number of Gaussians, each with learned mean, covariance, and opacity
  • Renders very quickly and can synthesize views from arbitrary viewpoints and directions
  • Learning process alternates between gradient descent on reconstruction loss of specific pixels, duplicating Gaussians in under-reconstructed regions, and splitting Gaussians when variance becomes too large
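
The rendering side of this alternating scheme is alpha compositing of depth-sorted splats. A minimal sketch for a single pixel, assuming a simplified splat layout (depth, 2D mean, 2x2 inverse covariance, opacity, scalar color) rather than the real data structures:

```python
import math

def gauss2d(x, mean, inv_cov):
    """Unnormalized 2D Gaussian at screen-space point x."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    m = dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy) \
        + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy)
    return math.exp(-0.5 * m)

def composite_pixel(splats, px):
    """Front-to-back alpha compositing of depth-sorted splats at one pixel.

    Each splat: (depth, mean, inv_cov, opacity, color).
    Color is a scalar here purely for illustration.
    """
    color, T = 0.0, 1.0
    for depth, mean, inv_cov, opacity, c in sorted(splats):
        alpha = min(0.99, opacity * gauss2d(px, mean, inv_cov))
        color += T * alpha * c
        T *= 1.0 - alpha
        if T < 1e-4:  # early termination once the ray is saturated
            break
    return color
```

Because the composite is differentiable in each mean, covariance, opacity, and color, gradient descent on a pixel reconstruction loss can flow back into every splat that touched the pixel.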

3 of 23

3D Gaussian Splatting

[Figure: rendered color and surface normals]

4 of 23

Issue with 3dGS

  • Much of what we see is surfaces: objects are solid, but only their thin outer surface is visible, e.g. tabletops and leaves
  • 3d Gaussians have trouble representing flat surfaces because they are volumetric blobs: even a squashed Gaussian is more spherical than flat
  • Furthermore, the extra dimension that should not be there can interfere with reconstruction and learning from other view angles
  • Viewing 3d geometry from different angles means it is no longer multi-view consistent, i.e. when a surface is represented by multiple Gaussians, their shading does not change coherently as the viewpoint rotates

5 of 23

Issue with 3dGS

6 of 23

Issue with 3dGS

[Figure: rendered color and surface normals]

7 of 23

Related Works

  • Zwicker et al. 2001a, b introduce 2d surface elements (surfels) to approximate object surfaces locally
    • Requires some prior on the ground-truth geometry, which is usually not given, and some desired shapes are difficult to make differentiable
  • Niemeyer et al. 2020 introduce a neural implicit approach, i.e. learn a network mapping 3d points to a scalar, with the surface defined where the function equals 0
    • The surface is extracted by the marching cubes algorithm, where each voxel is classified by evaluating the function at its corners and looking up the triangulation in a fixed table of surface configurations
  • Volume rendering, e.g. NeRF, samples points along rays
  • Existing methods do not scale well, averaging around 128 GPU hours to learn a single unbounded scene

8 of 23

2D Gaussian Splatting

  • Note that 2d Gaussian splatting is actually a special case of 3d splatting where the third dimension is constrained to have zero variance, i.e. a Dirac delta distribution
  • Can very easily model surfaces, even curved surfaces, given sufficiently many Gaussians
  • Avoids the problems of the extra dimension: a 3d Gaussian occluding a ray will inhibit training when viewed from that particular direction, whereas a 2d Gaussian will learn to rotate to avoid the ray
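
The degeneracy argument can be made concrete: building the 3D covariance Σ = R diag(s)² Rᵀ with the third scale forced to zero yields a rank-2 matrix whose mass lies entirely in the splat's tangent plane. A sketch (identity rotation chosen purely for illustration):

```python
def covariance_3d(R, scales):
    """Sigma = R * diag(s)^2 * R^T; a 2D splat sets scales[2] = 0."""
    S2 = [s * s for s in scales]
    return [[sum(R[i][k] * S2[k] * R[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]

R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # identity rotation for illustration
sigma = covariance_3d(R, [0.5, 0.3, 0.0])  # third scale zero: a flat splat
# sigma has rank 2: the Gaussian is a planar disk, not a volumetric blob
```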

9 of 23

2D Gaussian Splatting

[Figure: rendered color and surface normals]

10 of 23

2D Gaussian Splatting

[Figure: rendered color and surface normals]

11 of 23

Method

12 of 23

Training

  • Huang et al. use two regularizers to address issues not found in 3dGS
  • Depth distortion - 2d surfaces by nature tend to be much more sparse (i.e. take up much less volume) than 3d objects
    • E.g. if we have a cube that we want to model and render, then we only care about the 6 surfaces, not whether it is hollow or not
    • The Gaussians that comprise the same surfaces should be close together
  • Normal consistency - 2d splats in a local neighborhood should be aligned and form a coherent surface
    • Want their normals to be in similar directions

13 of 23

Depth Distortion

  • Goal is to concentrate the weight distribution along each ray by minimizing the distances between ray-splat intersections

  • Distance between ray-splat intersections, weighted by “importance” of the splats relative to the ray
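
A sketch of this loss in its naive O(n²) form, assuming per-ray blending weights ωᵢ = αᵢTᵢ and intersection depths zᵢ (the actual implementation accumulates running sums for efficiency):

```python
def depth_distortion(weights, depths):
    """L_d = sum over pairs (i, j) of w_i * w_j * |z_i - z_j|.

    weights: per-splat blending weights along one ray (w_i = alpha_i * T_i),
    depths:  corresponding ray-splat intersection depths.
    The loss is minimized when the high-weight intersections share one
    depth, pulling the splats of a surface onto a thin shell.
    """
    return sum(wi * wj * abs(zi - zj)
               for wi, zi in zip(weights, depths)
               for wj, zj in zip(weights, depths))
```

When all intersections coincide the loss is zero; spreading weight across different depths is penalized in proportion to how far apart the intersections sit.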

14 of 23

Normal Consistency

  • Want to make sure that splats are locally aligned, but during initial stages of training when most splats are still semi-transparent, it is not clear which ones are important
  • Goal is to align to the “median” point of intersection, align “most of the splats”
  • Towards the end of training, just the foremost surface

  • Why go through so much work to compute median normal? Usually this is just the normal of the median splat
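
A hedged sketch of the consistency term, assuming the form L_n = Σᵢ ωᵢ (1 − nᵢᵀN), where N is a unit normal estimated from the gradient of the rendered depth map (the paper derives N from finite differences of nearby depth points):

```python
def normal_consistency(weights, splat_normals, depth_normal):
    """L_n = sum_i w_i * (1 - n_i . N)

    weights:       per-splat blending weights along the ray,
    splat_normals: unit normals of the splats the ray intersects,
    depth_normal:  unit normal estimated from the depth-map gradient.
    Zero when every weighted splat normal agrees with the depth normal.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(w * (1.0 - dot(n, depth_normal))
               for w, n in zip(weights, splat_normals))
```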

15 of 23

Normal Consistency

16 of 23

Training

  • Every 3000 gradient descent steps, remove splats with opacity lower than 0.05
  • Both under- and over-reconstructed regions have large positional gradients that attempt to move the Gaussians to fix this, so both are detected with a gradient threshold of 0.0002
  • In under-reconstructed regions, the splat is duplicated and moved in the direction of the positional gradient
  • In over-reconstructed regions, a large splat is split into copies scaled down by a factor of 1.6
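
These pruning and densification rules can be sketched as one pass over the splat list. The dict field names and the split-size cutoff (`big_scale`) are assumptions; the thresholds 0.05, 0.0002, and the 1.6 shrink factor follow the slide:

```python
def adapt_density(splats, grad_thresh=0.0002, min_opacity=0.05,
                  big_scale=0.01, shrink=1.6):
    """One prune/densify pass (illustrative; field names are assumptions).

    Each splat is a dict with 'opacity', 'pos_grad' (positional-gradient
    norm), and 'scale' (largest extent).
    """
    out = []
    for s in splats:
        if s['opacity'] < min_opacity:
            continue                        # prune near-transparent splats
        if s['pos_grad'] <= grad_thresh:
            out.append(s)                   # well reconstructed: keep as-is
        elif s['scale'] > big_scale:        # over-reconstructed: split
            half = dict(s, scale=s['scale'] / shrink)
            out += [half, dict(half)]
        else:                               # under-reconstructed: duplicate
            out += [s, dict(s)]             # copy then drifts along the gradient
    return out
```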

17 of 23

Rendering

  • Renders very quickly: a ray intersects each flat splat at a single point, so rendering can evaluate the Gaussian exactly at the ray-splat intersection
  • When viewed edge-on, a 2d Gaussian degenerates into an infinitely thin line, so a low-pass filter is used to guarantee that it is always rendered as at least a very thin ellipse
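
One way to sketch the filter: take the maximum of the exact ray-splat Gaussian value and a fixed screen-space Gaussian centered at the projected splat center, which enforces a minimum footprint. The default σ value here is an assumption:

```python
import math

def filtered_splat_value(g_exact, px, center, sigma=0.7071):
    """Clamp a 2D splat's screen footprint to roughly one pixel.

    g_exact: Gaussian value at the exact ray-splat intersection, which
    collapses toward zero width when the splat is seen edge-on.
    Taking the max with a fixed screen-space Gaussian (sigma in pixels;
    the value here is an assumption) guarantees a minimum footprint.
    """
    dx, dy = px[0] - center[0], px[1] - center[1]
    g_screen = math.exp(-0.5 * (dx * dx + dy * dy) / (sigma * sigma))
    return max(g_exact, g_screen)
```

Near the projected center the screen-space term dominates, so even a perfectly edge-on splat still covers a thin ellipse of pixels instead of vanishing.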

18 of 23

Rendering

  • To extract meshes, render depth maps from the training views, fuse them with truncated signed distance fusion (TSDF) using a voxel size of 0.004, and run marching cubes
  • 2dGS provides a good representation of the scene but does not necessarily render best from all view angles, so it is converted to a mesh
  • A triangulated mesh can also represent straight edges more efficiently
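
Per voxel, TSDF fusion is a running weighted average of truncated signed distances across depth maps; marching cubes then extracts the zero level set. A minimal per-voxel sketch (the truncation band as a multiple of the 0.004 voxel size is an assumption):

```python
def tsdf_update(tsdf, weight, sdf, trunc=0.004 * 5, new_weight=1.0):
    """Running weighted-average TSDF update for a single voxel.

    sdf: signed distance of the voxel to the surface seen in one depth
    map (depth along the ray minus voxel depth). Distances are clamped
    to [-trunc, trunc]; trunc here is a hypothetical multiple of the
    voxel size. Returns (fused_tsdf, new_accumulated_weight).
    """
    d = max(-trunc, min(trunc, sdf))
    fused = (tsdf * weight + d * new_weight) / (weight + new_weight)
    return fused, weight + new_weight
```

Fusing every training-view depth map this way averages out per-view noise, which is why the extracted mesh can be cleaner than any single rendered depth map.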

19 of 23

Results

20 of 23

Comparison with other Gaussian Splat Methods

21 of 23

Note: SuGaR is a 3dGS method that encourages Gaussians to be flatter and to align with the density gradient, i.e. to roughly align with the “surface” formed by neighboring Gaussians

22 of 23

Effect of Regularizations

23 of 23

Questions?