1 of 23

2D Gaussian Splatting for Geometrically Accurate Radiance Fields

ShanghaiTech University and Tübingen AI Center

2 of 23

Gaussian Splatting

  • Models a 3D scene using a large number of Gaussians, each with learned mean, covariance, and opacity
  • Renders very quickly and can synthesize views from arbitrary viewpoints and directions
  • Learning process alternates between gradient descent on reconstruction loss of specific pixels, duplicating Gaussians in under-reconstructed regions, and splitting Gaussians when variance becomes too large
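
The rendering side of this alternating scheme is alpha compositing of depth-sorted splats. A minimal sketch for a single pixel, assuming a simplified splat layout (depth, 2D mean, 2x2 inverse covariance, opacity, scalar color) rather than the real data structures:

```python
import math

def gauss2d(x, mean, inv_cov):
    """Unnormalized 2D Gaussian at screen-space point x."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    m = dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy) \
        + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy)
    return math.exp(-0.5 * m)

def composite_pixel(splats, px):
    """Front-to-back alpha compositing of depth-sorted splats at one pixel.

    Each splat: (depth, mean, inv_cov, opacity, color).
    Color is a scalar here purely for illustration.
    """
    color, T = 0.0, 1.0
    for depth, mean, inv_cov, opacity, c in sorted(splats):
        alpha = min(0.99, opacity * gauss2d(px, mean, inv_cov))
        color += T * alpha * c
        T *= 1.0 - alpha
        if T < 1e-4:  # early termination once the ray is saturated
            break
    return color
```

Because the composite is differentiable in each mean, covariance, opacity, and color, gradient descent on a pixel reconstruction loss can flow back into every splat that touched the pixel.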

3 of 23

3D Gaussian Splatting

[Figure: rendered color and surface normals]

4 of 23

Issue with 3dGS

  • Much of what we see is surfaces: objects are solid, but only their thin outer surface is visible, e.g. tabletops and leaves
  • 3d Gaussians have trouble representing flat surfaces because they are volumetric blobs: even a squashed Gaussian is more spherical than flat
  • Furthermore, the extra dimension that should not be there can interfere with reconstruction and learning from other view angles
  • Viewing 3d geometry from different angles means it is no longer multi-view consistent, i.e. when a surface is represented by multiple Gaussians, their shading does not change coherently as the viewpoint rotates

5 of 23

Issue with 3dGS

6 of 23

Issue with 3dGS

[Figure: rendered color and surface normals]

7 of 23

Related Works

  • Zwicker et al. 2001a, b introduce 2d surface elements (surfels) to approximate object surfaces locally
    • Requires some prior on the ground-truth geometry, which is usually not given, and some desired shapes are difficult to make differentiable
  • Niemeyer et al. 2020 introduce a neural implicit approach, i.e. learn a network mapping 3d points to a scalar, with the surface defined where the function equals 0
    • The surface is extracted by the marching cubes algorithm, where each voxel is classified by evaluating the function at its corners and looking up the triangulation in a fixed table of surface configurations
  • Volume rendering, e.g. NeRF, samples points along rays
  • Existing methods do not scale well, averaging around 128 GPU hours to learn a single unbounded scene

8 of 23

2D Gaussian Splatting

  • Note that 2d Gaussian splatting is actually a special case of 3d splatting where the third dimension is constrained to have zero variance, i.e. a Dirac delta distribution
  • Can very easily model surfaces, even curved surfaces, given sufficiently many Gaussians
  • Avoids the problems of the extra dimension: a 3d Gaussian occluding a ray will inhibit training when viewed from that particular direction, whereas a 2d Gaussian will learn to rotate to avoid the ray
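
The degeneracy argument can be made concrete: building the 3D covariance Σ = R diag(s)² Rᵀ with the third scale forced to zero yields a rank-2 matrix whose mass lies entirely in the splat's tangent plane. A sketch (identity rotation chosen purely for illustration):

```python
def covariance_3d(R, scales):
    """Sigma = R * diag(s)^2 * R^T; a 2D splat sets scales[2] = 0."""
    S2 = [s * s for s in scales]
    return [[sum(R[i][k] * S2[k] * R[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]

R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # identity rotation for illustration
sigma = covariance_3d(R, [0.5, 0.3, 0.0])  # third scale zero: a flat splat
# sigma has rank 2: the Gaussian is a planar disk, not a volumetric blob
```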

9 of 23

2D Gaussian Splatting

[Figure: rendered color and surface normals]

10 of 23

2D Gaussian Splatting

[Figure: rendered color and surface normals]

11 of 23

Method

12 of 23

Training

  • Huang et al. use two regularizers to address issues not found in 3dGS
  • Depth distortion - 2d surfaces by nature tend to be much more sparse (i.e. take up much less volume) than 3d objects
    • E.g. if we have a cube that we want to model and render, then we only care about the 6 surfaces, not whether it is hollow or not
    • The Gaussians that comprise the same surfaces should be close together
  • Normal consistency - 2d splats in a local neighborhood should be aligned and form a coherent surface
    • Want their normals to be in similar directions

13 of 23

Depth Distortion

  • Goal is to concentrate the weight distribution along each ray by minimizing the distances between ray-splat intersections

  • Distance between ray-splat intersections, weighted by “importance” of the splats relative to the ray
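
A sketch of this loss in its naive O(n²) form, assuming per-ray blending weights ωᵢ = αᵢTᵢ and intersection depths zᵢ (the actual implementation accumulates running sums for efficiency):

```python
def depth_distortion(weights, depths):
    """L_d = sum over pairs (i, j) of w_i * w_j * |z_i - z_j|.

    weights: per-splat blending weights along one ray (w_i = alpha_i * T_i),
    depths:  corresponding ray-splat intersection depths.
    The loss is minimized when the high-weight intersections share one
    depth, pulling the splats of a surface onto a thin shell.
    """
    return sum(wi * wj * abs(zi - zj)
               for wi, zi in zip(weights, depths)
               for wj, zj in zip(weights, depths))
```

When all intersections coincide the loss is zero; spreading weight across different depths is penalized in proportion to how far apart the intersections sit.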

14 of 23

Normal Consistency

  • Want to make sure that splats are locally aligned, but during initial stages of training when most splats are still semi-transparent, it is not clear which ones are important
  • Goal is to align to the “median” point of intersection, align “most of the splats”
  • Towards the end of training, just the foremost surface

  • Why go through so much work to compute median normal? Usually this is just the normal of the median splat
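
A hedged sketch of the consistency term, assuming the form L_n = Σᵢ ωᵢ (1 − nᵢᵀN), where N is a unit normal estimated from the gradient of the rendered depth map (the paper derives N from finite differences of nearby depth points):

```python
def normal_consistency(weights, splat_normals, depth_normal):
    """L_n = sum_i w_i * (1 - n_i . N)

    weights:       per-splat blending weights along the ray,
    splat_normals: unit normals of the splats the ray intersects,
    depth_normal:  unit normal estimated from the depth-map gradient.
    Zero when every weighted splat normal agrees with the depth normal.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(w * (1.0 - dot(n, depth_normal))
               for w, n in zip(weights, splat_normals))
```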

15 of 23

Normal Consistency

16 of 23

Training

  • Every 3000 gradient descent steps, remove splats with opacity lower than 0.05
  • Both under- and over-reconstructed regions have large positional gradients that attempt to move the Gaussians to fix this, so both are detected with a gradient threshold of 0.0002
  • In under-reconstructed regions, the splat is duplicated and moved in the direction of the positional gradient
  • In over-reconstructed regions, a large splat is split into copies scaled down by a factor of 1.6
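
These pruning and densification rules can be sketched as one pass over the splat list. The dict field names and the split-size cutoff (`big_scale`) are assumptions; the thresholds 0.05, 0.0002, and the 1.6 shrink factor follow the slide:

```python
def adapt_density(splats, grad_thresh=0.0002, min_opacity=0.05,
                  big_scale=0.01, shrink=1.6):
    """One prune/densify pass (illustrative; field names are assumptions).

    Each splat is a dict with 'opacity', 'pos_grad' (positional-gradient
    norm), and 'scale' (largest extent).
    """
    out = []
    for s in splats:
        if s['opacity'] < min_opacity:
            continue                        # prune near-transparent splats
        if s['pos_grad'] <= grad_thresh:
            out.append(s)                   # well reconstructed: keep as-is
        elif s['scale'] > big_scale:        # over-reconstructed: split
            half = dict(s, scale=s['scale'] / shrink)
            out += [half, dict(half)]
        else:                               # under-reconstructed: duplicate
            out += [s, dict(s)]             # copy then drifts along the gradient
    return out
```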

17 of 23

Rendering

  • Renders very quickly: a ray intersects each flat splat at a single point, so rendering can evaluate the Gaussian exactly at the ray-splat intersection
  • When viewed edge-on, a 2d Gaussian degenerates into an infinitely thin line, so a low-pass filter is used to guarantee that it is always rendered as at least a very thin ellipse
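
One way to sketch the filter: take the maximum of the exact ray-splat Gaussian value and a fixed screen-space Gaussian centered at the projected splat center, which enforces a minimum footprint. The default σ value here is an assumption:

```python
import math

def filtered_splat_value(g_exact, px, center, sigma=0.7071):
    """Clamp a 2D splat's screen footprint to roughly one pixel.

    g_exact: Gaussian value at the exact ray-splat intersection, which
    collapses toward zero width when the splat is seen edge-on.
    Taking the max with a fixed screen-space Gaussian (sigma in pixels;
    the value here is an assumption) guarantees a minimum footprint.
    """
    dx, dy = px[0] - center[0], px[1] - center[1]
    g_screen = math.exp(-0.5 * (dx * dx + dy * dy) / (sigma * sigma))
    return max(g_exact, g_screen)
```

Near the projected center the screen-space term dominates, so even a perfectly edge-on splat still covers a thin ellipse of pixels instead of vanishing.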

18 of 23

Rendering

  • To extract meshes, render depth maps from the training views, fuse them with truncated signed distance fusion (TSDF) using a voxel size of 0.004, and run marching cubes
  • 2dGS provides a good representation of the scene but does not necessarily render best from all view angles, so it is converted to a mesh
  • A triangulated mesh can also represent straight edges more efficiently
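
Per voxel, TSDF fusion is a running weighted average of truncated signed distances across depth maps; marching cubes then extracts the zero level set. A minimal per-voxel sketch (the truncation band as a multiple of the 0.004 voxel size is an assumption):

```python
def tsdf_update(tsdf, weight, sdf, trunc=0.004 * 5, new_weight=1.0):
    """Running weighted-average TSDF update for a single voxel.

    sdf: signed distance of the voxel to the surface seen in one depth
    map (depth along the ray minus voxel depth). Distances are clamped
    to [-trunc, trunc]; trunc here is a hypothetical multiple of the
    voxel size. Returns (fused_tsdf, new_accumulated_weight).
    """
    d = max(-trunc, min(trunc, sdf))
    fused = (tsdf * weight + d * new_weight) / (weight + new_weight)
    return fused, weight + new_weight
```

Fusing every training-view depth map this way averages out per-view noise, which is why the extracted mesh can be cleaner than any single rendered depth map.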

19 of 23

Results

20 of 23

Comparison with other Gaussian Splat Methods

21 of 23

Note: SuGaR is a 3dGS method that encourages Gaussians to be flatter and to align with the density gradient, i.e. to roughly align with the “surface” formed by neighboring Gaussians

22 of 23

Effect of Regularizations

23 of 23

Questions?