Volume Rendering

Where we are

2:00 - 2:30: Bird's Eye View & Background - Angjoo Kanazawa

2:30 - 3:00: Deep Dive into the Volumetric Rendering Function - Ben Mildenhall

3:00 - 3:30: Encoding and Representing 3D Volumes - Matt Tancik

3:30 - 4:00: 30 min break ☕️

4:00 - 4:30: Signal Processing Considerations - Pratul Srinivasan

4:30 - 5:00: Challenges - Angjoo Kanazawa

5:00 - 5:30: Tutorial + Q&A

Neural Volumetric Rendering

  • Computing color along rays through 3D space ("What color is this pixel?")
  • Continuous, differentiable rendering model without concrete ray/surface intersections
  • Using a neural network as a scene representation, rather than a voxel grid of data

[Figure: a neural network maps a 3D position to scene properties]

Cameras and rays

  • We need the mathematical mapping from (camera, pixel) → ray
  • Then we can abstract the underlying problem as learning the function ray → color (the "plenoptic function")

[Figure: a camera emits a ray through a pixel into the scene]

Coordinate frames + Transforms: world-to-camera

[Figure: world coordinates → camera coordinates → image coordinates. Figure credit: Peter Hedman]

Extrinsics (R, T): orientation + location of the camera in the world

Intrinsics (K): how the camera maps a point in 3D to the image

Coordinate frames + Transforms: camera-to-world

[Figure: image coordinates → camera coordinates → world coordinates. Figure credit: Peter Hedman]

Extrinsics (R, T): orientation + location of the camera in the world

Intrinsics (K): how the camera maps a point in the image to 3D

Camera pose - pixel to camera

  • Mapping from (camera, pixel) to a ray in the camera coordinate frame
  • This coordinate system has the camera situated at the origin, with right/up/backwards aligned to the x/y/z axes
    • Axis convention varies in different codebases :(
  • "Inverse intrinsic matrix" in a computer vision sense

[Figure: camera coordinate axes X, Y, Z]

Camera pose - pixel to camera

[Figure: camera frustum shown in a 3D view, a top view (looking along Y), and a side view (looking along X); pixel (i, j) is marked on the image plane, which sits at distance f (the focal length) from the origin]

Coordinates are in pixel space.

Camera pose - intrinsic

If the image maps to $[0, W] \times [0, H]$, then pixel $(i, j)$ sits at coordinates $(i, j)$, where $W \times H$ is the image resolution.

Camera pose - pixel to camera

Recenter using the pixel coordinates of the image center: $(i, j) \mapsto \left(i - \tfrac{W}{2},\; j - \tfrac{H}{2}\right)$.

Camera pose - pixel to camera

Rescale the frustum by the focal length so that the image plane is at distance 1: divide by $f$, giving $\left(\frac{i - W/2}{f},\; \frac{j - H/2}{f}\right)$.

Camera pose - intrinsic

The image now maps to $\left[-\frac{W}{2f}, \frac{W}{2f}\right] \times \left[-\frac{H}{2f}, \frac{H}{2f}\right]$, where the focal length $f$ controls the amount of "zoom".

Camera pose - pixel to camera

The full mapping from pixel to 3D coordinates of a point on the image plane is

$$(i, j) \mapsto \left(\frac{i - W/2}{f},\; -\frac{j - H/2}{f},\; -1\right)$$

(camera looking along $-z$; the $y$ flip accounts for pixel $j$ increasing downward).

The camera-space ray points from the origin toward this point.
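A minimal NumPy sketch of this mapping, assuming the right/up/backwards axis convention above (all names here are mine, not from the slides):

```python
import numpy as np

def pixel_to_camera_ray(i, j, W, H, f):
    """Map pixel (i, j) to a ray in the camera frame.

    Assumes the camera sits at the origin looking along -z, with
    right/up/backwards aligned to +x/+y/+z (conventions vary across codebases!).
    """
    x = (i - 0.5 * W) / f     # recenter, then rescale by the focal length
    y = -(j - 0.5 * H) / f    # pixel j grows downward, but camera +y points up
    z = -1.0                  # image plane at distance 1 in front of the camera
    origin = np.zeros(3)      # all camera-space rays start at the origin
    direction = np.array([x, y, z])
    return origin, direction
```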


Camera pose - pixel to camera

  • Omitted details
    • Half-pixel offset — add 0.5 to i and j so ray precisely hits pixel center
    • This is a perfect pinhole model — typically we need to add a distortion model to correct for errors found in real cameras


Camera pose - camera to world

  • Simply apply rigid rotation and translation to origin and image plane points (six degrees of freedom).
  • This positions the camera in “world space”.

[Figure: apply a rigid transformation (R, T) to the ray origin and direction]
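A sketch of this step under the same assumptions, with R (3x3 rotation) and t (translation) the extrinsics from earlier:

```python
def camera_to_world_ray(origin_cam, dir_cam, R, t):
    """Apply the rigid camera-to-world transform x_world = R @ x_cam + t."""
    origin_world = R @ origin_cam + t   # the camera center lands at t
    dir_world = R @ dir_cam             # directions rotate but do not translate
    return origin_world, dir_world
```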

Calculating points along a ray

$$\mathbf{r}(t) = \mathbf{o} + t\,\mathbf{d}$$

The scalar $t$ controls distance along the ray.
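In code (continuing the sketch above):

```python
def points_along_ray(origin, direction, t_values):
    """Sample 3D points r(t) = o + t * d for an array of t values."""
    return origin[None, :] + t_values[:, None] * direction[None, :]
```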

Neural Volumetric Rendering

Continuous, differentiable rendering model without concrete ray/surface intersections.

Surface vs. volume rendering

[Figure: a camera shoots a ray into a scene representation]

We want to know how the ray interacts with the scene.

[Figure: the ray is checked against each piece of scene geometry]

Surface rendering — loop over geometry, check for ray hits

[Figure: the scene is queried at sample points along the ray]

Volume rendering — loop over ray points, query geometry

History of volume rendering

Early computer graphics

  • Theory of volume rendering co-opted from physics in the 1980s: absorption, emission, out-scattering/in-scattering
  • Adapted for visualising medical data and linked with alpha compositing
  • Modern path tracers use sophisticated Monte Carlo methods to render volumetric effects

References: Chandrasekhar 1950, Radiative Transfer; Kajiya 1984, Ray Tracing Volume Densities

[Figure: ray tracing a simulated cumulus cloud, from Kajiya]

Alpha compositing

  • Alpha rendering developed for digital compositing in VFX movie production

Reference: Porter and Duff 1984, Compositing Digital Images

[Figure: alpha compositing example, from Porter and Duff]

Volume rendering for visualization

  • Volume rendering applied to visualise 3D medical scan data in the 1990s

References: Levoy 1988, Display of Surfaces from Volume Data; Max 1995, Optical Models for Direct Volume Rendering

[Figure: medical data visualisation, from Levoy]

Volume rendering for surfaces

Geometry and materials can be stored per-voxel and used with standard surface rendering methods:

  • Sparse voxel octrees
  • Voxel hashing
  • Anisotropic radiative transfer

Computer vision — space carving

The power of analysis-by-synthesis: could this have been done 20 years ago? (Space carving used a different optimization method.)

Kutulakos and Seitz, A Theory of Shape by Space Carving, IJCV 2000

Volume rendering derivations

Real graphics formulation for path tracing

  • Emission, absorption, scattering
  • Simplified to the "optical model" (Max 1995)

[Figure: absorption, scattering, and emission of light in a participating medium. Slide credit: Novak et al 2018, Monte Carlo methods for physically based volume rendering; images from commons.wikimedia.org and wikipedia.org]

Simplify: the optical model keeps absorption and emission, and drops scattering.

Derivation roadmap:

  • Probabilistic interpretation
  • Derivation of transmittance T and the density equations
  • Quadrature
  • The weight PDF is an important quantity: continuous and discrete (quadrature) versions of the accumulation weights
  • Trivially differentiable
  • Expected value of non-color quantities (depth, etc.)
  • Other statistics of the weight PDF (e.g. median depth)

Volumetric formulation for NeRF

The scene is a cloud of tiny colored particles.

Max and Chen 2010, Local and Global Illumination in the Volume Rendering Integral

If a ray traveling through the scene hits a particle at distance $t$ along the ray, we return its color $\mathbf{c}(t)$.

[Figure: camera ray traveling through the particle cloud]

What does it mean for a ray to "hit" the volume?

This notion is probabilistic: the chance that the ray hits a particle in a small interval around $t$ is $\sigma(t)\,dt$.

$\sigma$ is called the "volume density".

Probabilistic interpretation

To determine if $t$ is the first hit along the ray, we need to know $T(t)$: the probability that the ray makes it through the volume up to $t$ (i.e., doesn't hit any particles earlier).

$T(t)$ is called "transmittance".

The product of these probabilities tells us how much you see the particles at $t$:

$$T(t)\,\sigma(t)\,dt$$

We assume $\sigma$ is known and want to use it to calculate $T$.

Calculating $T$ given $\sigma$

If $\sigma$ is known, $T$ can be computed… how?

$\sigma$ and $T$ are related by the probabilistic fact that

$$T(t + dt) = T(t)\,\big(1 - \sigma(t)\,dt\big)$$

(P[no hit before $t + dt$] = P[no hit before $t$] · P[no hit within $dt$]).

Solve for $T$

Taylor expansion for $T$: $T(t + dt) \approx T(t) + T'(t)\,dt$, so

$$T(t) + T'(t)\,dt = T(t) - T(t)\,\sigma(t)\,dt$$

Rearrange (splitting up the differential):

$$\frac{dT}{T} = -\sigma(t)\,dt$$

Integrate (with $T(0) = 1$):

$$\log T(t) = -\int_0^t \sigma(s)\,ds$$

Exponentiate:

$$T(t) = \exp\left(-\int_0^t \sigma(s)\,ds\right)$$
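A quick numerical sanity check of this solution (my own toy example, not from the slides): stepping $T(t + dt) = T(t)(1 - \sigma(t)\,dt)$ forward should converge to the closed form.

```python
import numpy as np

t = np.linspace(0.0, 2.0, 100_001)
dt = t[1] - t[0]
sigma = 0.5 + 0.5 * np.sin(t) ** 2          # arbitrary smooth density
T_stepped = np.cumprod(1.0 - sigma * dt)    # product of per-step survival probabilities
T_closed = np.exp(-np.cumsum(sigma * dt))   # closed form exp(-integral of sigma)
print(np.abs(T_stepped - T_closed).max())   # small, and shrinks as dt -> 0
```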

Density is the score function!

An interesting observation is that density is actually the negative score function of $T$:

$$\sigma(t) = -\frac{d}{dt}\,\log T(t)$$

This is why it doesn't need normalization!

Volumetric formulation for NeRF

Food for thought #1: for a constant-density medium $\sigma(t) = \sigma$ and $T(0) = 1$, we have

$$1 - T(t) = 1 - e^{-\sigma t},$$

which is the exponential distribution CDF.

Food for thought #2: from our derivation,

$$\sigma(t) = -\frac{d}{dt}\,\log T(t),$$

so $\sigma$ acts as the (negative) score function of transmittance.

Interesting: $T$ depends on the current camera ray, but $\sigma$ does not.

PDF for ray termination

Finally, we can compute the probability that a ray terminates at $t$:

$$p(t)\,dt = T(t)\,\sigma(t)\,dt$$

This is the product P[ray hits nothing before $t$] × P[ray hits something in interval $dt$].

(This PDF is important; we will come back to it later.)

Substituting our solution for $T$, we can write the termination probability as a function of $\sigma$ only:

$$p(t) = \exp\left(-\int_0^t \sigma(s)\,ds\right)\sigma(t)$$

Note: the corresponding CDF is $1 - T(t)$ (the probability that the ray does hit something between $0$ and $t$).

Expected value of color along the ray

This means the expected color returned by the ray will be

$$C = \int_0^\infty T(t)\,\sigma(t)\,\mathbf{c}(t)\,dt$$

Note the nested integral! ($T(t)$ itself is an integral of $\sigma$.)

Approximating the nested integral

We use quadrature to approximate the nested integral, splitting the ray up into $N$ segments with endpoints $\{t_1, t_2, \ldots, t_{N+1}\}$ and lengths $\delta_i = t_{i+1} - t_i$.

We assume volume density and color are roughly constant within each interval: $\sigma(t) \approx \sigma_i$ and $\mathbf{c}(t) \approx \mathbf{c}_i$ for $t \in [t_i, t_{i+1}]$.
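One common way to choose the sample locations is stratified sampling, as in the NeRF paper; a minimal sketch (names are assumptions):

```python
import numpy as np

def stratified_samples(t_near, t_far, n, rng=None):
    """One uniform sample per evenly sized bin; diffs give segment lengths."""
    rng = rng if rng is not None else np.random.default_rng()
    edges = np.linspace(t_near, t_far, n + 1)   # n bins along the ray
    return edges[:-1] + rng.random(n) * (edges[1:] - edges[:-1])
```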

Deriving the quadrature estimate

This allows us to break the outer integral into a sum of analytically tractable integrals:

$$C \approx \sum_{i=1}^{N} \sigma_i\,\mathbf{c}_i \int_{t_i}^{t_{i+1}} T(t)\,dt$$

Caveat: piecewise constant density and color do not imply constant transmittance! It is important to account for how the early part of a segment blocks the later part when $\sigma$ is high.

Evaluating $T$ for piecewise constant density

We need to evaluate $T$ at continuous values $t$ that can lie partway through an interval. For $t \in [t_i, t_{i+1}]$,

$$T(t) = \exp\Big(-\sum_{j<i} \sigma_j\,\delta_j\Big)\,\exp\big(-\sigma_i\,(t - t_i)\big)$$

The first factor is "how much light is blocked by all previous segments?"; the second, "how much light is blocked partway through the current segment?".

Deriving the quadrature estimate

Substitute:

$$C \approx \sum_{i=1}^{N} \sigma_i\,\mathbf{c}_i\,T_i \int_{t_i}^{t_{i+1}} e^{-\sigma_i (t - t_i)}\,dt, \qquad T_i = \exp\Big(-\sum_{j<i}\sigma_j\,\delta_j\Big)$$

Integrate:

$$\int_{t_i}^{t_{i+1}} e^{-\sigma_i (t - t_i)}\,dt = \frac{1 - e^{-\sigma_i \delta_i}}{\sigma_i}$$

Cancel the $\sigma_i$ factors:

$$C \approx \sum_{i=1}^{N} T_i\,\big(1 - e^{-\sigma_i \delta_i}\big)\,\mathbf{c}_i$$

Connection to alpha compositing

Define the segment opacity $\alpha_i = 1 - e^{-\sigma_i \delta_i}$. Then

$$T_i = \exp\Big(-\sum_{j<i}\sigma_j\,\delta_j\Big) = \prod_{j<i}\big(1 - \alpha_j\big)$$

and the estimate becomes

$$C \approx \sum_{i=1}^{N} T_i\,\alpha_i\,\mathbf{c}_i,$$

which is exactly front-to-back alpha compositing of the segments.

Summary: volume rendering integral estimate

Rendering model for ray $\mathbf{r}(t) = \mathbf{o} + t\,\mathbf{d}$:

$$C \approx \sum_{i=1}^{N} T_i\,\alpha_i\,\mathbf{c}_i$$

How much light is blocked earlier along the ray: $T_i = \prod_{j<i}(1 - \alpha_j)$

How much light is contributed by ray segment $i$: $\alpha_i = 1 - e^{-\sigma_i \delta_i}$

[Figure: camera ray through a 3D volume, with per-segment colors $\mathbf{c}_i$ and weights $T_i\,\alpha_i$]
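A minimal NumPy sketch of this estimator (array names are my own; sigmas and deltas are per-segment, colors is (N, 3)):

```python
import numpy as np

def volume_render(sigmas, deltas, colors):
    """Estimate C = sum_i T_i * alpha_i * c_i along one ray."""
    alphas = 1.0 - np.exp(-sigmas * deltas)    # segment opacities
    # T_i = prod_{j<i} (1 - alpha_j): exclusive cumulative product
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas]))[:-1]
    weights = trans * alphas                   # rendering weights
    return weights @ colors, weights           # color (3,), weights (N,)
```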

Volume rendering is trivially differentiable

The rendering model $C \approx \sum_i T_i\,\alpha_i\,\mathbf{c}_i$ is differentiable w.r.t. the per-segment densities $\sigma_i$ and colors $\mathbf{c}_i$: it is composed only of exponentials, products, and sums.

Further points on volume rendering

Alpha mattes and compositing

We can compute accumulated weights over some interval $[t_a, t_b]$ along the ray to get an alpha mask for compositing part of the NeRF together with other assets:

$$\alpha_{[t_a, t_b]} = \sum_{i:\, t_i \in [t_a, t_b]} T_i\,\alpha_i$$

If the NeRF is an isolated object in space, then the interval can be the full ray.
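Continuing the volume_render sketch above (t_mids would be per-segment sample locations; the interval bounds are assumptions):

```python
import numpy as np

def alpha_matte(weights, t_mids, t_a=0.0, t_b=np.inf):
    """Accumulated opacity over [t_a, t_b]; the full ray gives an object matte."""
    mask = (t_mids >= t_a) & (t_mids <= t_b)
    return weights[mask].sum()
```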


Mildenhall*, Srinivasan*, Tancik* et al 2020, NeRF

Poole et al 2022, DreamFusion

Tang et al 2022, Compressible-composable NeRF via Rank-residual Decomposition

Rendering weight PDF is important

Remember, the expected color is

$$C = \int T(t)\,\sigma(t)\,\mathbf{c}(t)\,dt \approx \sum_i T_i\,\alpha_i\,\mathbf{c}_i$$

$T(t)\,\sigma(t)$ and $T_i\,\alpha_i$ are the "rendering weights" — a probability distribution along the ray (continuous and discrete, respectively).

Visual intuition — rendering weights are not just a 3D function

[Figure: camera ray passing through a 3D volume]

Rendering weights are not a 3D function — they depend on the ray, because of transmittance!

Rendering weight PDF is important — depth

We can use this distribution to compute expectations of other quantities, e.g. "expected depth":

$$\bar{t} = \int T(t)\,\sigma(t)\,t\,dt \approx \sum_i T_i\,\alpha_i\,t_i$$

This is often how people visualise NeRF depth maps. Alternatively, other statistics like the mode or median can be used.
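With the rendering weights from the earlier sketch, both statistics are one-liners:

```python
import numpy as np

def expected_depth(weights, t_mids):
    """Mean depth under the rendering-weight distribution: sum_i w_i * t_i."""
    return (weights * t_mids).sum()

def median_depth(weights, t_mids):
    """First t where the weight CDF reaches half of its total mass."""
    cdf = np.cumsum(weights)
    return t_mids[np.searchsorted(cdf, 0.5 * cdf[-1])]
```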

Rendering weight PDF is important — depth

[Figure: mean depth vs. median depth visualisations]

Volume rendering other quantities

This idea can be used for any quantity we want to "volume render" into a 2D image. If a quantity $\mathbf{v}(\mathbf{x})$ lives in 3D space (semantic features, normal vectors, etc.), the expectation $\sum_i T_i\,\alpha_i\,\mathbf{v}_i$ can be taken per-ray to produce 2D output images.
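In the sketch's terms, any per-sample quantity renders with the same weighted sum:

```python
def render_quantity(weights, values):
    """Volume render per-sample values (N, D) into a single (D,) output."""
    return weights @ values
```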


Various recent works have used this idea to render higher-level semantic feature maps (e.g., Kobayashi et al 2022, Decomposing NeRF for Editing via Feature Field Distillation, and Neural Feature Fusion Fields).

Density as geometry

[Figure: normal vectors, computed from the analytic gradient of density]

Note: density is a 3D function, but transmittance is light-field-like: it depends on the entire ray, not just a point. (E.g., AutoInt parameterizes ray integrals directly, making it effectively a light field.)

Applications: optimizing differentiable volume rendering

Alpha compositing model in ML/computer vision

Tulsiani et al 2017, Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency

The differentiable ray consistency work used a forward model with "probabilistic occupancy" to supervise 3D-from-single-image prediction: the same rendering model as alpha compositing!

Mid-2010s computer vision/ML: neural volumes and multiplane images

Volume rendering for view synthesis

Neural Volumes (Lombardi et al. 2019): direct gradient descent to optimize an RGBA volume, regularized by a 3D CNN.

Multiplane image methods:

  • Stereo Magnification (Zhou et al. 2018)
  • Pushing the Boundaries… (Srinivasan et al. 2019)
  • Local Light Field Fusion (Mildenhall et al. 2019)
  • DeepView (Flynn et al. 2019)
  • Single-View… (Tucker & Snavely 2020)

Typical deep learning pipelines: images go into a 3D CNN, and a big RGBA 3D volume comes out.

Next: how to represent the scene → Matt Tancik