1 of 48

RT-NeRF: Real-Time On-Device Neural Radiance Fields Towards Immersive AR/VR Rendering

Chaojian Li, Sixu Li, Yang Zhao, Wenbo Zhu, and Yingyan (Celine) Lin

Georgia Institute of Technology

Efficient and Intelligent Computing Lab

2 of 48

NeRF as a Tool to Generate Novel Views

Neural Radiance Fields (NeRF) can generate arbitrary new views of a specific scene given sparsely sampled scene images

Video source: youtu.be/HfJpQCBTqZs

Inputs: Sparsely sampled views

Outputs: Images of any new view

3 of 48

SOTA Efficient NeRF’s Pipeline: How Does It Work?

Volume rendering 1) emits a ray from the origin of the view for each pixel and 2) aggregates the queried features of points along the ray

Video source: [Mildenhall et al., ECCV’20]
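The aggregation step can be sketched with the standard volume rendering quadrature from [Mildenhall et al., ECCV’20]. This is a minimal illustration, not the pipeline's implementation: the function name `composite_ray` and its arguments are hypothetical, and the densities and colors are assumed to have already been queried for each sample point.

```python
import math

def composite_ray(densities, colors, deltas):
    """Aggregate per-point densities and RGB colors along one ray.

    densities: non-negative volume densities (sigma), one per sample point
    colors:    (r, g, b) tuples, one per sample point
    deltas:    distance covered by each sample along the ray
    """
    transmittance = 1.0            # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        weight = transmittance * alpha           # contribution to the pixel
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha             # light left for later points
    return tuple(pixel)
```

Points in empty space (zero density) get zero weight, which is why later slides focus on skipping them.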

4 of 48

Real-Time NeRF Is Increasingly Demanded

Real-Time NeRF can enhance numerous applications and features

Virtual Meetings

Metaverse

Autonomous Driving

Simulation

Source: shorturl.at/gCFMW

Source: shorturl.at/kmnvZ

Source: shorturl.at/fGUY7

…

5 of 48

SOTA Efficient NeRF Desired Real-Time NeRF

Limitation 1: Large memory requirement

Caching intermediate results requires > 54 GB [Garbin et al., ICCV’21]

The Oculus Quest 2 VR headset has only 6 GB of memory

[Chart: memory cost per technique, with 6 GB marked. Techniques: NeRF [Mildenhall et al., ECCV’20], FastNeRF [Garbin et al., ICCV’21], TensoRF [Chen et al., ECCV’22], our RT-NeRF]

6 of 48

SOTA Efficient NeRF Desired Real-Time NeRF

Limitation 1: Large memory requirement

Caching intermediate results requires > 54 GB [Garbin et al., ICCV’21]

The Oculus Quest 2 VR headset has only 6 GB of memory

Limitation 2: Low throughput

Real-time immersive interaction requires > 30 FPS

Rendering 800x800 images on an edge GPU achieves only 0.01 FPS [Chen et al., ECCV’22]

[Charts: memory cost per technique, with 6 GB marked, and throughput per technique, with 30 FPS marked. Techniques: NeRF [Mildenhall et al., ECCV’20], FastNeRF [Garbin et al., ICCV’21], TensoRF [Chen et al., ECCV’22], our RT-NeRF]

7 of 48

Contribution 1: Analyze the Efficiency Bottlenecks

Runtime breakdown on a SOTA efficient NeRF solution [Chen et al., ECCV’22]

Source: [Mildenhall et al., ECCV’20]

[Chart: runtime breakdown, axis from 0% to 100%]

Map pixels to rays
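As an illustration of this first step, the sketch below maps one pixel to a ray under a simple pinhole camera model. The camera model, the function name `pixel_to_ray`, and the parameters `focal`, `width`, and `height` are assumptions made for the example; the slides do not specify them.

```python
import math

def pixel_to_ray(u, v, focal, width, height):
    """Return (origin, unit direction) of the ray through pixel (u, v).

    Assumes a pinhole camera at the origin looking down +z, with the
    principal point at the image center (hypothetical setup).
    """
    # Offset by 0.5 so the ray passes through the pixel center.
    dx = (u + 0.5 - width / 2.0) / focal
    dy = (v + 0.5 - height / 2.0) / focal
    dz = 1.0
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    origin = (0.0, 0.0, 0.0)   # camera center in camera coordinates
    return origin, (dx / norm, dy / norm, dz / norm)
```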

8 of 48

Contribution 1: Analyze the Efficiency Bottlenecks

Runtime breakdown on a SOTA efficient NeRF solution [Chen et al., ECCV’22]

Source: [Mildenhall et al., ECCV’20]

[Chart: runtime breakdown, axis from 0% to 100%]

Query the features of points along the rays

9 of 48

Contribution 1: Analyze the Efficiency Bottlenecks

Runtime breakdown on a SOTA efficient NeRF solution [Chen et al., ECCV’22]

Source: [Mildenhall et al., ECCV’20]

[Chart: runtime breakdown, axis from 0% to 100%]

Render pixels’ colors

10 of 48

Contribution 2: Identify Two Key Bottlenecks

Dominant step: Query the features of points along the rays

Bottleneck 1 - Locate pre-existing points

Bottleneck 1

11 of 48

Contribution 2: Identify Two Key Bottlenecks

Dominant step: Query the features of points along the rays

Bottleneck 1 - Locate pre-existing points
Bottleneck 2 - Compute points’ embeddings

Bottleneck 1

Bottleneck 2

12 of 48

Zoom-in Bottleneck 1 (Locate Pre-Existing Points)

Existing works [Chen et al., ECCV’22]:

Query points’ existence based on a 3D binary occupancy grid

1) If zero: skip the following steps

2) If non-zero: continue the following steps

13 of 48

Zoom-in Bottleneck 1 (Locate Pre-Existing Points)

Existing works [Chen et al., ECCV’22]:

Query points’ existence based on a 3D binary occupancy grid

1) If zero: skip the following steps

2) If non-zero: continue the following steps

We identify the corresponding cause: irregular accesses to the occupancy grid, because rays can come from any direction
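The ray-by-ray occupancy query described above can be sketched as follows. This is a simplified illustration under stated assumptions: the grid is a nested list of 0/1 flags with unit-sized cells, sampling uses a fixed step, and the name `occupied_samples` and its parameters are hypothetical.

```python
import math

def occupied_samples(grid, origin, direction, step, num_steps):
    """Return the sample points along one ray that fall in non-zero cells.

    grid: nested list grid[x][y][z] of 0/1 flags, one per unit-sized cell
    """
    size = (len(grid), len(grid[0]), len(grid[0][0]))
    kept = []
    for i in range(num_steps):
        t = i * step
        point = tuple(o + t * d for o, d in zip(origin, direction))
        cell = tuple(int(math.floor(p)) for p in point)
        inside = all(0 <= c < n for c, n in zip(cell, size))
        # This lookup is the irregular access: which cell is read depends
        # on the ray direction, not on how the grid is laid out in memory.
        if inside and grid[cell[0]][cell[1]][cell[2]] != 0:
            kept.append(point)   # non-zero: continue the following steps
        # zero, or outside the grid: skip the following steps
    return kept
```

Because consecutive samples of a ray can land in cells that are far apart in memory, the access pattern changes with every view.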

14 of 48

Zoom-in Bottleneck 2 (Compute Points’ Embeddings)

Existing works [Chen et al., ECCV’22]:

Fetch the embeddings from a 3D decomposed grid

[Figure: the grid is decomposed into matrix-vector pairs. Labels: scalar multiplication (×), +]

15 of 48

Zoom-in Bottleneck 2 (Compute Points’ Embeddings)

Existing works [Chen et al., ECCV’22]:

Fetch the embeddings from a 3D decomposed grid

[Figure: the grid is decomposed into matrix-vector pairs. Labels: scalar multiplication (×), +]

We identify the corresponding cause: the sparse decomposed embedding grid is treated as a dense one, i.e., the sparsity is not leveraged.
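To make the dense treatment concrete, the sketch below evaluates one grid value from a vector-matrix decomposition in the style of [Chen et al., ECCV’22]. It is a simplified illustration: it uses integer indices without interpolation, returns a single channel, and the name `fetch_embedding` and the data layout are assumptions made for the example.

```python
def fetch_embedding(vectors, matrices, x, y, z):
    """Evaluate one grid value from matrix-vector pairs.

    vectors[a][r]  : 1D factor along axis a (0 = X, 1 = Y, 2 = Z), component r
    matrices[a][r] : 2D factor over the two axes other than a
    """
    coords = (x, y, z)
    value = 0.0
    for axis in range(3):
        i = coords[axis]
        j, k = [coords[b] for b in range(3) if b != axis]
        for vec, mat in zip(vectors[axis], matrices[axis]):
            # Dense treatment: the multiplication is performed even when
            # one of the two factors is zero.
            value += vec[i] * mat[j][k]   # scalar multiplication, then sum
    return value
```

Every zero factor still costs one fetch and one multiplication here, which is the waste the slide points to.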

16 of 48

Overview of the Proposed RT-NeRF

To Alleviate Bottleneck 1 (Locate the Pre-Existing Points)

Propose a New Efficient Rendering Pipeline

Only query points in the cube

17 of 48

Overview of the Proposed RT-NeRF

To Alleviate Bottleneck 1 (Locate the Pre-Existing Points)

Propose a New Efficient Rendering Pipeline

Only query points in the cube

To Alleviate Bottleneck 2 (Compute Points’ Embeddings)

Propose a Hybrid Sparse Encoding Scheme & Bi-Direction Trees

Bitmap-based (denser)

Coordinate-based (sparser)

18 of 48

RT-NeRF’s Algorithm Contribution

To Alleviate Bottleneck 1 (Locate the Pre-Existing Points)

Propose a New Efficient Rendering Pipeline

Only query points in the cube

To Alleviate Bottleneck 2 (Compute Points’ Embeddings)

Propose a Hybrid Sparse Encoding Scheme & Bi-Direction Trees

Bitmap-based (denser)

Coordinate-based (sparser)
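The idea of switching between a bitmap-based and a coordinate-based format can be sketched as below. The slides do not state how the format is chosen, so the storage-cost rule here is an assumption, as are the names `encode_hybrid` and `decode_hybrid` and the 16-bit value width.

```python
def encode_hybrid(values, value_bits=16):
    """Pick the cheaper of two sparse formats for one vector (assumed rule)."""
    n = len(values)
    nonzeros = [(i, v) for i, v in enumerate(values) if v != 0]
    index_bits = max(1, (n - 1).bit_length())   # bits to address one position
    # Bitmap: one flag bit per position, plus the non-zero values.
    bitmap_cost = n + len(nonzeros) * value_bits
    # Coordinate: one explicit index per non-zero, plus its value.
    coord_cost = len(nonzeros) * (index_bits + value_bits)
    kept = [v for _, v in nonzeros]
    if bitmap_cost <= coord_cost:               # denser data
        return ("bitmap", [1 if v != 0 else 0 for v in values], kept)
    return ("coordinate", [i for i, _ in nonzeros], kept)   # sparser data

def decode_hybrid(encoded, length):
    """Rebuild the dense vector from either format."""
    kind, meta, kept = encoded
    out = [0] * length
    if kind == "bitmap":
        positions = [i for i, bit in enumerate(meta) if bit]
    else:
        positions = meta
    for i, v in zip(positions, kept):
        out[i] = v
    return out
```

With few non-zeros, listing their indices is cheaper than storing a flag for every position; as density grows, the per-position bitmap wins.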

19 of 48

Contribution 3: Efficient Rendering Pipeline

Loop over the non-zeros of the occupancy grid instead of rays to be rendered, utilizing the corresponding 3D geometry

Only query points in the cube
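A rough sketch of the reversed loop order: instead of walking each ray and probing the grid, walk the stored non-zero cubes in order and use geometry to find the pixel each one affects. This is a strong simplification and not the authors' method: it assumes a pinhole camera at the origin looking down +z, cube coordinates already in camera space, and it maps each cube to a single pixel through its center, whereas a real cube can cover several pixels. The name `cubes_to_pixels` and its parameters are hypothetical.

```python
import math

def cubes_to_pixels(nonzero_cubes, focal, width, height):
    """Walk the stored non-zero cubes in order and bin them by pixel."""
    hits = {}
    cx, cy = width / 2.0, height / 2.0
    for (x, y, z) in nonzero_cubes:            # sequential, regular access
        X, Y, Z = x + 0.5, y + 0.5, z + 0.5    # cube center (assumed camera space)
        if Z <= 0:
            continue                           # behind the camera
        u = int(math.floor(focal * X / Z + cx))   # pinhole projection
        v = int(math.floor(focal * Y / Z + cy))
        if 0 <= u < width and 0 <= v < height:
            # Only points inside this cube would be queried for pixel (u, v).
            hits.setdefault((u, v), []).append((x, y, z))
    return hits
```

The list of non-zero cubes is read front to back regardless of the view, so memory accesses stay regular; only the projection changes with the camera.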

20 of 48

Our Proposed Efficient Rendering Pipeline: Motivation

Existing rendering pipeline [Chen et al., ECCV’22]: irregular accesses

Our proposed efficient rendering pipeline: regular accesses to the stored non-zero cubes’ location

21 of 48