1 of 36

Deep scene-scale material estimation from multi-view indoor captures

Computers & Graphics

Siddhant Prakash Gilles Rainer

Adrien Bousseau George Drettakis

2 of 36

Rendering Equation [Kajiya 1986]

3 of 36

Bi-directional Reflectance Distribution Function (BRDF)

Normal

Diffuse Albedo

Specular Albedo

Roughness

Emitters

Rendering Equation [Kajiya 1986]
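
For reference, the rendering equation in its common hemispherical form is written out below. The symbol names are the usual textbook ones and are an assumption here, since the slide shows the equation only as an image. Under that notation, f_r is the BRDF (parameterized by diffuse albedo, specular albedo, and roughness), n is the surface normal, and L_e is the emitted radiance of emitters.

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, \mathrm{d}\omega_i
```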

4 of 36

Material Estimation

Estimate approximate material maps

Diffuse Albedo

Roughness

Specular Albedo

Input Image

Our method

5 of 36

Material Estimation

Estimate approximate material maps, given multi-view image stack and geometry

Diffuse Albedo

Roughness

Specular Albedo

Diffuse Albedo

Roughness

Specular Albedo

Input Image

Our method

6 of 36

Estimate approximate material maps, given multi-view image stack and geometry

To re-render the scene with edited lighting, objects, or materials

Diffuse Albedo

Input Image

Material Estimation

Roughness

Specular Albedo

Edited Image

7 of 36

Key Ideas

  • Multi-view input
    • Visual cues aid material prediction network
  • Texture-space gathering
    • Unique locally consistent material per surface point
  • Supervised learning with photorealistic dataset
    • Direct supervision with ground truth maps
    • Local lighting model in rendering loss
  • Enhance standard photogrammetry workflow
    • Achieve scene re-rendering with modified lights, objects or materials

8 of 36

Overview

Input

Image-space Prediction

Texture-space Gathering

9 of 36

Related Works

  • Reflectance capture
    • Specialized setup

  • Flash-photography with planar patch
    • For texture-like materials

[Henzler et al. 2021]

Input

Normals

Diffuse

Roughness

Specular

Relighting

[Ghosh et al. 2010]

[Aittala et al. 2013]

10 of 36

Related Works

  • Optimization-based material estimation
    • Object scale
    • Scene scale

[Luan et al. 2021]

Relighting

Input

Initial Mesh

Resynthesis

[Nimier-David et al. 2021]

11 of 36

Related Works

  • Inverse Rendering
  • Photogrammetry

Input Image

Albedo

Normal

Depth

Roughness

[Li et al. 2020]

12 of 36

Overview

Input

Image-space Prediction

Texture-space Gathering

13 of 36

Input

Images

Re-topologized Mesh

14 of 36

Overview

Image-space Prediction

Texture-space Gathering

15 of 36

Multi-view Colour Statistics

  • For each surface point p, select top-12 normal-aligned views
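
A minimal sketch of the view selection step, assuming that "normal-aligned" means the direction from the surface point to the camera has a high cosine with the surface normal. The function name, the array layout, and the omission of visibility and occlusion tests are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def select_normal_aligned_views(point, normal, camera_centers, k=12):
    """Return indices of the k cameras best aligned with the normal at `point`.

    point:          (3,) surface point p
    normal:         (3,) surface normal at p (need not be unit length)
    camera_centers: (V, 3) camera positions (hypothetical input layout)
    """
    # Unit directions from the surface point towards each camera.
    to_cam = camera_centers - point
    to_cam = to_cam / np.linalg.norm(to_cam, axis=1, keepdims=True)
    n = normal / np.linalg.norm(normal)
    # Cosine between the normal and each viewing direction.
    alignment = to_cam @ n
    # Sort by decreasing alignment and keep the first k views.
    order = np.argsort(-alignment)
    return order[:k]
```

A real pipeline would additionally discard cameras from which p is occluded or outside the image frame before ranking.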

16 of 36

Multi-view Colour Statistics

  • For each pixel, compute colour statistics (median, maximum, and minimum) of the corresponding surface point across all views
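
The statistics can be sketched as follows, assuming the colours of each surface point have already been reprojected into a common image grid, one layer per view, with NaN marking views in which the point is not visible. That layout and the function name are assumptions for illustration.

```python
import numpy as np

def colour_statistics(samples):
    """Per-pixel median, maximum, and minimum colour across views.

    samples: (V, H, W, 3) array; samples[v, y, x] is the colour that view v
             observes for the surface point seen at pixel (y, x), or NaN
             if the point is not visible in that view (assumed convention).
    """
    median = np.nanmedian(samples, axis=0)   # (H, W, 3)
    maximum = np.nanmax(samples, axis=0)     # (H, W, 3)
    minimum = np.nanmin(samples, axis=0)     # (H, W, 3)
    return median, maximum, minimum
```

Intuitively, the maximum tends to capture view-dependent highlights and the minimum tends to exclude them, which is one plausible reason such statistics are useful cues for a material prediction network.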

Image

Median

Maximum

Minimum

17 of 36

Material Prediction Network

18 of 36

Dataset

Environments: Office, Bathroom, Living room, Kitchen

  • 4 environments
  • 4 x 2 configs (Config 1, Config 2) = 8
  • 8 x 10 materials = 80
  • 80 x 2 lights (Indoor Day, Indoor EnvMap) = 160
  • 160 x 40 viewpoints = 6400

6400 renders from 160 unique indoor scenes
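
The counts can be checked by enumerating the combinations. The labels come from the slide; treating "Materials x10" as ten material assignments per configuration is an assumption.

```python
from itertools import product

environments = ["Office", "Bathroom", "Living room", "Kitchen"]
configs = ["Config 1", "Config 2"]
material_sets = range(10)          # assumed: 10 material assignments per config
lightings = ["Indoor Day", "Indoor EnvMap"]
viewpoints = range(40)

# One scene = environment x config x material assignment x lighting.
scenes = list(product(environments, configs, material_sets, lightings))
# One render = scene x viewpoint.
renders = [(scene, view) for scene in scenes for view in viewpoints]

print(len(scenes), len(renders))   # 160 6400
```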

19 of 36

Dataset

20 of 36

Overview

Texture-space Gathering

21 of 36

Material Texture Atlas Generation

  • Median filter on predicted maps
  • Gather the estimates in texture space
  • Unique material parameter per surface point
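
One way to sketch the gathering step is shown below. It assumes that the median is taken per texel over all image-space predictions that project onto that texel, and that a precomputed map gives the atlas texel index for every pixel of every view. Both the interpretation and the names `pred_maps` and `texel_ids` are assumptions, not details confirmed by the slides.

```python
import numpy as np

def gather_to_atlas(pred_maps, texel_ids, num_texels):
    """Merge per-view predictions into one value per atlas texel.

    pred_maps:  (V, H, W, C) image-space predictions (e.g. albedo, roughness)
    texel_ids:  (V, H, W) int, atlas texel index seen at each pixel,
                -1 where the pixel sees no surface (assumed convention)
    Returns a (num_texels, C) array; unobserved texels are NaN.
    """
    channels = pred_maps.shape[-1]
    values = pred_maps.reshape(-1, channels)
    ids = texel_ids.reshape(-1)
    valid = ids >= 0
    values, ids = values[valid], ids[valid]

    # Sort samples by texel so each texel's samples are contiguous.
    order = np.argsort(ids, kind="stable")
    values, ids = values[order], ids[order]
    unique_ids, starts = np.unique(ids, return_index=True)
    groups = np.split(values, starts[1:])

    atlas = np.full((num_texels, channels), np.nan)
    for texel, group in zip(unique_ids, groups):
        # The median is robust to outlier predictions from individual views.
        atlas[texel] = np.median(group, axis=0)
    return atlas
```

Because every surface point maps to exactly one texel, the result stores a single material parameter per surface point regardless of how many views observed it.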

22 of 36

Overview

23 of 36

Results

24 of 36

Results

25 of 36

Results

26 of 36

Results

27 of 36

Results

Timing breakdown on test scenes in minutes

28 of 36

Comparisons

29 of 36

Comparisons

30 of 36

Comparisons

31 of 36

Comparisons

32 of 36

Limitations & Future Work

  • Re-synthesis
    • Texture resolution

  • Multi-view loss
    • End-to-end training

  • Improved de-lighting
    • Data augmentation

33 of 36

Rendering Equation [Kajiya 1986]

From Wikipedia, the free encyclopedia, "Rendering equation"

Bi-directional Reflectance Distribution Function (BRDF)

34 of 36

BRDF function definition

  • BRDF representation models
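
For reference, the standard radiometric definition of the BRDF is the ratio of differential outgoing radiance to differential incoming irradiance. The notation below is the usual textbook one and is an assumption, since the slide shows the definition only as an image.

```latex
f_r(\omega_i, \omega_o)
  = \frac{\mathrm{d}L_o(\omega_o)}{\mathrm{d}E_i(\omega_i)}
  = \frac{\mathrm{d}L_o(\omega_o)}{L_i(\omega_i)\,\cos\theta_i\,\mathrm{d}\omega_i}
```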

35 of 36

GGX BRDF Model [Walter et al. 2007]

Deschaintre et al. 2018, "Single-Image SVBRDF Capture with a Rendering-Aware Deep Network"

Normal

Roughness

Diffuse Albedo

Specular Albedo

Rendered Input
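
A minimal sketch of evaluating a diffuse-plus-specular BRDF with the GGX microfacet distribution from the four maps listed above. The specific choices here (alpha = roughness squared, the Smith shadowing term from Walter et al. 2007, Schlick's Fresnel approximation with the specular albedo as normal-incidence reflectance) are common conventions assumed for illustration. The exact formulation used in the presented work may differ.

```python
import numpy as np

def _normalize(v):
    return v / np.linalg.norm(v)

def ggx_brdf(n, l, v, diffuse, specular, roughness):
    """Evaluate a Lambertian + GGX specular BRDF for one surface point.

    n, l, v:   (3,) normal, light direction, view direction
    diffuse:   (3,) diffuse albedo
    specular:  (3,) specular albedo (used as F0 in Schlick's approximation)
    roughness: scalar in (0, 1]
    """
    n, l, v = _normalize(n), _normalize(l), _normalize(v)
    h = _normalize(l + v)                      # half vector
    n_l = max(float(n @ l), 1e-6)
    n_v = max(float(n @ v), 1e-6)
    n_h = max(float(n @ h), 0.0)
    v_h = max(float(v @ h), 0.0)

    alpha = roughness ** 2                     # assumed remapping
    a2 = alpha * alpha

    # GGX normal distribution function.
    D = a2 / (np.pi * (n_h ** 2 * (a2 - 1.0) + 1.0) ** 2)

    # Smith shadowing-masking term, one factor per direction.
    def g1(c):
        return 2.0 * c / (c + np.sqrt(a2 + (1.0 - a2) * c * c))
    G = g1(n_l) * g1(n_v)

    # Schlick approximation of the Fresnel term.
    F = specular + (1.0 - specular) * (1.0 - v_h) ** 5

    return diffuse / np.pi + D * G * F / (4.0 * n_l * n_v)
```

Multiplying the returned value by the incoming radiance and the cosine term n·l gives the reflected radiance for a single light direction, which is the integrand of the rendering equation.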

36 of 36

Network Architecture

[Figure: network architecture diagram. Labels appearing in the diagram: UNetD, UNetS, Stats, Median, Max, f, L1, I00, I01, I0N, P0, D0, N0, G0, A0, R0, S0.]