1 of 26

Sensor-independent illumination estimation for DNN models

Mahmoud Afifi¹ and Michael S. Brown¹,²

¹York University  ²Samsung AI Center - Toronto

2 of 26

Onboard camera processing

When we capture a photograph, a number of steps are applied onboard the camera to produce the final sRGB output image.

[Figure: camera imaging pipeline from raw-RGB to sRGB; figure from Karaimer and Brown, ECCV 2016.]

3 of 26

A key routine in the pipeline is white balance

When we capture a photograph, a number of steps are applied onboard the camera to produce the final sRGB output image.

[Figure: camera pipeline from raw-RGB to sRGB.]

4 of 26

White balance is called computational colour constancy in computer vision

White balance is applied to remove the colour cast caused by the scene illumination.
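White balance is typically applied as a diagonal (von Kries-style) correction: each channel of the raw-RGB image is divided by the corresponding component of the estimated illuminant. The numpy sketch below is a minimal illustration, assuming a linear raw-RGB image normalised to [0, 1]; it is not any camera's actual implementation.

```python
import numpy as np

def white_balance(raw_img, illuminant):
    """Diagonal (von Kries-style) white-balance correction.

    raw_img:    HxWx3 linear raw-RGB image, values in [0, 1].
    illuminant: length-3 RGB estimate of the scene illuminant colour.
    """
    gains = np.asarray(illuminant, dtype=np.float64)
    gains = gains / gains[1]               # normalise so the green channel is untouched
    corrected = raw_img / gains            # divide each channel by its illuminant component
    return np.clip(corrected, 0.0, 1.0)    # keep the result in the valid range
```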

5 of 26

Illumination estimation

[Figure: a raw-RGB image and the corresponding "white-balanced" raw-RGB image.]

6 of 26

Illumination estimation

[Figure: raw-RGB image → illumination estimation algorithm → "white-balanced" raw-RGB image.]

7 of 26

White balance is applied to raw-RGB

When we capture a photograph, a number of steps are applied onboard the camera to produce the final sRGB output image.

[Figure: camera pipeline from raw-RGB to sRGB; white balance is applied in the raw-RGB stage.]

8 of 26

Illumination estimation methods

Two main approaches to illumination estimation

Sensor-independent approaches rely on image colour statistics:
  • Grey-world [1980] (a minimal sketch follows this list)
  • White-patch [1986]
  • Shades-of-grey [2004]
  • Grey-edge [2007]
  • PCA [2010]

Sensor-dependent approaches are learning-based, trained on data:
  • Gamut-based [1990]
  • Bayesian methods [2008]
  • Bias-correction [2013]
  • Decision trees [2014]
  • DNNs [2015 - foreseeable future]
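As a concrete example of the statistics-based family above, here is a minimal numpy sketch of Grey-world: it assumes the average scene reflectance is achromatic, so the per-channel means of the image reveal the illuminant colour.

```python
import numpy as np

def grey_world(raw_img):
    """Grey-world illuminant estimate.

    raw_img: HxWx3 linear raw-RGB image.
    Returns a unit-norm RGB vector pointing in the direction of the illuminant.
    """
    ell = raw_img.reshape(-1, 3).mean(axis=0)   # per-channel means
    return ell / np.linalg.norm(ell)            # only the direction (chromaticity) matters
```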

9 of 26

DNN-based approaches

DNN models are trained per sensor.

[Figure: a raw-RGB image from sensor X is white balanced by a DNN model trained for sensor X; a raw-RGB image from sensor Y requires a separate DNN model trained for sensor Y.]

 

10 of 26

DNN-based approaches

Different sensors have different spectral sensitivities, so the same scene yields different raw-RGB values on each camera; this is why DNN models are trained per sensor.

[Figure: R, G, B spectral sensitivity curves (sensitivity vs. wavelength) for two sensors, alongside corresponding Canon 600D and Nikon D5200 raw-RGB images; images from the NUS-8 illumination estimation dataset.]

11 of 26

Our proposed method

Our sensor-independent framework handles raw-RGB images from any sensor (sensor X, sensor Y, ...) with a single model. It consists of (1) a sensor-mapping network and (2) an illumination estimation network.

[Figure: raw-RGB images from sensor X and sensor Y both pass through the single sensor-independent framework to produce white-balanced images.]

12 of 26

Overall framework and network architecture

 

 

[Figure: overall framework. An RGB-uv histogram of the input raw-RGB image feeds the sensor mapping network (conv/ReLU 128 5×5, stride 2 → conv/ReLU 256 3×3, stride 2 → conv/ReLU 512 2×2, stride 1 → fc 9), which outputs the image mapping matrix M used to map the input image from its original raw space to the learned working space. An RGB-uv histogram of the mapped image feeds the illuminant estimation network (same conv stack → fc 3). The estimate is then mapped back to the original raw space with M⁻¹ to give the final estimated illuminant and the white-balanced image.]
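The following PyTorch sketch approximates the diagram above. The layer sizes follow the slide; everything else (the histogram size, recomputing the mapped image's histogram inside the forward pass through a `hist_fn` callable, and using `LazyLinear` to avoid hard-coding the flattened feature size) is an assumption rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn

def hist_trunk(out_features):
    """Conv/ReLU stack from the slide: 128 5x5 (stride 2) -> 256 3x3 (stride 2)
    -> 512 2x2 (stride 1), followed by a fully connected layer."""
    return nn.Sequential(
        nn.Conv2d(3, 128, kernel_size=5, stride=2), nn.ReLU(),
        nn.Conv2d(128, 256, kernel_size=3, stride=2), nn.ReLU(),
        nn.Conv2d(256, 512, kernel_size=2, stride=1), nn.ReLU(),
        nn.Flatten(),
        nn.LazyLinear(out_features),  # fc 9 for the mapping matrix, fc 3 for the illuminant
    )

class SensorIndependentFramework(nn.Module):
    """Sensor mapping network + illuminant estimation network."""

    def __init__(self):
        super().__init__()
        self.mapping_net = hist_trunk(out_features=9)  # predicts the 3x3 matrix M
        self.illum_net = hist_trunk(out_features=3)    # predicts the working-space illuminant

    def forward(self, hist_raw, pixels, hist_fn):
        """hist_raw: Nx3xHxW RGB-uv histograms of the input raw-RGB images.
        pixels:     Nx3xP flattened raw-RGB pixel values.
        hist_fn:    callable that builds Nx3xHxW RGB-uv histograms from Nx3xP pixels.
        """
        M = self.mapping_net(hist_raw).view(-1, 3, 3)        # image-specific mapping matrix
        mapped = torch.bmm(M, pixels)                        # raw space -> learned working space
        ell_work = self.illum_net(hist_fn(mapped))           # illuminant in the working space
        ell = torch.bmm(torch.inverse(M),                    # map the estimate back with M^-1
                        ell_work.unsqueeze(-1)).squeeze(-1)
        return ell / ell.norm(dim=1, keepdim=True)           # unit-norm illuminant per image
```

The white-balanced image then follows by applying the diagonal correction from the earlier sketch with this estimate.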

 

 

 

 

13 of 26

Starting point: RGB–uv histogram block

We build on the RGB-uv histogram feature from [1,2] by adding two learnable parameters to control the contribution of each color channel in the generated histogram and the smoothness of histogram bins.

[1] J. Barron, Convolutional Color Constancy, ICCV, 2015.

[2] M. Afifi et al., When Color Constancy Goes Wrong, CVPR, 2019.

[Figure: an input raw-RGB image and its generated RGB-uv histogram.]
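For reference, here is a simplified numpy sketch of the RGB-uv histogram feature. It uses hard binning and a fixed `channel_scale` argument as stand-ins for the learnable bin-smoothness and per-channel contribution parameters described above, and the default bin count is an arbitrary choice here, so read it as an approximation of the block rather than the exact feature.

```python
import numpy as np

def rgb_uv_hist(raw_img, nbins=61, bound=3.0, channel_scale=(1.0, 1.0, 1.0)):
    """Simplified RGB-uv histogram (after [1, 2] above).

    Each colour channel gets its own log-chrominance (u, v) plane; every pixel
    votes in each plane with a weight proportional to its intensity.
    Returns an nbins x nbins x 3 array, L2-normalised.
    """
    eps = 1e-6
    rgb = raw_img.reshape(-1, 3).astype(np.float64) + eps
    intensity = np.sqrt((rgb ** 2).sum(axis=1))          # per-pixel brightness weight
    edges = np.linspace(-bound, bound, nbins + 1)        # shared u/v bin edges

    hist = np.zeros((nbins, nbins, 3))
    for c in range(3):
        c1, c2 = (c + 1) % 3, (c + 2) % 3
        u = np.log(rgb[:, c] / rgb[:, c1])               # log-chrominance coordinates
        v = np.log(rgb[:, c] / rgb[:, c2])
        layer, _, _ = np.histogram2d(u, v, bins=[edges, edges], weights=intensity)
        hist[:, :, c] = channel_scale[c] * layer         # stand-in for the learnable scale
    return hist / (np.linalg.norm(hist) + eps)           # normalise the whole feature
```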

14 of 26

Sensor space mapping network

[Figure: the input image's RGB-uv histogram feeds conv/ReLU 128 5×5 (stride 2) → conv/ReLU 256 3×3 (stride 2) → conv/ReLU 512 2×2 (stride 1) → fc 9, i.e., the nine entries of the 3×3 mapping matrix M.]
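A short numpy sketch of how the nine fc outputs could be turned into the per-pixel colour mapping shown above; any normalisation of M used in the actual implementation is omitted.

```python
import numpy as np

def map_to_working_space(raw_img, m_vec):
    """Reshape the fc-9 output into a 3x3 matrix and apply it to every pixel.

    raw_img: HxWx3 raw-RGB image in the original sensor space.
    m_vec:   the nine values predicted by the sensor mapping network.
    """
    M = np.asarray(m_vec, dtype=np.float64).reshape(3, 3)
    h, w, _ = raw_img.shape
    mapped = raw_img.reshape(-1, 3) @ M.T     # per-pixel linear mapping to the working space
    return mapped.reshape(h, w, 3), M
```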

 

 

15 of 26

Illumination estimation network

[Figure: the mapped raw-RGB image's RGB-uv histogram feeds conv/ReLU 128 5×5 (stride 2) → conv/ReLU 256 3×3 (stride 2) → conv/ReLU 512 2×2 (stride 1) → fc 3, i.e., the estimated illuminant in the working space.]

16 of 26

Training and loss function

  • Training minimises the angular error between the estimated illuminant and the ground-truth (GT) illuminant.

[Figure: estimated vs. ground-truth (GT) illuminant R, G, B values.]
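A minimal Python version of the angular error used as the loss (and as the evaluation metric on the results slides); the training code itself would use a differentiable tensor implementation, so this is only the formula.

```python
import numpy as np

def angular_error_deg(estimated, ground_truth):
    """Angle (in degrees) between the estimated and ground-truth illuminant RGB vectors."""
    est = np.asarray(estimated, dtype=np.float64)
    gt = np.asarray(ground_truth, dtype=np.float64)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards against round-off
```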

17 of 26

Training and loss function

 

 

 

[Figure: training diagram. The sensor mapping network (RGB-uv histogram of the input image → conv/ReLU stack → fc 9) maps the input image from its original raw space to the learned working space; the illuminant estimation network (RGB-uv histogram of the mapped image → conv/ReLU stack → fc 3) estimates the illuminant there; the estimate is mapped back to the original raw space with M⁻¹ and compared against the ground truth. Angular-error loss = arccos( (ℓ̂ · ℓ) / (‖ℓ̂‖ ‖ℓ‖) ), where ℓ̂ is the estimated illuminant and ℓ is the ground truth.]

 

 

18 of 26

Training and loss function

 

 

 

[Figure: the same training diagram as the previous slide; the angular-error loss is computed between the final estimated illuminant (mapped back to the original raw space with M⁻¹) and the ground-truth illuminant.]

 

 

19 of 26

Test-time example #1

[Figure: a testing raw-RGB image (Fujifilm XM1) → RGB-uv histogram block → sensor mapping net → mapping matrix M → image mapped to the "working space" → RGB-uv histogram block → illuminant estimation net → estimated illuminant in the working space → projected back to the original raw-RGB space → white-balanced raw-RGB image.]

20 of 26

Test-time example #2

[Figure: the same test-time flow for a Canon 1Ds Mk III raw-RGB image: RGB-uv histogram block → sensor mapping network → mapping matrix M → "working space" → RGB-uv histogram block → illuminant estimation net → estimated illuminant in the working space → projected back to the original raw-RGB space → white-balanced raw-RGB image.]

 

21 of 26

Experimental results

  • All camera models in:
    • NUS 8-Camera dataset (8 camera models)
    • Gehler-Shi dataset (2 camera models)
    • Cube/Cube+ dataset (1 camera model)

[Figure: example images from the NUS-8, Gehler-Shi, and Cube/Cube+ datasets.]

22 of 26

Results

Angular errors (degrees; lower is better):

Methods                            NUS 8-Camera                       Gehler-Shi
                                   Mean   Median  Best 25%  Worst 25%  Mean   Median  Best 25%  Worst 25%
Avg. sensor-independent methods    4.26   3.25    0.99      9.43       5.10   4.03    1.91      10.77
Avg. sensor-dependent methods      2.40   1.64    0.50      5.75       2.62   1.75    0.50      5.95
Ours                               2.05   1.50    0.52      4.48       2.77   1.93    0.55      6.53

Sensor-independent methods:
- J. Van De Weijer et al., Edge-based color constancy, TIP, 2007
- S. Bianco and C. Cusano, Quasi-unsupervised color constancy, CVPR, 2019
- Y. Qian et al., On finding gray pixels, CVPR, 2019

Sensor-dependent methods:
- W. Shi et al., Deep specialized network for illuminant estimation, ECCV, 2016
- J. T. Barron and Y.-T. Tsai, Fast Fourier color constancy, CVPR, 2017
- Y. Hu et al., FC4: Fully convolutional color constancy with confidence-weighted pooling, CVPR, 2017

23 of 26

Results

Angular errors (degrees; lower is better):

Methods                            Cube                               Cube+
                                   Mean   Median  Best 25%  Worst 25%  Mean   Median  Best 25%  Worst 25%
Avg. sensor-independent methods    3.57   2.47    0.64      8.30       4.98   3.32    0.82      11.77
Avg. sensor-dependent methods      1.54   0.92    0.26      3.85       2.04   1.02    0.25      5.58
Ours                               1.98   1.36    0.40      4.64       2.14   1.44    0.44      5.06

Sensor-independent methods:
- J. Van De Weijer et al., Edge-based color constancy, TIP, 2007
- S. Bianco and C. Cusano, Quasi-unsupervised color constancy, CVPR, 2019
- Y. Qian et al., On finding gray pixels, CVPR, 2019

Sensor-dependent methods:
- W. Shi et al., Deep specialized network for illuminant estimation, ECCV, 2016
- J. T. Barron and Y.-T. Tsai, Fast Fourier color constancy, CVPR, 2017
- Y. Hu et al., FC4: Fully convolutional color constancy with confidence-weighted pooling, CVPR, 2017

24 of 26

Results

[Figure: qualitative results for Samsung NX, Canon 600D, and Canon 5D images (angular errors 1.29°, 0.43°, and 0.98°), showing the input raw-RGB images, mapped images, corrected images, and ground truth; images from the NUS 8-Camera and Gehler-Shi datasets.]

25 of 26

Summary

  • Presented a deep-learning framework for sensor-independent illumination estimation.

  • Our method learns an image-specific mapping that transforms raw-RGB images from different cameras into a canonical "working space".

  • Our framework is on par with sensor-dependent DNNs but requires only a single DNN.

26 of 26

Thank you

 

 


Questions?