CPSC 4070/6070:
Applied Computer Vision
Spring 2024, Lecture 3.2: Image Filtering
Siyu Huang
School of Computing
Image Credit: Pixar Animation Studios
*Acknowledgement: Many of the following slides are from Ioannis Gkioulekas and James Tompkin, who also adapted some of them from other great people.
This lecture
Image Credit: Pixar Animation Studios
Filtering (definition)
An operation that modifies a (measured) signal.
1D Filtering: Moving Average
Window size ‘k’
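A minimal sketch of the moving-average idea in NumPy (the function name and the toy signal are illustrative, not from the slides):

import numpy as np

def moving_average(signal, k):
    # Smooth a 1D signal: each output sample is the mean of a window of k samples.
    kernel = np.ones(k) / k              # box kernel: every sample in the window weighted equally
    return np.convolve(signal, kernel, mode='same')

# Toy example: a noisy step signal
x = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * np.random.randn(100)
smoothed = moving_average(x, k=5)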
2D Filtering
Compute a function of the local neighborhood at each position.
Window sizes ‘k’ and ‘l’
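A sketch of this idea written as a naive 2D correlation loop in NumPy (the function name, zero padding, and the random test image are assumptions for illustration):

import numpy as np

def filter2d(image, kernel):
    # Naive 2D filtering: at every pixel, take a weighted sum of the
    # k-by-l neighborhood (zero padding at the borders).
    k, l = kernel.shape
    pad_y, pad_x = k // 2, l // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode='constant')
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + l] * kernel)
    return out

image = np.random.rand(64, 64)                    # stand-in for a grayscale image
blurred = filter2d(image, np.ones((3, 3)) / 9)    # 3x3 averaging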
Smoothing with Box Filter
David Lowe
Box filter weights (3×3):
1 1 1
1 1 1
1 1 1
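A quick sketch of box-filter smoothing with SciPy (the random stand-in image and the 1/9 normalization, so the weights sum to 1, are assumptions not shown on the slide):

import numpy as np
from scipy.signal import correlate2d

image = np.random.rand(128, 128)        # stand-in for a grayscale image
box = np.ones((3, 3)) / 9.0             # box filter, normalized so the weights sum to 1
smoothed = correlate2d(image, box, mode='same', boundary='symm')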
Gaussian Filter
[Figure: 2D Gaussian kernel plotted as a surface over x and y, and viewed from the top.]
Gaussian filter
Parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.
[Figure: the same image smoothed with σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, and σ = 30 pixels.]
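A sketch of varying σ with SciPy's Gaussian filter (the stand-in image is an assumption):

import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(256, 256)                 # stand-in for a grayscale image
blur_small = gaussian_filter(image, sigma=1)     # light smoothing
blur_medium = gaussian_filter(image, sigma=5)
blur_large = gaussian_filter(image, sigma=30)    # heavy smoothing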
Box Filter vs. Gaussian Filter
Box filter weights (3×3):
1 1 1
1 1 1
1 1 1
[Figure: image smoothed with the box filter vs. with a Gaussian filter.]
2. Practice with linear filters
Filter:
0 0 0
0 0 1
0 0 0
Result: the original image shifted left by 1 pixel.
3. Practice with linear filters
Filter (Sobel):
-1 0 1
-2 0 2
-1 0 1
Result: vertical edges (absolute value shown).
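A sketch of applying the Sobel kernel above with SciPy (the stand-in image is an assumption):

import numpy as np
from scipy.signal import correlate2d

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

image = np.random.rand(128, 128)        # stand-in for a grayscale image
response = correlate2d(image, sobel_x, mode='same', boundary='symm')
edges = np.abs(response)                # absolute value, as on the slide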
4. Practice with linear filters
Original image, filtered with:
  0 0 0           1 1 1
  0 2 0  -  1/9 · 1 1 1
  0 0 0           1 1 1
Sharpening filter: amplifies differences with the local average.
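A sketch of this sharpening filter in SciPy (the 1/9 normalization of the box and the stand-in image are assumptions):

import numpy as np
from scipy.signal import correlate2d

impulse = np.zeros((3, 3))
impulse[1, 1] = 2.0                      # 2x the center pixel
box = np.ones((3, 3)) / 9.0              # local average
sharpen = impulse - box                  # amplifies differences with the local average

image = np.random.rand(128, 128)         # stand-in for a grayscale image
sharpened = correlate2d(image, sharpen, mode='same', boundary='symm')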
Correlation and Convolution
2D correlation
2D convolution
Convolution is the same as correlation with a 180° rotated filter kernel.
Correlation and convolution are identical when the filter kernel is rotationally symmetric*.
e.g., h = scipy.signal.correlate2d(I, f)
e.g., h = scipy.signal.convolve2d(I, f)
* Symmetric in the geometric sense, not in the matrix linear algebra sense.
Definition
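The slide's equations did not survive extraction; the standard definitions they refer to are, in LaTeX (h is the output, I the image, f the filter):

\begin{align}
  \text{2D correlation:} \quad & h[m,n] = \sum_{k,l} f[k,l]\; I[m+k,\, n+l] \\
  \text{2D convolution:} \quad & h[m,n] = \sum_{k,l} f[k,l]\; I[m-k,\, n-l]
\end{align}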
Convolution Properties
Commutative: a * b = b * a
Associative: a * (b * c) = (a * b) * c
Distributes over addition: a * (b + c) = (a * b) + (a * c)
Scalars factor out: ka * b = a * kb = k (a * b)
Identity: a * e = a when e = […, 0, 0, 1, 0, 0, …], the unit impulse.
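These properties can be checked numerically; a small NumPy sketch (the random test signals are arbitrary):

import numpy as np

a = np.random.rand(5)
b = np.random.rand(7)
c = np.random.rand(3)

# Commutative: a * b = b * a
assert np.allclose(np.convolve(a, b), np.convolve(b, a))

# Associative: a * (b * c) = (a * b) * c
assert np.allclose(np.convolve(a, np.convolve(b, c)),
                   np.convolve(np.convolve(a, b), c))

# Distributes over addition (b and d have the same length so b + d is defined)
d = np.random.rand(7)
assert np.allclose(np.convolve(a, b + d),
                   np.convolve(a, b) + np.convolve(a, d))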
Separability of 2D Convolution
Kristen Grauman
The filter factors into a product of 1D filters.
Perform convolution along the rows, followed by convolution along the remaining column.
[Figure: 2D convolution at the center location computed as a row convolution followed by a column convolution.]
Separability of 2D Convolution
M×N image, P×Q filter.
Direct 2D convolution costs M·N·P·Q multiply-adds; two 1D passes cost M·N·(P+Q).
Speed-up = PQ / (P+Q)
9×9 filter: 81 / 18 ≈ 4.5× faster
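A sketch comparing full 2D convolution with the separable row/column version, using a separable Gaussian kernel (the kernel construction and test image are illustrative choices):

import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(512, 512)                  # stand-in for a grayscale image

# A 9-tap 1D Gaussian; its outer product gives the full 9x9 kernel.
g1d = np.exp(-0.5 * (np.arange(-4, 5) / 2.0) ** 2)
g1d /= g1d.sum()
g2d = np.outer(g1d, g1d)

# Full 2D convolution: roughly M*N*P*Q multiply-adds.
full = convolve2d(image, g2d, mode='same', boundary='symm')

# Separable version: rows first, then columns, roughly M*N*(P+Q) multiply-adds.
rows = convolve2d(image, g1d[np.newaxis, :], mode='same', boundary='symm')
separable = convolve2d(rows, g1d[:, np.newaxis], mode='same', boundary='symm')

assert np.allclose(full, separable)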
This lecture
Pyramids
Tilt-shift photography
Tilt-shift camera
[Figure: the sensor shifts and tilts relative to the lens.]
Can we fake tilt shift? We need to blur the image.
Questions (Bonus Point)
1) _ = D * B
2) A = _ * _
3) _ = D * D
[Figure: grid of images labeled A, B, C, D, G, H, I.]
(* = convolution operator)
A = B * C
When the filter ‘looks like’ the image = ‘template matching’.
Filtering can be viewed as comparing an image of what you want to find against all image regions.
For symmetric filters: use either convolution or correlation.
For nonsymmetric filters: correlation is template matching. Why?
[Figure: images A, B, and C; C is a Gaussian kernel and we know that it ‘blurs’.]
Robert Collins
Let's see if we can use correlation to ‘find’ the parts of the image that look like the kernel.
D (275 x 175 pixels)

>> f = D[ 57:117, 107:167 ]      # crop a 60 x 60 patch f from D to use as the filter
>> I = correlate2d( D, f, 'same' )

Expect a response ‘peak’ in the middle of I, at the correct location (+).
[Figure: response image I; the response peak is not at the correct location (+).]
Hmm... that didn't work. Why not?
Correlation
e.g., h = scipy.signal.correlate2d(I, f)
As brightness in I increases, the response in h will increase, as long as f is positive.
Overall brighter regions will give a higher correlation response -> not useful!
OK, so let's subtract the mean

>> f = D[ 57:117, 107:167 ]
>> f2 = f - np.mean(f)
>> D2 = D - np.mean(D)

Now zero centered. The score is higher only when dark parts match and when light parts match.

>> I2 = correlate2d( D2, f2, 'same' )

[Figure: zero-centered template f2 (60 x 60), zero-centered image D2 (275 x 175 pixels), and the response I2.]
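Putting the slide's snippets together as one runnable sketch (the random stand-in for the photo D and the peak-finding line at the end are assumptions; the crop coordinates are the slide's):

import numpy as np
from scipy.signal import correlate2d

D = np.random.rand(275, 175)            # stand-in for the slide's 275 x 175 photo

f = D[57:117, 107:167]                  # crop a 60 x 60 template from the image

f2 = f - np.mean(f)                     # zero-center the template...
D2 = D - np.mean(D)                     # ...and the image

I2 = correlate2d(D2, f2, mode='same')   # response is high where dark and light parts both match

peak = np.unravel_index(np.argmax(I2), I2.shape)   # location of the best match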
What happens with convolution?

>> f = D[ 57:117, 107:167 ]
>> f2 = f - np.mean(f)
>> D2 = D - np.mean(D)
>> I2 = convolve2d( D2, f2, 'same' )

[Figure: f2 (60 x 60), D2 (275 x 175 pixels), and the response I2.]
Non-Linear Filter: Median Filter
Replace each pixel with the median of the pixel values in its neighborhood.

Denoising with a median filter
[Figure: image with salt-and-pepper noise vs. the median-filtered result, with plots of one column of each image.]
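A sketch of salt-and-pepper denoising with a median filter in SciPy (the stand-in image and noise level are assumptions):

import numpy as np
from scipy.ndimage import median_filter

image = np.random.rand(256, 256)        # stand-in for a grayscale image

# Salt-and-pepper noise: set a random 5% of the pixels to 0 or 1.
noisy = image.copy()
mask = np.random.rand(*image.shape) < 0.05
noisy[mask] = np.random.choice([0.0, 1.0], size=mask.sum())

# 3x3 median filter: each pixel becomes the median of its neighborhood,
# so isolated outlier pixels are discarded entirely (a non-linear operation).
denoised = median_filter(noisy, size=3)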
Gaussian image pyramid
Constructing a Gaussian pyramid
Algorithm:
repeat:
    filter
    subsample
until minimum resolution reached

[Figure: the image is repeatedly filtered and subsampled to form the pyramid levels.]
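A sketch of this algorithm in Python (the stopping size, the blur amount, and the stand-in image are assumptions):

import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, min_size=16, sigma=1.0):
    # repeat: filter, then subsample by 2, until the minimum resolution is reached
    levels = [image.astype(float)]
    while min(levels[-1].shape) > min_size:
        blurred = gaussian_filter(levels[-1], sigma)   # filter
        levels.append(blurred[::2, ::2])               # subsample
    return levels

pyramid = gaussian_pyramid(np.random.rand(256, 256))   # stand-in image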
Some properties of the Gaussian pyramid
What happens to the details of the image?
What is preserved at the higher levels?
How would you reconstruct the original image from the image at the upper level?
Blurring is lossy
level 0 - level 1 (before downsampling) = residual
What does the residual look like?
Can we make a pyramid that is lossless?
Laplacian image pyramid
At each level, retain the residuals instead of the blurred images themselves.
Can we reconstruct the original image using the pyramid?
Let’s start by looking at just one level
level 1 (upsampled) + residual = level 0
Does this mean we need to store both the residuals and the blurred copies of the original?
Constructing a Laplacian pyramid

Algorithm:
repeat:
    filter
    subsample
    compute residual
until minimum resolution reached

What is the filter-and-subsample part? It's a Gaussian pyramid.
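A sketch of this construction in Python, assuming the residual at each level is the current image minus an upsampled copy of the next (blurred, subsampled) level; the number of levels, sigma, and bilinear upsampling via scipy.ndimage.zoom are illustrative choices:

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(image, num_levels=4, sigma=1.0):
    pyramid = []
    current = image.astype(float)
    for _ in range(num_levels):
        blurred = gaussian_filter(current, sigma)            # filter
        down = blurred[::2, ::2]                             # subsample
        up = zoom(down, np.array(current.shape) / np.array(down.shape), order=1)
        pyramid.append(current - up)                         # compute residual
        current = down
    pyramid.append(current)        # keep the final low-resolution image as the base
    return pyramid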
Reconstructing the original image
Algorithm:
repeat:
    upsample
    sum with residual
until original resolution reached
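Continuing the sketch above, reconstruction inverts the process (the zoom-based upsampling mirrors the one used during construction; laplacian_pyramid refers to the previous sketch):

import numpy as np
from scipy.ndimage import zoom

def reconstruct(pyramid):
    # Start from the coarsest level, then repeatedly upsample and add back the residual.
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        up = zoom(current, np.array(residual.shape) / np.array(current.shape), order=1)
        current = up + residual                               # upsample, sum with residual
    return current

# Because the same upsampling is used in both directions, the original image is recovered:
# image = np.random.rand(256, 256)
# assert np.allclose(reconstruct(laplacian_pyramid(image)), image)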
Gaussian vs Laplacian Pyramid
Shown in opposite order for space.
Which one takes more space to store?
Why is it called a Laplacian pyramid?
Each residual level is the image minus its blurred (Gaussian) copy, i.e., the image filtered with (unit impulse - Gaussian).
A difference of Gaussians approximates the Laplacian.
[Figure: unit impulse - Gaussian = Laplacian-like kernel.]
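One standard way to make the "difference of Gaussians approximates the Laplacian" statement precise (not spelled out on the slide) uses the heat-equation property of the Gaussian, \(\partial G_\sigma / \partial \sigma = \sigma \nabla^2 G_\sigma\):

G_{k\sigma}(x,y) - G_{\sigma}(x,y) \;\approx\; (k-1)\,\sigma^{2}\,\nabla^{2} G_{\sigma}(x,y)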
Other types of pyramids
Steerable pyramid: at each level, keep multiple versions, one for each direction.
Wavelets: a huge area in image processing.
What are image pyramids used for?
image blending
multi-scale texture mapping
focal stack compositing
denoising
multi-scale detection
multi-scale registration
image compression
Still used extensively
Coming up