CPSC 4070/6070:
Applied Computer Vision
Spring 2024, Lecture 3.2: Image Filtering
Siyu Huang
School of Computing
Image Credit: Pixar Animation Studios
*Acknowledgement: Many of the following slides are from Ioannis Gkioulekas and James Tompkin, who also adapted some of them from other great people.
This lecture
Image Credit: Pixar Animation Studios
Filtering (definition)
An operation that modifies a (measured) signal.
1D Filtering: Moving Average
Window size ‘k’
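A minimal sketch of the moving-average idea in NumPy (the function name and the toy signal are illustrative, not from the slides):

import numpy as np

def moving_average(signal, k):
    # Smooth a 1D signal: each output sample is the mean of a window of k samples.
    kernel = np.ones(k) / k              # box kernel: every sample in the window weighted equally
    return np.convolve(signal, kernel, mode='same')

# Toy example: a noisy step signal
x = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * np.random.randn(100)
smoothed = moving_average(x, k=5)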
2D Filtering
Compute a function of the local neighborhood at each position.
Window sizes ‘k’ and ‘l’
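A sketch of this idea written as a naive 2D correlation loop in NumPy (the function name, zero padding, and the random test image are assumptions for illustration):

import numpy as np

def filter2d(image, kernel):
    # Naive 2D filtering: at every pixel, take a weighted sum of the
    # k-by-l neighborhood (zero padding at the borders).
    k, l = kernel.shape
    pad_y, pad_x = k // 2, l // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode='constant')
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + l] * kernel)
    return out

image = np.random.rand(64, 64)                    # stand-in for a grayscale image
blurred = filter2d(image, np.ones((3, 3)) / 9)    # 3x3 averaging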
Smoothing with Box Filter
David Lowe
Box filter weights (3×3):
1 1 1
1 1 1
1 1 1
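A quick sketch of box-filter smoothing with SciPy (the random stand-in image and the 1/9 normalization, so the weights sum to 1, are assumptions not shown on the slide):

import numpy as np
from scipy.signal import correlate2d

image = np.random.rand(128, 128)        # stand-in for a grayscale image
box = np.ones((3, 3)) / 9.0             # box filter, normalized so the weights sum to 1
smoothed = correlate2d(image, box, mode='same', boundary='symm')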
Gaussian Filter
[Figure: 2D Gaussian kernel plotted as a surface over x and y, and viewed from the top.]
Gaussian filter
Parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.
[Figure: the same image smoothed with σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, and σ = 30 pixels.]
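A sketch of varying σ with SciPy's Gaussian filter (the stand-in image is an assumption):

import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(256, 256)                 # stand-in for a grayscale image
blur_small = gaussian_filter(image, sigma=1)     # light smoothing
blur_medium = gaussian_filter(image, sigma=5)
blur_large = gaussian_filter(image, sigma=30)    # heavy smoothing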
Box Filter vs. Gaussian Filter
Box filter weights (3×3):
1 1 1
1 1 1
1 1 1
[Figure: image smoothed with the box filter vs. with a Gaussian filter.]
2. Practice with linear filters
Filter:
0 0 0
0 0 1
0 0 0
Result: the original image shifted left by 1 pixel.
3. Practice with linear filters
Filter (Sobel):
-1 0 1
-2 0 2
-1 0 1
Result: vertical edges (absolute value shown).
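A sketch of applying the Sobel kernel above with SciPy (the stand-in image is an assumption):

import numpy as np
from scipy.signal import correlate2d

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

image = np.random.rand(128, 128)        # stand-in for a grayscale image
response = correlate2d(image, sobel_x, mode='same', boundary='symm')
edges = np.abs(response)                # absolute value, as on the slide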
4. Practice with linear filters
Original image, filtered with:
  0 0 0           1 1 1
  0 2 0  -  1/9 · 1 1 1
  0 0 0           1 1 1
Sharpening filter: amplifies differences with the local average.
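A sketch of this sharpening filter in SciPy (the 1/9 normalization of the box and the stand-in image are assumptions):

import numpy as np
from scipy.signal import correlate2d

impulse = np.zeros((3, 3))
impulse[1, 1] = 2.0                      # 2x the center pixel
box = np.ones((3, 3)) / 9.0              # local average
sharpen = impulse - box                  # amplifies differences with the local average

image = np.random.rand(128, 128)         # stand-in for a grayscale image
sharpened = correlate2d(image, sharpen, mode='same', boundary='symm')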
Correlation and Convolution
2D correlation
2D convolution
Convolution is the same as correlation with a 180° rotated filter kernel.
Correlation and convolution are identical when the filter kernel is rotationally symmetric*.
e.g., h = scipy.signal.correlate2d(I, f)
e.g., h = scipy.signal.convolve2d(I, f)
* Symmetric in the geometric sense, not in the matrix linear algebra sense.
Definition
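The slide's equations did not survive extraction; the standard definitions they refer to are, in LaTeX (h is the output, I the image, f the filter):

\begin{align}
  \text{2D correlation:} \quad & h[m,n] = \sum_{k,l} f[k,l]\; I[m+k,\, n+l] \\
  \text{2D convolution:} \quad & h[m,n] = \sum_{k,l} f[k,l]\; I[m-k,\, n-l]
\end{align}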
Convolution Properties
Commutative: a * b = b * a
Associative: a * (b * c) = (a * b) * c
Distributes over addition: a * (b + c) = (a * b) + (a * c)
Scalars factor out: ka * b = a * kb = k (a * b)
Identity: a * e = a when e = […, 0, 0, 1, 0, 0, …], the unit impulse.
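These properties can be checked numerically; a small NumPy sketch (the random test signals are arbitrary):

import numpy as np

a = np.random.rand(5)
b = np.random.rand(7)
c = np.random.rand(3)

# Commutative: a * b = b * a
assert np.allclose(np.convolve(a, b), np.convolve(b, a))

# Associative: a * (b * c) = (a * b) * c
assert np.allclose(np.convolve(a, np.convolve(b, c)),
                   np.convolve(np.convolve(a, b), c))

# Distributes over addition (b and d have the same length so b + d is defined)
d = np.random.rand(7)
assert np.allclose(np.convolve(a, b + d),
                   np.convolve(a, b) + np.convolve(a, d))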
Separability of 2D Convolution
Kristen Grauman
The filter factors into a product of 1D filters.
Perform convolution along the rows, followed by convolution along the remaining column.
[Figure: 2D convolution at the center location computed as a row convolution followed by a column convolution.]
Separability of 2D Convolution
M×N image, P×Q filter.
Direct 2D convolution costs M·N·P·Q multiply-adds; two 1D passes cost M·N·(P+Q).
Speed-up = PQ / (P+Q)
9×9 filter: 81 / 18 ≈ 4.5× faster
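A sketch comparing full 2D convolution with the separable row/column version, using a separable Gaussian kernel (the kernel construction and test image are illustrative choices):

import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(512, 512)                  # stand-in for a grayscale image

# A 9-tap 1D Gaussian; its outer product gives the full 9x9 kernel.
g1d = np.exp(-0.5 * (np.arange(-4, 5) / 2.0) ** 2)
g1d /= g1d.sum()
g2d = np.outer(g1d, g1d)

# Full 2D convolution: roughly M*N*P*Q multiply-adds.
full = convolve2d(image, g2d, mode='same', boundary='symm')

# Separable version: rows first, then columns, roughly M*N*(P+Q) multiply-adds.
rows = convolve2d(image, g1d[np.newaxis, :], mode='same', boundary='symm')
separable = convolve2d(rows, g1d[:, np.newaxis], mode='same', boundary='symm')

assert np.allclose(full, separable)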
This lecture
Pyramids
Tilt-shift photography
Tilt-shift camera
[Figure: the sensor shifts and tilts relative to the lens.]
Can we fake tilt shift? We need to blur the image.
Questions (Bonus Point)
1) _ = D * B
2) A = _ * _
3) _ = D * D
[Figure: grid of images labeled A, B, C, D, G, H, I.]
(* = convolution operator)
A = B * C
When the filter ‘looks like’ the image = ‘template matching’.
Filtering can be viewed as comparing an image of what you want to find against all image regions.
For symmetric filters: use either convolution or correlation.
For nonsymmetric filters: correlation is template matching. Why?
[Figure: images A, B, and C; C is a Gaussian kernel and we know that it ‘blurs’.]
Robert Collins
Let's see if we can use correlation to ‘find’ the parts of the image that look like the kernel.
D (275 x 175 pixels)

>> f = D[ 57:117, 107:167 ]      # crop a 60 x 60 patch f from D to use as the filter
>> I = correlate2d( D, f, 'same' )

Expect a response ‘peak’ in the middle of I, at the correct location (+).
[Figure: response image I; the response peak is not at the correct location (+).]
Hmm... that didn't work. Why not?
Correlation
e.g., h = scipy.signal.correlate2d(I, f)
As brightness in I increases, the response in h will increase, as long as f is positive.
Overall brighter regions will give a higher correlation response -> not useful!
OK, so let's subtract the mean

>> f = D[ 57:117, 107:167 ]
>> f2 = f - np.mean(f)
>> D2 = D - np.mean(D)

Now zero centered. The score is higher only when dark parts match and when light parts match.

>> I2 = correlate2d( D2, f2, 'same' )

[Figure: zero-centered template f2 (60 x 60), zero-centered image D2 (275 x 175 pixels), and the response I2.]
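Putting the slide's snippets together as one runnable sketch (the random stand-in for the photo D and the peak-finding line at the end are assumptions; the crop coordinates are the slide's):

import numpy as np
from scipy.signal import correlate2d

D = np.random.rand(275, 175)            # stand-in for the slide's 275 x 175 photo

f = D[57:117, 107:167]                  # crop a 60 x 60 template from the image

f2 = f - np.mean(f)                     # zero-center the template...
D2 = D - np.mean(D)                     # ...and the image

I2 = correlate2d(D2, f2, mode='same')   # response is high where dark and light parts both match

peak = np.unravel_index(np.argmax(I2), I2.shape)   # location of the best match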
What happens with convolution?

>> f = D[ 57:117, 107:167 ]
>> f2 = f - np.mean(f)
>> D2 = D - np.mean(D)
>> I2 = convolve2d( D2, f2, 'same' )

[Figure: f2 (60 x 60), D2 (275 x 175 pixels), and the response I2.]
Non-Linear Filter: Median Filter
Replace each pixel with the median of the pixel values in its neighborhood.

Denoising with a median filter
[Figure: image with salt-and-pepper noise vs. the median-filtered result, with plots of one column of each image.]
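A sketch of salt-and-pepper denoising with a median filter in SciPy (the stand-in image and noise level are assumptions):

import numpy as np
from scipy.ndimage import median_filter

image = np.random.rand(256, 256)        # stand-in for a grayscale image

# Salt-and-pepper noise: set a random 5% of the pixels to 0 or 1.
noisy = image.copy()
mask = np.random.rand(*image.shape) < 0.05
noisy[mask] = np.random.choice([0.0, 1.0], size=mask.sum())

# 3x3 median filter: each pixel becomes the median of its neighborhood,
# so isolated outlier pixels are discarded entirely (a non-linear operation).
denoised = median_filter(noisy, size=3)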
Gaussian image pyramid
Constructing a Gaussian pyramid
Algorithm:
repeat:
    filter
    subsample
until minimum resolution reached

[Figure: the image is repeatedly filtered and subsampled to form the pyramid levels.]
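A sketch of this algorithm in Python (the stopping size, the blur amount, and the stand-in image are assumptions):

import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, min_size=16, sigma=1.0):
    # repeat: filter, then subsample by 2, until the minimum resolution is reached
    levels = [image.astype(float)]
    while min(levels[-1].shape) > min_size:
        blurred = gaussian_filter(levels[-1], sigma)   # filter
        levels.append(blurred[::2, ::2])               # subsample
    return levels

pyramid = gaussian_pyramid(np.random.rand(256, 256))   # stand-in image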
Some properties of the Gaussian pyramid
What happens to the details of the image?
What is preserved at the higher levels?
How would you reconstruct the original image from the image at the upper level?
Blurring is lossy
level 0 - level 1 (before downsampling) = residual
What does the residual look like?
Can we make a pyramid that is lossless?
Laplacian image pyramid
At each level, retain the residuals instead of the blurred images themselves.
Can we reconstruct the original image using the pyramid?
Let’s start by looking at just one level
level 1 (upsampled) + residual = level 0
Does this mean we need to store both the residuals and the blurred copies of the original?
Constructing a Laplacian pyramid

Algorithm:
repeat:
    filter
    subsample
    compute residual
until minimum resolution reached

What is the filter-and-subsample part? It's a Gaussian pyramid.
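A sketch of this construction in Python, assuming the residual at each level is the current image minus an upsampled copy of the next (blurred, subsampled) level; the number of levels, sigma, and bilinear upsampling via scipy.ndimage.zoom are illustrative choices:

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(image, num_levels=4, sigma=1.0):
    pyramid = []
    current = image.astype(float)
    for _ in range(num_levels):
        blurred = gaussian_filter(current, sigma)            # filter
        down = blurred[::2, ::2]                             # subsample
        up = zoom(down, np.array(current.shape) / np.array(down.shape), order=1)
        pyramid.append(current - up)                         # compute residual
        current = down
    pyramid.append(current)        # keep the final low-resolution image as the base
    return pyramid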
Reconstructing the original image
Algorithm:
repeat:
    upsample
    sum with residual
until original resolution reached
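Continuing the sketch above, reconstruction inverts the process (the zoom-based upsampling mirrors the one used during construction; laplacian_pyramid refers to the previous sketch):

import numpy as np
from scipy.ndimage import zoom

def reconstruct(pyramid):
    # Start from the coarsest level, then repeatedly upsample and add back the residual.
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        up = zoom(current, np.array(residual.shape) / np.array(current.shape), order=1)
        current = up + residual                               # upsample, sum with residual
    return current

# Because the same upsampling is used in both directions, the original image is recovered:
# image = np.random.rand(256, 256)
# assert np.allclose(reconstruct(laplacian_pyramid(image)), image)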
Gaussian vs Laplacian Pyramid
Shown in opposite order for space.
Which one takes more space to store?
Why is it called a Laplacian pyramid?
Each residual level is the image minus its blurred (Gaussian) copy, i.e., the image filtered with (unit impulse - Gaussian).
A difference of Gaussians approximates the Laplacian.
[Figure: unit impulse - Gaussian = Laplacian-like kernel.]
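One standard way to make the "difference of Gaussians approximates the Laplacian" statement precise (not spelled out on the slide) uses the heat-equation property of the Gaussian, \(\partial G_\sigma / \partial \sigma = \sigma \nabla^2 G_\sigma\):

G_{k\sigma}(x,y) - G_{\sigma}(x,y) \;\approx\; (k-1)\,\sigma^{2}\,\nabla^{2} G_{\sigma}(x,y)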
Other types of pyramids
Steerable pyramid: at each level, keep multiple versions, one for each direction.
Wavelets: a huge area in image processing.
What are image pyramids used for?
image blending
multi-scale texture mapping
focal stack compositing
denoising
multi-scale detection
multi-scale registration
image compression
Still used extensively
Coming up