CSE 5524: A Simple Vision System (Cont.) & Image Formation
Course information, grading, reading, policy, etc.
https://sites.google.com/view/osu-cse-5524-sp25-chao/home
Today
Three representative computer vision sub-fields
S: scene
I: image
2: Reconstruction
1: Recognition
tree
3: Generation
tree
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
A simple world: the blocks world
What is inside?
Image formation assumptions
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
[Figure credit: https://www.geeksforgeeks.org/parallel-othographic-oblique-projection-in-computer-graphics]
Our goal: recover the world coordinates of all pixels
We want to know X(x, y), Y(x, y), and Z(x, y) from the given image!
What we know:
We need some cues from images and the 3D world!
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Our goal: recover the world coordinates of all pixels
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
(x, y)
Reconstructed 3D worlds from other views
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Depth estimation and 3D reconstruction
Questions?
Before we dive into details, let’s take a step back
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Can you infer the 3D information from 2D?
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Can you write down what you just said in math/code?
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Cue 1: edges
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Cue 2: Surfaces & Cue 3: properties from 3D to 2D
Our goal: recover the world coordinates of all pixels
We want to know X(x, y), Y(x, y), and Z(x, y) from the given image!
If we know Y(x, y), we know Z(x, y)
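One way to see why Y(x, y) determines Z(x, y) is to write the projection down and invert it. The sketch below assumes a parallel projection with the camera tilted by an angle theta about the X axis, x = X + x0 and y = cos(theta) Y - sin(theta) Z + y0. That equation, and the names `theta`, `x0`, and `y0`, are assumptions for illustration and are not stated in the slide text.

```python
import math

def project(X, Y, Z, theta, x0=0.0, y0=0.0):
    """Assumed parallel projection of a world point onto the image plane."""
    x = X + x0
    y = math.cos(theta) * Y - math.sin(theta) * Z + y0
    return x, y

def recover_X_Z(x, y, Y, theta, x0=0.0, y0=0.0):
    """Invert the assumed projection once the height Y at pixel (x, y) is known.

    Requires sin(theta) != 0, i.e. the camera is not looking exactly along Z.
    """
    X = x - x0
    Z = (math.cos(theta) * Y - (y - y0)) / math.sin(theta)
    return X, Z

# Round trip: project a world point, then recover X and Z from (x, y) and Y.
theta = math.pi / 4
x, y = project(2.0, 3.0, 5.0, theta)
X_rec, Z_rec = recover_X_Z(x, y, 3.0, theta)
```

Under this assumption X comes directly from x, so Y(x, y) is the only unknown that has to be estimated from image cues.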
Our goal: recover the world coordinates of all pixels
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Questions?
How to represent the “3D height map” Y?
Y as a 5 × 5 grid:
0 | 0 | 124 | 255 | 125 |
0 | 0 | 125 | 126 | 60 |
0 | 0 | 126 | 60 | 126 |
0 | 0 | 0 | 127 | 60 |
0 | 0 | 0 | 0 | 128 |
Y as a 25 × 1 vector (columns stacked top to bottom):
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 124, 125, 126, 0, 0, 255, 126, 60, 127, 0, 125, 60, 126, 60, 128]^T
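The 5 × 5 grid and the 25 values above appear to be the same height map written two ways: as a 2D array and as a single vector obtained by stacking its columns. A minimal sketch of that conversion follows; the function names are hypothetical.

```python
# The 5 x 5 values from the slide, stored row by row.
Y = [
    [0, 0, 124, 255, 125],
    [0, 0, 125, 126, 60],
    [0, 0, 126, 60, 126],
    [0, 0, 0, 127, 60],
    [0, 0, 0, 0, 128],
]

def vectorize_columns(grid):
    """Stack the columns of a 2D grid into one flat list (column-major order)."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [grid[r][c] for c in range(n_cols) for r in range(n_rows)]

def unvectorize_columns(vec, n_rows, n_cols):
    """Inverse of vectorize_columns: rebuild the grid from the flat list."""
    return [[vec[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]

y_vec = vectorize_columns(Y)
Y_back = unvectorize_columns(y_vec, 5, 5)
```

Treating Y as one long vector is what lets every cue be written as a row of a linear system in the unknown heights.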
How to estimate the “3D height map” Y?
Reconstruction
Cues encoded by linear equations
Estimating Y(x, y) from the input image
Y
Horizontal edges: Y won’t change along the edge, i.e. $\partial Y / \partial \mathbf{t} = 0$, where $\mathbf{t}$ is the direction along the edge
Estimating Y(x, y) from the input image – horizontal edges
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Surfaces: flat, not curved
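One common way to turn "flat, not curved" into equations is to require the second derivatives of Y to vanish inside a surface. The sketch below checks this with discrete second differences on a small grid; the stencils and function names are illustrative assumptions, not taken from the slides.

```python
def second_differences(Y):
    """Discrete second differences of a 2D height map (list of rows)."""
    rows, cols = len(Y), len(Y[0])
    # Second difference along x (columns), interior columns only.
    d_xx = [[Y[r][c - 1] - 2 * Y[r][c] + Y[r][c + 1]
             for c in range(1, cols - 1)] for r in range(rows)]
    # Second difference along y (rows), interior rows only.
    d_yy = [[Y[r - 1][c] - 2 * Y[r][c] + Y[r + 1][c]
             for c in range(cols)] for r in range(1, rows - 1)]
    # Mixed difference over each 2 x 2 block of pixels.
    d_xy = [[Y[r + 1][c + 1] - Y[r + 1][c] - Y[r][c + 1] + Y[r][c]
             for c in range(cols - 1)] for r in range(rows - 1)]
    return d_xx, d_yy, d_xy

def is_flat(Y, tol=1e-9):
    """True if every second difference is (numerically) zero."""
    return all(abs(v) <= tol
               for d in second_differences(Y) for row in d for v in row)

plane = [[2 * c + 3 * r + 1 for c in range(4)] for r in range(4)]  # planar
bowl = [[c * c + r * r for c in range(4)] for r in range(4)]       # curved
flat_plane, flat_bowl = is_flat(plane), is_flat(bowl)
```

Each zero second difference is a linear equation in a few neighboring values of Y, so it fits the same linear system as the edge cues.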
Questions?
Information propagation via “optimization”
= 0
Information propagation
Least-squares solution!
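To illustrate how a least-squares solve propagates information, here is a toy 1D sketch: five unknown heights along one image column, one ground-contact constraint, one known slope at a single pixel pair, and flatness constraints everywhere else. The constraint values, the helper names, and the normal-equations solver are all assumptions for illustration, not the lecture's implementation.

```python
def solve_least_squares(A, b):
    """Minimize ||A y - b||^2 via the normal equations (A^T A) y = A^T b.

    Assumes A has full column rank, so A^T A is invertible.
    """
    m, n = len(A), len(A[0])
    # Augmented n x (n + 1) matrix [A^T A | A^T b].
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         + [sum(A[k][i] * b[k] for k in range(m))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

n = 5
A, b = [], []

def add_constraint(coeffs, value):
    """Append one linear equation sum_i coeffs[i] * Y[i] = value."""
    row = [0.0] * n
    for idx, c in coeffs.items():
        row[idx] = c
    A.append(row)
    b.append(value)

add_constraint({0: 1.0}, 0.0)            # ground contact: Y[0] = 0
add_constraint({0: -1.0, 1: 1.0}, 2.0)   # known slope: Y[1] - Y[0] = 2
for i in range(1, n - 1):                # flatness: second difference = 0
    add_constraint({i - 1: 1.0, i: -2.0, i + 1: 1.0}, 0.0)

Y_hat = solve_least_squares(A, b)
```

The slope is specified at only one place, yet the flatness equations carry it to every pixel, so the estimate comes out close to [0, 2, 4, 6, 8].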
Results
Reconstructed 3D worlds from other views
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Caution
Reading & keywords
HW1
Questions?
What is 3D reconstruction nowadays?
How to let computers recognize objects?
A cat?
A lion?
A car?
Percept: see a picture
Action: tell the object class
Human design vs. machine-learning-based
[Figure: two routes to the label “cat”: “Design” (human-designed rules) vs. “Data collection” followed by “Learn”.]
“Coding” the rules:
Can you list the rules for recognizing a cat?
Underlying idea:
Humans are sometimes good at “making decisions” BUT not good at “explaining decisions”.
Today
Goal
“Visible” light interacting with surfaces
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Power (wavelength)
Bidirectional reflectance distribution function (BRDF)
Lambertian surfaces
Specular surfaces
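The two surface models can be sketched with their idealized textbook forms: a Lambertian surface looks equally bright from every viewing direction, with brightness proportional to the cosine of the angle between the surface normal and the light direction, while a perfect mirror sends light only into the mirror direction. The function names and the `albedo` and `light_power` parameters are hypothetical.

```python
import math

def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambertian_intensity(normal, to_light, albedo=1.0, light_power=1.0):
    """Brightness of an ideal Lambertian surface; independent of the viewer.

    `to_light` points from the surface toward the light source.
    """
    n, l = _normalize(normal), _normalize(to_light)
    return albedo * light_power * max(0.0, _dot(n, l))

def mirror_direction(normal, to_light):
    """Direction in which an ideal mirror reflects light: r = 2 (n . l) n - l."""
    n, l = _normalize(normal), _normalize(to_light)
    d = _dot(n, l)
    return tuple(2.0 * d * ni - li for ni, li in zip(n, l))

up = (0.0, 0.0, 1.0)
head_on = lambertian_intensity(up, (0.0, 0.0, 1.0))
at_60_deg = lambertian_intensity(
    up, (math.sin(math.radians(60)), 0.0, math.cos(math.radians(60))))
reflected = mirror_direction(up, (1.0, 0.0, 1.0))
```

Real materials usually fall between these two extremes, which is what a general BRDF describes.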
Why are these models important?
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Two light sources
From lights to world interpretation
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Questions?
Images & cameras
Projection surface
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Examples of pinhole cameras
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Does the distance between the projection surface and the pinhole matter?
The world is full of accidental cameras
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Image formation by perspective projection
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Image coordinates vs. virtual camera coordinates?
Perspective projection equations
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
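As a sketch of the standard pinhole model, perspective projection can be written as x = f X / Z and y = f Y / Z, assuming the pinhole sits at the origin, Z is depth along the optical axis, and f is the distance from the pinhole to the (virtual) projection surface. These symbols and the function name are assumptions, since the slide text does not spell out the equations.

```python
def perspective_project(X, Y, Z, f=1.0):
    """Project a 3D point in camera coordinates onto the virtual image plane."""
    if Z <= 0:
        raise ValueError("the point must be in front of the pinhole (Z > 0)")
    return f * X / Z, f * Y / Z

near = perspective_project(2.0, 4.0, 2.0, f=1.0)
far = perspective_project(2.0, 4.0, 4.0, f=1.0)     # same point, twice as far
longer = perspective_project(2.0, 4.0, 2.0, f=2.0)  # same point, f doubled
```

Doubling the depth halves the image coordinates, while doubling f doubles them, which is one way to see how the pinhole-to-surface distance changes the image size.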
Orthographic (parallel) projection equations
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
A good approximation for telephoto lenses
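A small numerical sketch of the telephoto case: when an object is far away compared with its own depth range, perspective projection is close to a scaled orthographic projection x = s X, y = s Y. The scale s = f / Z_ref, the reference depth `z_ref`, and the function names are assumptions for illustration.

```python
def perspective_project(X, Y, Z, f):
    """Pinhole projection with focal distance f (assumes Z > 0)."""
    return f * X / Z, f * Y / Z

def orthographic_project(X, Y, Z, scale=1.0):
    """Parallel projection: depth Z is ignored."""
    return scale * X, scale * Y

def max_gap(points, f, z_ref):
    """Largest coordinate difference between the two projections."""
    gap = 0.0
    for X, Y, Z in points:
        px, py = perspective_project(X, Y, Z, f)
        ox, oy = orthographic_project(X, Y, Z, scale=f / z_ref)
        gap = max(gap, abs(px - ox), abs(py - oy))
    return gap

# Same object (10 units deep), seen from far away and from close up.
far_object = [(1.0, 1.0, 1000.0), (1.0, 1.0, 1010.0)]
near_object = [(1.0, 1.0, 2.0), (1.0, 1.0, 12.0)]
far_gap = max_gap(far_object, f=1000.0, z_ref=1000.0)
near_gap = max_gap(near_object, f=2.0, z_ref=2.0)
```

The gap is small for the distant object and large for the nearby one, which suggests the orthographic model is an approximation rather than something a pinhole camera realizes exactly.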
Can we really have orthographic projection?
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Questions?
What’s wrong with pinhole cameras?
Projection surface
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Images are dim …
Limited light ...
From pinholes to lenses
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Light needs to be concentrated/bent!
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Lensmaker’s formula
Snell’s law
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
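Snell's law relates the angles of a ray on the two sides of an interface: n1 sin(theta1) = n2 sin(theta2), with angles measured from the surface normal. The sketch below computes the refracted angle; the function name and the choice to return `None` for total internal reflection are illustrative assumptions.

```python
import math

def refraction_angle(theta1, n1, n2):
    """Refracted angle (radians) from Snell's law, or None if no refracted ray.

    theta1 is the incidence angle in radians, measured from the normal;
    n1 and n2 are the refractive indices of the incoming and outgoing media.
    """
    s = n1 * math.sin(theta1) / n2
    if abs(s) > 1.0:
        return None  # total internal reflection
    return math.asin(s)

# Air (n = 1.0) into glass (n = 1.5, a typical value) at 30 degrees.
into_glass = refraction_angle(math.radians(30), 1.0, 1.5)
# Glass into air at 60 degrees: beyond the critical angle.
out_of_glass = refraction_angle(math.radians(60), 1.5, 1.0)
```

Entering the denser medium bends the ray toward the normal, which is the effect a lens surface uses to concentrate light.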
A lens
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
Simplified optical system
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
General cases
[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]