1 of 75

CSE 5524: A Simple Vision System (Cont.) & Image Formation

2 of 75

Course information, grading, reading, policy, etc.

  • Please check slide decks 1 & 2, course website, Carmen, and syllabus.

  • Course website:

https://sites.google.com/view/osu-cse-5524-sp25-chao/home

  • Office hours start this week!
    • Dr. Chao (DL587): Tuesday 3 – 4 pm & Friday 9 – 10 am
    • Zheda Mai (BE 406): Monday 11 am - 12 pm & Wednesday 2 pm - 3 pm

  • Linear algebra quizzes: released last week, due 9/16


3 of 75

Today

  • Recap and continuation: a simple vision system
  • Image formation


4 of 75

Three representative computer vision sub-fields


[Figure: S: scene, I: image. 1: Recognition (image → label, e.g., “tree”); 2: Reconstruction (image → scene); 3: Generation (description, e.g., “tree” → image).]

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

5 of 75

A simple world: the blocks world

What is inside?

  • Simple but varied set of objects
  • Flat horizontal or vertical surfaces
  • White horizontal ground plane

Image formation assumptions

  • Parallel (orthographic) projection


[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

[Figure credit: https://www.geeksforgeeks.org/parallel-othographic-oblique-projection-in-computer-graphics]

6 of 75

Our goal: recover the world coordinates of all pixels

We want to know X(x, y), Y(x, y), and Z(x, y) from the given image!

What we know:

We need some cues from images and the 3D world!


[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

7 of 75

Our goal: recover the world coordinates of all pixels

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]


8 of 75

Reconstructed 3D worlds from other views

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

9 of 75

Reconstructed 3D worlds from other views

Depth estimation and 3D reconstruction

10 of 75

Questions?

11 of 75

Before we dive into details, let’s take a step back

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

12 of 75

Can you infer the 3D information from 2D?

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

13 of 75

Can you write down what you just said in math/code?

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

14 of 75

Cue 1: edges

  • Edges: image regions with strong color/intensity changes w.r.t. location

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
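The edge cue above can be sketched numerically: mark pixels where the gradient magnitude of the intensity is large. This is a minimal NumPy sketch; the threshold value is a hypothetical choice, not one from the slides.

```python
import numpy as np

def edge_map(img, thresh=0.1):
    """Mark pixels whose intensity changes sharply w.r.t. location.

    img: 2D float array in [0, 1]; thresh is a hypothetical cutoff.
    """
    dy, dx = np.gradient(img)       # intensity change along rows/columns
    mag = np.sqrt(dx**2 + dy**2)    # gradient magnitude
    return mag > thresh             # True where an edge is present

# A tiny image: dark left half, bright right half -> a vertical edge.
img = np.zeros((4, 4))
img[:, 2:] = 1.0
print(edge_map(img).any(axis=0))   # columns containing edge pixels
```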

15 of 75

Cue 2: Surfaces & Cue 3: properties from 3D to 2D

  • Separate into foreground (figure) / background

  • Not always true, but let’s assume it is:
    • Vertical in 3D projects to vertical in 2D; thus, vertical in 2D means vertical in 3D
    • Non-vertical in 2D means horizontal in 3D

16 of 75

Our goal: recover the world coordinates of all pixels

We want to know X(x, y), Y(x, y), and Z(x, y) from the given image!


If we know Y(x, y), we know Z(x, y)

17 of 75

Our goal: recover the world coordinates of all pixels

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

18 of 75

Questions?

19 of 75

How to represent the “3D height map” Y?

20 of 75

How to represent the “3D height map” Y?

  • Vectorization: matrix to vector

The height map Y as a matrix of values:

  0   0 124 255 125
  0   0 125 126  60
  0   0 126  60 126
  0   0   0 127  60
  0   0   0   0 128

Stack the columns into a single vector:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 124, 125, 126, 0, 0, 255, 126, 60, 127, 0, 125, 60, 126, 60, 128]ᵀ
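A minimal NumPy sketch of vectorization, using a small illustrative 5×5 matrix (the values stand in for the slide's example):

```python
import numpy as np

# A small illustrative height map Y.
Y = np.array([[  0,   0, 124, 255, 125],
              [  0,   0, 125, 126,  60],
              [  0,   0, 126,  60, 126],
              [  0,   0,   0, 127,  60],
              [  0,   0,   0,   0, 128]])

# Vectorization: stack the columns of the matrix into one long vector,
# so that linear constraints on Y become rows of a linear system.
y = Y.flatten(order="F")   # "F" = column-major (Fortran) order
print(y.shape)             # (25,)
```

Column-major stacking is a convention; row-major (`order="C"`) works equally well as long as all constraints use the same ordering.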

21 of 75

How to estimate the “3D height map” Y?

Reconstruction

  • Known equations
  • Cues from edges, surfaces, and 2D/3D relationships

22 of 75

Cues encoded by linear equations

23 of 75

Estimating Y(x, y) from the input image

24 of 75

Estimating Y(x, y) from the input image

25 of 75

Estimating Y(x, y) from the input image

26 of 75

Estimating Y(x, y) from the input image

Horizontal edges: Y does not change along the edge: ∂Y/∂t = 0 along the edge direction t

27 of 75

Estimating Y(x, y) from the input image – horizontal edges

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

28 of 75

Estimating Y(x, y) from the input image

Horizontal edges: Y does not change along the edge: ∂Y/∂t = 0 along the edge direction t

29 of 75

Estimating Y(x, y) from the input image

Surfaces: flat, not curved: the second derivatives of Y vanish (∂²Y/∂x² = ∂²Y/∂y² = ∂²Y/∂x∂y = 0)

30 of 75

Questions?

31 of 75

Information propagation via “optimization”

  • All the information:
    • 3D vertical edge: Y varies linearly along the edge (∂Y/∂y is constant)
    • 3D horizontal edge: Y is constant along the edge (∂Y/∂t = 0 along the edge direction t)
    • Flat surfaces: the second derivatives of Y vanish (∂²Y/∂x² = ∂²Y/∂y² = ∂²Y/∂x∂y = 0)
    • Contact edges & background: Y = 0

32 of 75

Information propagation

33 of 75

Information propagation

34 of 75

Information propagation via “optimization”

  • All the information:
    • 3D vertical edge: Y varies linearly along the edge (∂Y/∂y is constant)
    • 3D horizontal edge: Y is constant along the edge (∂Y/∂t = 0 along the edge direction t)
    • Flat surfaces: the second derivatives of Y vanish (∂²Y/∂x² = ∂²Y/∂y² = ∂²Y/∂x∂y = 0)
    • Contact edges & background: Y = 0
  • These constraints can be rewritten as an overdetermined system of linear equations

35 of 75

Information propagation via “optimization”

Stacking all the linear constraints yields one overdetermined system; solve it for the least-squares solution!
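A minimal sketch of the idea with NumPy: stack the constraints as rows of an overdetermined system A y = b and take the least-squares solution. The three unknown heights and four constraints below are hypothetical illustrative values, not the slide's system.

```python
import numpy as np

# Toy overdetermined system on a vectorized height map y (3 unknowns):
A = np.array([[ 1.0,  0.0, 0.0],   # contact edge: y0 = 0
              [-1.0,  1.0, 0.0],   # vertical edge: y1 - y0 = 1
              [ 0.0, -1.0, 1.0],   # vertical edge: y2 - y1 = 1
              [ 0.0,  0.0, 1.0]])  # noisy extra measurement: y2 = 2.1
b = np.array([0.0, 1.0, 1.0, 2.1])

# Least-squares solution: minimizes ||A y - b||^2.
y, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(y)  # close to [0, 1, 2]
```

Because the noisy fourth constraint conflicts slightly with the others, no exact solution exists; least squares spreads the error across all constraints.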

36 of 75

Results

37 of 75

Reconstructed 3D worlds from other views

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

38 of 75

Caution

  • The approach we developed so far does not always work.
  • We have made lots of assumptions.

  • Still, it gives some sense of the scope and challenge of computer vision.

39 of 75

Reading & keywords

  • Chapter 2

  • Image intensity
  • Edge, shadow edges
  • Surface
  • Linear system, overdetermined system of linear equations
  • Least-squares solutions
  • 3D reconstruction
  • Parallel and perspective projection

40 of 75

HW1

  • To be released tonight or Thursday night!

41 of 75

Questions?

42 of 75

What is 3D reconstruction nowadays?

43 of 75

What is 3D reconstruction nowadays?

44 of 75

What is 3D reconstruction nowadays?

45 of 75

How to let computers recognize objects?

Percept: see a picture

Action: tell the object class: a cat? a lion? a car?

46 of 75

Human design vs. machine-learning-based

Two routes to a “cat” recognizer:

  • Human design: “coding” the rules. Can you list the rules for recognizing a cat?
  • Machine learning: data collection, then “learning” from labeled examples of cats.

Underlying idea: humans are often good at “making decisions” BUT not good at “explaining decisions”.

47 of 75

Today

  • Recap and continuation: a simple vision system
  • Image formation


48 of 75

Goal

  • How are images formed?
  • How can light illuminating the space be captured by a device to form a picture?

49 of 75

“Visible” light interacting with surfaces

  • Light:
    • Wave (with wavelength, frequency)
    • Light ray – specified by position, direction, and intensity, as a function of wavelength and polarization

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

[Figure labels: power as a function of wavelength; bidirectional reflectance distribution function]

50 of 75

Lambertian surfaces

  • Bidirectional reflectance distribution functions (BRDFs) can be complex

  • Assumption: the Lambertian model

  • Lambertian model: the outgoing ray intensity is a function of
    • Incoming light power
    • Wavelength
    • Surface orientation relative to the incoming ray direction
    • A scalar surface reflectance, a.k.a. albedo

  • No dependency on the outgoing direction of the ray
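A minimal sketch of the Lambertian model (the function, its values, and the folding of wavelength dependence into a single scalar albedo are illustrative choices):

```python
import numpy as np

def lambertian(albedo, light_power, n, l):
    """Outgoing intensity under the Lambertian model.

    Depends only on the albedo, incoming power, and the angle between
    the surface normal n and light direction l -- not on the outgoing
    (viewing) direction.
    """
    n = np.asarray(n, float)
    l = np.asarray(l, float)
    n, l = n / np.linalg.norm(n), l / np.linalg.norm(l)
    return albedo * light_power * max(0.0, float(n @ l))

# Light hitting a horizontal surface head-on vs. grazing along it:
print(lambertian(0.5, 1.0, n=[0, 1, 0], l=[0, 1, 0]))  # 0.5
print(lambertian(0.5, 1.0, n=[0, 1, 0], l=[1, 0, 0]))  # 0.0
```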

51 of 75

Specular surfaces

  • Phong reflection model: widely used, with specular components of reflection
    • Ambient: Constant
    • Diffuse: Lambertian model
    • Specular reflection
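The three Phong terms can be sketched as below; all coefficients, vectors, and the shininess exponent are hypothetical illustrative values, not ones from the slides.

```python
import numpy as np

def phong(ka, kd, ks, shininess, n, l, v, light=1.0, ambient=1.0):
    """Phong reflection: ambient + diffuse (Lambertian) + specular."""
    n, l, v = (np.asarray(u, float) / np.linalg.norm(u) for u in (n, l, v))
    diff = max(0.0, float(n @ l))       # Lambertian (diffuse) term
    r = 2.0 * diff * n - l              # mirror reflection of l about n
    spec = max(0.0, float(r @ v)) ** shininess
    return ka * ambient + kd * light * diff + ks * light * spec

# Viewing along the mirror direction gives the full specular highlight:
print(phong(0.1, 0.5, 0.4, 10, n=[0, 1, 0], l=[0, 1, 0], v=[0, 1, 0]))  # 1.0
```

Unlike the Lambertian model, the specular term does depend on the outgoing (viewing) direction v.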

52 of 75

Why are these models important?

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

Two light sources

53 of 75

From lights to world interpretation

  • To understand our world from the lights
    • We need to “associate” the reflected light with the surface in the world.
    • We need to know which light rays come from which direction in space.

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

54 of 75

Questions?

55 of 75

Images & cameras

  • Forming an image = identifying which rays come from which directions

  • Camera: organizing rays

  • Pinhole camera:
    • Each location on the wall receives light from only one direction

Projection surface

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

56 of 75

Examples of pinhole cameras

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

Does the distance between the projection surface and the pinhole matter?

57 of 75

The world is full of accidental cameras

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

58 of 75

Image formation by perspective projection

  • A (pinhole) camera projects 3D coordinates in the world to 2D positions on the projection plane, along the straight-line path each light ray takes through the pinhole

59 of 75

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

Image coordinates vs. virtual camera coordinates?

60 of 75

Perspective projection equations

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
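The equations themselves did not survive extraction; the standard pinhole relations, for a world point (X, Y, Z) in camera coordinates and focal length f (the pinhole-to-plane distance), are:

```latex
x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}
```

The division by the depth Z is what makes distant objects project smaller.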

61 of 75

Orthographic (parallel) projection equations

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

A good approximation for telephoto lenses
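For reference, the standard orthographic relations simply drop the depth, up to a constant scale s:

```latex
x = sX, \qquad y = sY
```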

62 of 75

Can we really have orthographic projection?

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

63 of 75

Questions?

64 of 75

What’s wrong with pinhole cameras?

Projection surface

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

Images are dim …

Limited light ...

65 of 75

From pinholes to lenses

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

Light needs to be concentrated/bent!

66 of 75

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

67 of 75

Lensmaker’s formula

  • When light passes from one material to another, its wavelength and speed change

  • The change at the surface causes light to bend, i.e., refraction
    • The bending depends on the change in speed and on the surface orientation

68 of 75

Snell’s law

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]
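The law itself, with n₁ and n₂ the refractive indices of the two materials and θ₁, θ₂ the ray angles measured from the surface normal:

```latex
n_1 \sin\theta_1 = n_2 \sin\theta_2
```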

69 of 75

A lens

  • A specifically “shaped” piece of transparent material, positioned to focus light from a surface point onto a sensor

  • Ideally …

  • Need: numerical optimization!

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

70 of 75

Simplified optical system


[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

71 of 75

  • Assumptions:
    • Paraxial: the angle is small
    • Thin lens: negligible thickness

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

72 of 75

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

  • Assumptions:
    • Paraxial: the angle is small
    • Thin lens: negligible thickness

73 of 75

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

  • Assumptions:
    • Paraxial: the angle is small
    • Thin lens: negligible thickness

  • Lensmaker’s formula:
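The formula did not survive extraction; the standard thin-lens version for a lens in air, with refractive index n and surface radii of curvature R₁ and R₂, is:

```latex
\frac{1}{f} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)
```

Under the same paraxial, thin-lens assumptions, the focal length f then relates object and image distances through the thin-lens imaging equation.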

74 of 75

General cases

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]

75 of 75

[Figure credit: A. Torralba, P. Isola, and W. T. Freeman, Foundations of Computer Vision.]