Depth Perception
Monocular cues
Linear perspective
Convergence of lines
Relative size
Texture gradient
Interposition
Shading and shadows
Defocus
Aerial perspective
Accommodation
Motion-based cues
Motion parallax
Optic flow
Binocular cues
Convergence
Stereopsis
Learning-based strategies
Shading and shadows
‘Dimples and Pimples’
Position of cast shadows indicates object position in depth
Ball in box
Defocus
Jumping spiders
Aerial perspective
Accommodation
Chameleons use accommodation cues to judge distance, Nature, 1977
Binocular cues
Convergence
Stereopsis
Eyes with overlapping fields enable stereoscopic (solid) vision.
The brain measures the lateral displacement of features in the two eyes (binocular disparity) and experiences it as stereoscopic depth.
Stereopsis
"Eyes in the front, the animal hunts.
Eyes on the side, the animal hides."
Leonardo da Vinci realized that the eyes normally receive different views of a 3-D scene. Hence, he thought it impossible, even in principle, to convey a full sense of 3-D on a 2-D canvas. He puzzled over how we can see a single world of solid objects given the different eye views (now known as Leonardo’s paradox).
In 1838, English physicist Charles Wheatstone made line drawings of each eye’s view of simple objects. Then, employing a device he invented, called a mirror stereoscope, he presented these line drawings together to the viewer: left view to left eye alone; right view to right eye alone. He saw the skeletal outline of the object spring into 3-D relief! This suggested that the image differences were the basis of 3-D depth perception.
Stereopsis
Challenges:
1. Trigonometric calculations
2. Correspondence problem
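The trigonometric step can be sketched as follows, assuming the correspondence problem is already solved. This is a minimal sketch, not the visual system's actual computation: it assumes two pinhole eyes with parallel optical axes, and the function names, focal length and baseline are illustrative assumptions that do not come from the slides.

```python
def project(point_x, point_z, eye_x, focal_length):
    """Image x-coordinate of a point for a pinhole eye at (eye_x, 0) looking along +z."""
    return focal_length * (point_x - eye_x) / point_z


def depth_from_disparity(focal_length, baseline, disparity):
    """Depth from binocular disparity via similar triangles: Z = f * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the eyes")
    return focal_length * baseline / disparity


# Illustrative values only (assumptions, not taken from the slides).
FOCAL_LENGTH = 0.017  # metres
BASELINE = 0.065      # metres between the two eyes

# A point 2 m away, 10 cm to the right of the midline, seen by each eye.
x_left = project(0.10, 2.0, -BASELINE / 2, FOCAL_LENGTH)
x_right = project(0.10, 2.0, +BASELINE / 2, FOCAL_LENGTH)

# Disparity is the lateral displacement between the two image positions.
recovered_z = depth_from_disparity(FOCAL_LENGTH, BASELINE, x_left - x_right)
```

Under these assumptions recovered_z comes back as the original 2 m (up to rounding). The hard part is knowing which left-eye feature goes with which right-eye feature.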
How is the correspondence problem solved?
The Stereo Correspondence Problem
“During binocular regard of an objective image, each uniocular
mechanism develops independently a sensual image of considerable
completeness. The singleness of binocular perception results from
the union of these elaborated uniocular sensations. The singleness is
therefore the product of a synthesis that works with already elaborated
sensations contemporaneously proceeding.”
- Sherrington, 1906
The Stereo Correspondence Problem
Figure labels: Hand, Hand
Is monocular shape analysis a necessary prerequisite for stereo correspondence?
Bela Julesz
Computational theories for solving the correspondence problem:
Given the underconstrained matching problem (100! possible pairings in an RDS, or random-dot stereogram, with 100 dots), what assumptions can we bring to bear?
Assumption 1: Epipolar constraint
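A rough count shows how much the epipolar constraint helps. The sketch below assumes horizontally displaced eyes, so that a dot can only match dots on its own image row. The layout of 10 rows with 10 dots each is a hypothetical example, and the function names are made up for illustration.

```python
import math


def pairings_unconstrained(n_dots):
    """One-to-one pairings between n left-eye dots and n right-eye dots."""
    return math.factorial(n_dots)


def pairings_epipolar(dots_per_row):
    """Pairings that remain when a dot may only match dots on the same row."""
    return math.prod(math.factorial(k) for k in dots_per_row)


unconstrained = pairings_unconstrained(100)
constrained = pairings_epipolar([10] * 10)  # hypothetical: 10 rows of 10 dots

# Compare the sizes by counting decimal digits.
digits_unconstrained = len(str(unconstrained))
digits_constrained = len(str(constrained))
```

The constrained count is still very large, so further assumptions are needed on top of the epipolar constraint.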
Marr-Poggio’s network-based formulation of the problem:
Assumptions:
/ match uniqueness
Sample result of Marr-Poggio’s network:
What happens when no correspondence is possible?
Highly mismatched stereo-pairs lead to ‘binocular rivalry’
Open questions:
What is the site of binocular rivalry?
Can rivalry and fusion coexist? What does this imply regarding the site of rivalry?
Kovacs et al.
Some interesting stereo phenomena:
Pulfrich effect, described, ironically, by the famous one-eyed scientist Carl Pulfrich in 1922 (experimenting on others, of course).
Chromostereopsis
Development of stereo
Normal
Monocularly deprived
Late development of stereo
Susan Barry: Learning To See In 3-D NOVA
Figure labels: Acuity, Contrast sensitivity, Binocular stereo (Monkeys, Humans)
It has been suggested that Dutch Old Master Rembrandt may have been stereoblind, which would have aided him in flattening what he saw for the production of 2D works.
(NYT: A Defect That May Lead to a Masterpiece; June 13, 2011)
Stereo-blindness
Stereoblindness appears to be more common among artists than in a comparison sample of people with normal stereo vision.
…the researchers obtained portraits of 121 famous artists and 127 members of Congress from the National Gallery of Art and the photographic archives of the Smithsonian American Art Museum…The eyes of the established artists were more often misaligned.
A woman known as "Elizabeth" was studied and written about by Charles F. Stromeyer in 1970. She was an artist and teacher at Harvard who could mentally project detailed and exact images onto her canvas and was even able to move her eyes about to inspect the image while the image stayed still. She could also reproduce poems in a foreign language years after having seen the original printed page.
In Stromeyer's tests on her abilities, "Elizabeth" was presented with a 10,000-dot stereogram pattern to one eye for a specified length of time and then was asked to superimpose her eidetic image onto another pattern presented to her other eye. She was able to perform this task with great ease and could see depth and figures in these patterns. Non-eidetikers need a stereoscope to perform this feat.
"Elizabeth" was also capable of projecting her eidetic images onto other images, often obscuring the actual image. Her eidetic images produced after-images and movement after-effects just like those of actual visual stimuli, and she is even reported to have been able to see a 10-second section of a movie in complete eidetic detail.
Her only constraint was that she had to move her eyes to scan an eidetic image and generally would create the image in sections rather than as a whole. Also, "Elizabeth"'s images did not just fade, but instead would dim and break apart piece by piece. In any case, "Elizabeth" is the only one of her kind. Since the publication of Stromeyer's paper, no other adult eidetiker of her caliber has been found.
Day 1
Day 2
TANGENT
Fun with stereoscopes…
“Although a perfect stranger to you, and living on the reverse side of the globe,
I have taken the liberty of writing to you on a small discovery I have made in
binocular vision in the stereoscope. I find by taking two ordinary photos of two
different persons’ faces, the portraits being about the same sizes, and looking
about the same direction, and placing them in a stereoscope, the faces blend into
one in a most remarkable manner, producing in the case of some ladies’
portraits, in every instance, a decided improvement in beauty.”
- From a letter to Charles Darwin by A. L. Austin of New Zealand
TANGENT
Composite of 14 criminals’ faces
Composite of 15 women’s faces
But, see Perrett, 1994
Motion-based cues
Motion parallax
Kinetic Depth Effect
Parallax microscopy
Optic flow
Expanding optic flow
Processing Framework Proposed by Marr
Recognition
Shape from stereo
Motion flow
Shape from motion
Color estimation
Shape from contour
Shape from shading
Shape from texture
3D structure; motion characteristics; surface properties
Edge extraction
Image
Motion Perception:
How can we tell whether this is
really a motion selective cell (rather
than just an orientation selective one)?
How can we design a simple motion detector?
Motion as space-time orientation:
Simple motion detectors
Desired receptive field (RF) structure to detect oriented patterns in space-time
How can such RFs be constructed?
Constructing motion detectors:
Delay and compare networks
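One common delay-and-compare scheme is a correlation-type (Reichardt-style) detector. Whether the slide shows this exact circuit is an assumption. In the sketch below, two receptors sit two positions apart. The signal from one is delayed and multiplied with the current signal from the other, and the mirror-image product is subtracted to give a signed output. All names and parameter values are illustrative.

```python
def receptor_trace(spot_positions, receptor_position):
    """1.0 at each time step where the moving spot sits on the receptor, else 0.0."""
    return [1.0 if p == receptor_position else 0.0 for p in spot_positions]


def delay_and_compare(left_trace, right_trace, delay):
    """Opponent delay-and-compare output, summed over time.

    Positive output: motion from the left receptor toward the right receptor.
    Negative output: motion in the opposite direction.
    """
    total = 0.0
    for t in range(delay, len(left_trace)):
        rightward = left_trace[t - delay] * right_trace[t]  # delayed left x current right
        leftward = right_trace[t - delay] * left_trace[t]   # delayed right x current left
        total += rightward - leftward
    return total


LEFT_RECEPTOR, RIGHT_RECEPTOR, DELAY = 4, 6, 2

# A bright spot stepping one position per time step.
moving_right = list(range(10))
moving_left = moving_right[::-1]

response_right = delay_and_compare(
    receptor_trace(moving_right, LEFT_RECEPTOR),
    receptor_trace(moving_right, RIGHT_RECEPTOR),
    DELAY,
)
response_left = delay_and_compare(
    receptor_trace(moving_left, LEFT_RECEPTOR),
    receptor_trace(moving_left, RIGHT_RECEPTOR),
    DELAY,
)
```

The sign of the output gives the direction. The detector is tuned to the speed at which the travel time between the receptors equals the delay.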
Other ways of constructing movement detectors:
Are there really space-time oriented RFs in the brain?
Is this all there is to determining whether a pattern is in motion?
Accounting for eye-motion
Q. When do we see an object move?
A. When its image moves on the retina.
Is this really true?
Accounting for eye-motion (contd.)
The corollary discharge model (Teuber, 1960)
Predictions:
1. Pushing on the eyeball would cause the world to --------
2. A stabilized after-image would appear to ------- when the eye is moved voluntarily
3. If your eye were paralyzed with curare and you then attempted to move it, you would see the world --------
Len Matin, Science, 1982
Interim summary:
We roughly understand how to construct simple motion detectors.
Are such detectors sufficient for estimating the motion of complex patterns
in the environment?
From local motion estimates to global ones:
Local motion estimates are ambiguous due to the ‘Aperture Problem’
So, how can we derive the global motion field?
From local motion estimates to global ones (contd):
Theoretically, the ‘aperture problem’ can be overcome by pooling
information across multiple contours or by --------------.
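Pooling across contours can be sketched as an intersection of constraints. Each straight contour reveals only the velocity component along its normal. Two contours with different orientations give two linear constraints that pin down the full 2-D velocity. This is a minimal sketch under the assumption that both contours belong to the same rigidly translating object. The function names and numbers are illustrative.

```python
def normal_speed(velocity, normal):
    """Velocity component along a contour's unit normal (all a local detector can measure)."""
    return velocity[0] * normal[0] + velocity[1] * normal[1]


def intersect_constraints(normal_a, speed_a, normal_b, speed_b):
    """Solve v . normal_a = speed_a and v . normal_b = speed_b for v (Cramer's rule)."""
    det = normal_a[0] * normal_b[1] - normal_a[1] * normal_b[0]
    if abs(det) < 1e-12:
        raise ValueError("parallel contours leave the velocity ambiguous")
    vx = (speed_a * normal_b[1] - speed_b * normal_a[1]) / det
    vy = (normal_a[0] * speed_b - normal_b[0] * speed_a) / det
    return (vx, vy)


true_velocity = (3.0, 1.0)
normal_a = (0.6, 0.8)    # unit normal of the first contour
normal_b = (0.8, -0.6)   # unit normal of the second contour

# Each local measurement on its own is consistent with a whole line of velocities.
speed_a = normal_speed(true_velocity, normal_a)
speed_b = normal_speed(true_velocity, normal_b)

recovered_velocity = intersect_constraints(normal_a, speed_a, normal_b, speed_b)
```

With a single contour, or two parallel ones, the system has no unique solution, which is the aperture problem restated.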
What happens if we remove ---------?
Subjective plaids video
Sinha, 1996
From local motion estimates to global ones - physiology:
Component-motion-selective cells
Pattern-motion-selective cells
Motion fields for more complex patterns:
Hildreth (1985): Smoothness of velocity field along the contour
True motion field
Local motion estimates
Smoothest velocity field
Is there any perceptual evidence for the validity of this idea?
Motion fields for more complex patterns (contd.):
Figure panels (two examples): True, Local, Smoothest
Color perception
Intuition suggests…
On a computer screen, different RGB values produce different colors.
In the other direction, to determine what color an object is, we just need to measure the relative RGB values.
The challenge of explaining color perception...
On the one hand, identical RGB values can yield different color percepts.
On the other hand, different RGB values can yield identical color percepts.
Why does this happen?
In most circumstances, we are interested in determining surface color (‘reflectance’)
Surface reflectance cannot be inferred directly from image luminance, because the effects of illumination need to be taken into account.
Why isn’t RGB information perfectly correlated with surface reflectance?
Despite the confounding effects of illumination, we typically have good lightness constancy!
Illumination (I) * Reflectance (R) = Luminance (L)
Lightness Constancy:
The constancy in perceived surface reflectance regardless of differences in illumination.
e.g. The text in a newspaper looks black and the background looks white whether we are indoors or outdoors.
The computational challenge:
Luminance (L) = Reflectance (R) * Illumination (I)
Goal: Given L, recover R.
Clearly underconstrained. Assumptions are needed for unique solutions.
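A small numerical illustration of why recovering R from L alone is underconstrained. The reflectance and illumination values are made-up numbers in arbitrary units, and the function names are illustrative.

```python
def luminance(reflectance, illumination):
    """L = R * I for a single surface patch."""
    return reflectance * illumination


def reflectance_given(luminance_value, assumed_illumination):
    """The reflectance implied by a luminance once an illumination is assumed."""
    return luminance_value / assumed_illumination


# Two different scenes that send exactly the same luminance to the eye.
dark_grey_in_bright_light = luminance(0.25, 40.0)
light_grey_in_dim_light = luminance(0.5, 20.0)

# One measured luminance, three assumed illuminations, three different answers.
candidates = [reflectance_given(10.0, i) for i in (10.0, 20.0, 40.0)]
```

Every assumed illumination yields a different, equally consistent reflectance, so extra assumptions are needed to pick one.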
Lightness Constancy:
Helmholtz’s theory: Observer can cognitively reason about the illumination and shape distribution in the scene based on past experience.
Hering, Wallach:
Lightness matching experiments
Spots of light
A
B
How are subjects able to accurately match the reflectances of A and B?
Reference pile
Match pile
Hering, Wallach:
Observer simply computes luminance ratios across edges and does not need to perform any experience-driven high-level analyses about shape or illumination.
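The appeal of the ratio idea can be checked with a few lines of arithmetic. This sketch assumes both sides of the edge receive the same illumination. The reflectances for paper and ink and the two illumination levels are illustrative numbers, not measurements.

```python
def edge_ratio(reflectance_a, reflectance_b, illumination):
    """Luminance ratio across an edge whose two sides share one illumination."""
    return (reflectance_a * illumination) / (reflectance_b * illumination)


PAPER, INK = 0.8, 0.05  # assumed reflectances

indoors = edge_ratio(PAPER, INK, 100.0)
outdoors = edge_ratio(PAPER, INK, 10000.0)
```

The illumination term cancels, so the ratio equals the reflectance ratio (16 here) both indoors and outdoors, without any estimate of the illumination itself.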
Are luminance ratios across edges perceptually important?
Craik-O’Brien-Cornsweet Illusion
The perceptual importance of luminance ratios at edges:
Can luminance ratios be used to explain any other illusions?
Explaining simultaneous contrast illusions via low-level accounts:
A
B
C
D
The A/C luminance ratio is much lower than B/D.
Hence A looks darker than B.
But, there are alternative high-level explanations…
Explaining simultaneous contrast illusions via high-level analysis:
The gradient in the background is likely due to a gradual shadow (we have seen fuzzy shadows in the past). If a patch in shadow (the one on the right) can have the same luminance as the one in light, then it must intrinsically have a higher reflectance. Hence the right patch looks brighter than the left patch.
Purves D, Lotto B (2011) Why We See What We Do Redux: A Wholly Empirical Theory of Vision. Sunderland, MA: Sinauer Associates.
…the visual system can only solve this problem on the basis of past experience. In so far as the stimulus is consistent with the past experience of the visual system with differently reflective objects in different levels of illumination, the targets will tend to appear differently light or bright. Because the standard simultaneous brightness contrast stimulus is consistent with either of these possible sources, the pattern of neural activity elicited - that is, the percept experienced when looking at the simultaneous contrast display is a manifestation of both possibilities (and indeed all of the many other possibilities not illustrated) in proportion to their relative frequency of occurrence in past experience with stimuli of this general sort.
Distinguishing between high-level and low-level mechanisms in lightness perception has been a major challenge.
Here are some attempts…
Are ratios taken with actual or perceived luminances?
Inferences:
2. High-level factors seem to be unable to overwhelm low-level factors.
How can we directly address the role of experience in the genesis of simultaneous brightness contrast?
Perception. 2009;38(1):30-43.
Simultaneous color contrast in 4-month-old infants.
The present paper addresses the question of simultaneous color contrast in 4-month-old human infants. A temporal modulation paradigm was employed for infant testing. In this paradigm, infants viewed two test disks presented side-by-side: one of unchanging chromaticity (static) and another of the chromaticity varied in time (temporally modulated). The test stimuli were embedded in a surround that was either static or temporally modulated in phase with the modulated test stimulus. The temporally modulated test stimuli were chosen in such a way as to appear static to adults when viewed in the temporally modulated surround. On the basis of the observation that infants prefer to look more at flickering stimuli, the prediction is that, if infants have adult-like simultaneous color contrast, their preference for the temporally modulated stimulus should decrease and their preference for the static stimulus should increase when the surround is also temporally modulated as described. In concordance with this prediction, a significant increase in preference for the temporally static stimuli was observed with the introduction of temporal modulation in the surround. The data are consistent with the conclusion that infants as young as 4 months of age have simultaneous color contrast.
But, some learning could have occurred over 4 months
We conducted tests with 9 children within 2 days of their first eye surgery
Test displays 126 to 132, each with two patches labeled A and B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B B A A B
A B B A A A B
A B B B A A B
Controls
Newly sighted
The newly treated children were susceptible to these illusions immediately after the onset of sight. These results argue for explanations of four classic illusions that do not depend upon experience with the visual world and three-dimensional layouts in the scene (Helmholtz, 1910), but rather relate to more basic visual mechanisms and low-level aspects of the displays.
The classic simultaneous brightness contrast illusion is likely driven by low-level innately available aspects of the visual circuitry and does not require visual experience.
Inference
Inferences:
brightness perception.
2. High-level factors seem to be unable to overwhelm
low-level factors at all strengths tested.
Open questions:
1. Do the responses of neurons at different stages of the visual pathway co-vary with the physical or perceived brightness?
2. What is the extent of the context that participates in brightness perception?
3. Are there differences in response latencies for brightness phenomena that are due to low-level factors versus those that are believed to be due to high-level inferences?
Frame 1
Frame 2
Frame 1
Frame 2
Inspired by the significance of local ratios, Land and McCann proposed a theory of lightness…
Land and McCann’s Retinex theory of lightness perception:
I * R = L
Given L, recover R
Can humans do this?
What assumptions can make this tractable?
Land and McCann’s Retinex theory - Assumptions:
… luminance variations are due to changes in reflectance.
Reflectance always changes abruptly.
… across a scene.
Basic idea: Preserve luminance ratios at edges and discard slow variations.
(aka a Mondrian world)
A Mondrian
L → Differentiate → Threshold → Integrate → R
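The differentiate, threshold, integrate pipeline can be sketched in one dimension. This is a simplified illustration, not Land and McCann's published algorithm. It assumes a single row of pixels and works on log luminance so that ratios become differences. The threshold value and the test scene are arbitrary choices, and the function name is made up.

```python
import math


def retinex_1d(luminance, threshold=0.1):
    """Recover relative reflectance from one row of luminance values."""
    log_l = [math.log(v) for v in luminance]
    # Differentiate: log-luminance steps between neighbouring pixels.
    steps = [b - a for a, b in zip(log_l, log_l[1:])]
    # Threshold: treat small steps as slow illumination change and discard them.
    edges = [s if abs(s) > threshold else 0.0 for s in steps]
    # Integrate: re-accumulate the surviving edge steps.
    log_r = [0.0]
    for s in edges:
        log_r.append(log_r[-1] + s)
    return [math.exp(v) for v in log_r]


# Two flat patches (reflectance 0.2 then 0.8) under a slow illumination gradient.
reflectance = [0.2] * 5 + [0.8] * 5
illumination = [100.0 * 1.02 ** i for i in range(10)]
luminance = [r * i for r, i in zip(reflectance, illumination)]

recovered = retinex_1d(luminance)
```

The output is flat within each patch, and the step between the patches is close to the true reflectance ratio of 4. It is not exact, because the one illumination step that coincides with the edge is kept along with it. The result is only relative: it is defined up to an overall scale factor.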
The lightness scaling problem:
Q. How should we assign an absolute lightness to a surface?
A. Anchoring: the brightest region in the field of view is declared to be ‘white’.
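The anchoring rule amounts to a single rescaling step. In this minimal sketch, the function name and the convention that ‘white’ equals 1.0 are assumptions made for illustration.

```python
def anchor_to_white(relative_lightness, white=1.0):
    """Rescale relative lightness values so the brightest region maps to 'white'."""
    brightest = max(relative_lightness)
    return [white * v / brightest for v in relative_lightness]


# Relative values as they might come out of a ratio-based stage (arbitrary scale).
anchored = anchor_to_white([1.0, 4.0, 2.0])
```

The ratios between regions are untouched and only the overall scale is fixed. One consequence is that the brightest region in view is labeled white even when it is actually a dark surface.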
An excellent visual illusion!
http://www.psy.ritsumei.ac.jp/~akitaoka/illgelbe.html
A major open challenge for lightness perception models:
Moving beyond a flat world; distinguishing between abrupt orientation
changes and reflectance edges.
Retinex within faces
Edge labeling
Lightnesses