1 of 49

Visual Coherence

Virtual and Augmented reality

Zuzana Berger Haladová

2 of 49

VR rendering

Computer graphics completely models the scene

Geometry accessible for calculations

Occlusions

Lighting

Shadows

Camera parameters

3 of 49

AR rendering

Less information available

Final image through compositing

Digital compositing in video see-through

Physical compositing in optical see-through and projection

Approximate the real world to simulate interaction:

  • geometry, appearance
  • post-processing

4 of 49

Depth cues

5 of 49

Depth cues

Relative size

Relative height

Perspective

Surface detail

Atmospheric attenuation

Occlusion

Shading

Shadows

6 of 49

Occlusion

  • Virtual in front of real: Draw augmentation on top of video background
  • Virtual behind real: Need strategy to distinguish visible from occluded augmentations

7 of 49

Phantom rendering

Render registered virtual representations (Phantoms) of real objects

Occlusions handled by graphics hardware

  • Draw the video background
  • Disable writing to the color buffer
  • (glColorMask, or glBlendFunc(GL_ZERO, GL_ONE))
  • Render phantoms of the real scene ➔ sets the depth buffer
  • Enable writing to the color buffer
  • Draw the virtual objects (the depth test rejects fragments occluded by phantoms; see the sketch below)

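A minimal NumPy sketch (not the GL pipeline itself) of what the depth test achieves: the phantom pass fills only the depth buffer, so virtual pixels that lie behind real geometry are discarded during compositing. All images and depth values below are illustrative placeholders.

import numpy as np

H, W = 480, 640
video = np.zeros((H, W, 3), np.uint8)           # camera image used as background
phantom_depth = np.full((H, W), np.inf)         # depth written by the phantom pass
virtual_rgb = np.zeros((H, W, 3), np.uint8)     # rendered virtual objects
virtual_depth = np.full((H, W), np.inf)

# illustrative content: a real object covers the left half, a virtual object overlaps it
phantom_depth[:, :W // 2] = 2.0
virtual_rgb[200:300, 250:400] = (0, 255, 0)
virtual_depth[200:300, 250:400] = 3.0           # the virtual object is farther away

# depth test: a virtual pixel survives only where it is closer than the phantom
visible = virtual_depth < phantom_depth
composite = video.copy()
composite[visible] = virtual_rgb[visible]       # the occluded part of the virtual object is dropped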
8 of 49

Phantom rendering

Requires accurate:

  • Model
  • Tracking data
  • Registration

9 of 49

Edge occlusion

A possible approach for occlusion refinement can be performed purely on the GPU. First, edges are detected in the video image and matched with edges of the virtual models. The corrected edges are then superimposed with alpha blending on top of the polygon from which they were derived.

10 of 49

Edge occlusion

Search near the projected edge of a phantom object for the true edge of the corresponding real object to obtain accurate occlusion boundaries (see the sketch below)

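A hedged sketch of such a refinement step: sample points on the projected phantom edge are moved to the strongest image gradient found along the edge normal. The synthetic frame, the sample point and the normal are assumptions for illustration.

import numpy as np
import cv2

# synthetic video frame with a vertical intensity edge at x = 300
gray = np.zeros((480, 640), np.float32)
gray[:, 300:] = 255.0
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
grad = np.hypot(gx, gy)                         # image gradient magnitude

def refine_edge_point(p, n, search=8):
    # move a projected phantom edge point p to the strongest gradient within +-search px along the normal n
    best, best_q = -1.0, np.round(p).astype(int)
    for s in range(-search, search + 1):
        x, y = np.round(p + s * n).astype(int)
        if 0 <= y < grad.shape[0] and 0 <= x < grad.shape[1] and grad[y, x] > best:
            best, best_q = grad[y, x], np.array([x, y])
    return best_q

print(refine_edge_point(np.array([296.0, 240.0]), np.array([1.0, 0.0])))  # snaps to the true edge near x = 300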
11 of 49

Model-free occlusion

  • Instead of a tracked and registered phantom model
  • Construct a depth map from video using computer vision: stereo, shading, structured light, etc.
  • Consider the performance cost (see the stereo sketch below)

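For the stereo option, OpenCV's block matcher gives a dense disparity (and hence depth) map that can replace the phantom depth buffer; the image pair below is a synthetic stand-in for a rectified stereo pair.

import numpy as np
import cv2

left = np.random.randint(0, 255, (480, 640)).astype(np.uint8)       # stand-in rectified left image
right = np.roll(left, -16, axis=1)                                   # fake right image, ~16 px disparity

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0    # fixed point -> pixels
# depth = focal_length * baseline / disparity; use it like a phantom depth buffer for occlusion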
12 of 49

Lighting of virtual objects

  • More than 1 light source
    • Complex light sources - area, inter-reflection, color
    • Global illumination through indirect light
  • How to get lighting of the real scene?
    • Shiny spheres
    • Omnidirectional camera
    • Fisheye camera

Environment map: an efficient representation of the illumination an object receives from its surroundings

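A sketch of how a photographed shiny (mirrored) sphere turns into an environment map: each probe pixel corresponds to one incoming light direction via the reflection of the viewing ray. The orthographic-view assumption and the image layout are simplifications.

import numpy as np

def probe_directions(size):
    # incoming light direction for each pixel of a mirrored-sphere (light probe) image,
    # assuming an orthographic camera looking along -z and the sphere filling the image
    ys, xs = np.mgrid[0:size, 0:size]
    x = (xs + 0.5) / size * 2 - 1
    y = 1 - (ys + 0.5) / size * 2
    r2 = x ** 2 + y ** 2
    inside = r2 <= 1.0                            # pixels that actually hit the sphere
    z = np.sqrt(np.clip(1.0 - r2, 0.0, None))
    n = np.dstack([x, y, z])                      # sphere normal at the pixel
    v = np.array([0.0, 0.0, -1.0])                # viewing ray direction
    d = v - 2.0 * (n @ v)[..., None] * n          # reflected ray = direction the light arrives from
    return d, inside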
13 of 49

Differential rendering

  • Real lighting on virtual objects
  • Virtual representation of real scene
  • Virtual effects on real scene
  • Calculate difference and apply

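The usual way to realize this is differential rendering: render the modelled local scene twice under the estimated real lighting, once with and once without the virtual objects, and add the difference to the camera image. The placeholder arrays below stand in for those renders.

import numpy as np

camera         = np.zeros((480, 640, 3), np.float32)   # real camera image, values in [0, 1]
render_with    = np.zeros((480, 640, 3), np.float32)   # local scene model + virtual objects
render_without = np.zeros((480, 640, 3), np.float32)   # local scene model only
virtual_mask   = np.zeros((480, 640), bool)            # pixels covered by virtual objects

# real pixels receive only the change caused by the virtual objects (shadows, bounced light);
# pixels of the virtual objects themselves take the full rendered value
out = np.clip(camera + (render_with - render_without), 0.0, 1.0)
out[virtual_mask] = render_with[virtual_mask]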
14 of 49

Differential Path tracing

Real-time path tracing enables realistic global illumination effects, as demonstrated in the comparison of local (left) and global (right) illumination rendering for augmented reality.

15 of 49

Reflections, Refractions...

16 of 49

Diminished reality

For marker removal: texture synthesis from the area around the marker (a simple inpainting stand-in is sketched below)

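A simple stand-in for that texture synthesis is image inpainting; the sketch below removes a hypothetical marker region with OpenCV's inpainting (the frame and marker rectangle are synthetic).

import numpy as np
import cv2

frame = np.full((480, 640, 3), 180, np.uint8)                   # stand-in camera frame
cv2.rectangle(frame, (240, 200), (400, 320), (0, 0, 0), -1)      # 'marker' region to be removed

mask = np.zeros(frame.shape[:2], np.uint8)                       # 255 where content should be synthesized
cv2.rectangle(mask, (240, 200), (400, 320), 255, -1)

clean = cv2.inpaint(frame, mask, 5, cv2.INPAINT_TELEA)           # fill the region from its surroundings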
17 of 49

Fooling in VR

18 of 49

Low quality rendering

How to distract the user so that they do not perceive the low-quality rendering

  • Let the user concentrate on a different part of the image
  • Use other senses
    • Smell (of freshly cut grass)
    • Audio
    • Haptics

19 of 49

Infinite walking

How to achieve infinite walking in limited space?

  • Treadmills
  • Fooling the brain

20 of 49

Infinite walking

Fool the brain

21 of 49

Infinite walking

Overlapping rooms/corridors

Generated on the fly

22 of 49

Infinite walking

  • Dynamic saccadic redirection
  • Change the virtual camera during saccades (illustrative sketch below)

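An illustrative (not authoritative) sketch of the idea: while the eye moves fast enough to be in a saccade, a small extra rotation is injected into the virtual camera. The threshold and per-frame gain below are assumed values, not the published ones.

SACCADE_SPEED_DEG_S = 180.0    # assumed angular-speed threshold for detecting a saccade
GAIN_DEG_PER_FRAME = 0.1       # assumed extra yaw injected per frame during a saccade

def redirect(yaw_offset_deg, eye_speed_deg_s, direction):
    # direction is +1 or -1, towards the rotation the redirection controller requests
    if eye_speed_deg_s > SACCADE_SPEED_DEG_S:
        yaw_offset_deg += direction * GAIN_DEG_PER_FRAME
    return yaw_offset_deg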
23 of 49

Visual tracking

24 of 49

Natural Features

  • Detect interest points in image
    • Invariant to viewpoint changes
    • Requires textured surfaces
  • Create descriptors of interest points
    • Compact
    • Easy to compare
  • Match interest points to tracking model database
  • Individual natural interest points too easily confused
  • Probabilistic search methods used to deal with errors
    • Vocabulary trees
    • Random sampling consensus

25 of 49

Marker tracking

  • Capture an image with a calibrated camera
  • Search for quadrilaterals
  • Estimate the pose from a homography
  • Refine the pose
  • Minimize the nonlinear reprojection error
  • Use the final pose

26 of 49

Marker tracking

  • Threshold image (adaptively)
  • Find edges (black pixel after white) on every n-th line
  • Find quads
  • Determine orientation from the black corner at s_i = (p_i + m)/2

27 of 49

Marker tracking

Grayscale

Threshold image (adaptively)

Determine threshold locally (e.g. for 4x4 neighbourhood) based on the gradient of the logarithm of image intensities

Interpolate linearly over the image

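OpenCV's adaptive threshold is a convenient stand-in for the locally determined threshold described above (it thresholds each pixel against the mean of its neighbourhood rather than using the log-intensity gradient); the input frame here is synthetic.

import numpy as np
import cv2

gray = np.random.randint(0, 255, (480, 640)).astype(np.uint8)     # stand-in grayscale frame

binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 31, 7)           # 31x31 window, offset 7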
28 of 49

Marker tracking

  • Find edges (black pixel after white) on every n-th line
  • Follow the edge in the 4-connected neighborhood until the loop is closed or the image border is hit
  • Start at a, walk the contour, and find p1 at maximum distance from a
  • Compute centroid m
  • Find corners p2, p3 on either side of the line d(p1, m), giving the second diagonal d(p2, p3)
  • Find farthest point p4
  • Determine orientation from the black corner at s_i

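A compact stand-in for this line-scan and contour-walking procedure, using OpenCV contours and polygon approximation to keep only convex four-corner candidates; the binary image is synthetic.

import numpy as np
import cv2

binary = np.full((480, 640), 255, np.uint8)                # synthetic thresholded frame
cv2.rectangle(binary, (200, 150), (400, 350), 0, -1)        # one dark, marker-like square

contours, _ = cv2.findContours(255 - binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
quads = []
for c in contours:
    approx = cv2.approxPolyDP(c, 0.03 * cv2.arcLength(c, True), True)
    if len(approx) == 4 and cv2.isContourConvex(approx):
        quads.append(approx.reshape(4, 2))                  # candidate marker corners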
29 of 49

Marker tracking

  • Marker corners lie in a plane (z = 0)
  • Express the 3D point q as the homogeneous point q' = [q_x, q_y, 1]
  • The mapping is a homography: p = H q'
  • Estimate H using the direct linear transformation (DLT)
  • Recover the pose R, t from H = K [r_1 | r_2 | t], where r_1, r_2 are the first two columns of R (see the sketch below)
  • K is the camera calibration matrix

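A NumPy sketch of this decomposition (H can come e.g. from cv2.findHomography of the four marker corners); the SVD step re-orthonormalizes the rotation because the estimated H is noisy.

import numpy as np

def pose_from_homography(H, K):
    # recover R, t of a calibrated camera from a homography of the marker plane z = 0
    M = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(M[:, 0])                   # scale so the rotation columns have unit length
    r1, r2, t = lam * M[:, 0], lam * M[:, 1], lam * M[:, 2]
    if t[2] < 0:                                          # marker must lie in front of the camera
        r1, r2, t = -r1, -r2, -t
    r3 = np.cross(r1, r2)
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)                           # nearest proper rotation matrix
    return U @ Vt, t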
30 of 49

Camera calibration

  • Pinhole camera
  • Internal parameters
  • External parameters
  • Lens distortion

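A standard OpenCV calibration sketch using a chessboard target; the image file names and board size are assumptions.

import numpy as np
import cv2

pattern = (9, 6)                                           # inner corners of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts, size = [], [], None
for name in ["calib_0.png", "calib_1.png", "calib_2.png"]:  # assumed calibration images
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)
        size = gray.shape[::-1]

# K: internal parameters, dist: lens distortion, rvecs/tvecs: external parameters per view
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)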
31 of 49

Lens distortion

  • Radial, tangential

32 of 49

Lens distortion

  • Radial, tangential

Figure: example distortion patterns, with r = radial distortion, t = tangential distortion, p = thin prism parameters. Panels: A: r; B: t, p; C: t; D: r, t, p.

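For reference, a sketch of the commonly used Brown-Conrady model (radial terms k1..k3 and tangential terms p1, p2; the thin prism terms are omitted here), applied to normalized image coordinates:

import numpy as np

def distort(xn, yn, k1, k2, k3, p1, p2):
    # radial + tangential distortion of normalized coordinates (OpenCV's convention)
    r2 = xn ** 2 + yn ** 2
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = xn * radial + 2 * p1 * xn * yn + p2 * (r2 + 2 * xn ** 2)
    yd = yn * radial + p1 * (r2 + 2 * yn ** 2) + 2 * p2 * xn * yn
    return xd, yd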
33 of 49

Multiple-Camera Infrared Tracking

Blob detection in all images (centroid of connected regions)

Establish point correspondences between blobs using epipolar geometry

Triangulation of 3D points from multiple 2D points

Matching of 3D candidate points to target points

Compute target pose (absolute orientation)

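The triangulation step can be sketched with OpenCV; the projection matrices and the matched blob coordinates below are placeholder values for two calibrated cameras.

import numpy as np
import cv2

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], np.float64)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                    # first camera at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])    # second camera 0.2 m to the right

pts1 = np.array([[320.0], [240.0]])      # matched blob centroid in camera 1 (2 x N)
pts2 = np.array([[300.0], [240.0]])      # corresponding centroid in camera 2

X = cv2.triangulatePoints(P1, P2, pts1, pts2)    # 4 x N homogeneous 3D points
X = (X[:3] / X[3]).T                             # Euclidean 3D candidate points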
34 of 49

Sparse vs. Dense tracking

  • Sparse tracking
    • Correspondence for a small but sufficient number of salient points
    • Easier to produce
    • Compact, can be efficiently stored and processed
    • Points are handled independently (partial occlusion or an illumination change does not disturb the tracker)
    • Resilience against background clutter

  • Dense tracking
    • Correspondence for every pixel in image
    • Stable even for untextured or repetitive structures and reflective materials
    • Accommodates image noise better
    • Higher computation cost

35 of 49

Tracking by detection

Targets are detected every frame

Detection and pose estimation are solved simultaneously

  • Interest point detection
  • Descriptor creation
  • Descriptor matching
  • Perspective-n-Point camera pose determination
  • Robust pose estimation

36 of 49

Tracking by detection

  • Interest point detection
    • FAST, Harris, SIFT, SURF
  • Descriptor creation
    • SIFT, SURF, BRIEF, ORB, BRISK

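A minimal OpenCV sketch of the first two steps with one of the combinations listed above (FAST-based ORB keypoints with binary ORB descriptors); the frame is a synthetic stand-in.

import numpy as np
import cv2

frame = np.random.randint(0, 255, (480, 640)).astype(np.uint8)    # stand-in grayscale frame

orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(frame, None)        # interest points + descriptors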
37 of 49

Tracking by detection

  • Descriptor matching
    • Feature database stored as k-d tree for sub-linear search time
  • Perspective-n-Point camera pose determination
    • P3P computes the distance d_i from the camera center c to each 3D point q_i
    • Minimum 3 2D-3D correspondences
    • 6 DOF pose of calibrated camera
  • Robust pose estimation
    • E.g. RANSAC

Read more in https://www.robots.ox.ac.uk/~vgg/hzbook/

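A hedged sketch of the matching and pose steps: brute-force Hamming matching stands in for the k-d tree search, and RANSAC-based PnP rejects wrong matches. The arguments (frame keypoints and descriptors, the model's 3D points and descriptors, K and dist) are assumed to come from the previous steps.

import numpy as np
import cv2

def pose_from_matches(kps_img, desc_img, pts3d_db, desc_db, K, dist):
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(desc_img, desc_db)                     # descriptor matching
    obj = np.float32([pts3d_db[m.trainIdx] for m in matches])      # matched 3D model points
    img = np.float32([kps_img[m.queryIdx].pt for m in matches])    # matched 2D image points
    # PnP inside RANSAC: robust 6 DOF pose of the calibrated camera
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, dist, reprojectionError=4.0)
    return ok, rvec, tvec, inliers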
38 of 49

Vuforia features


39 of 49

Detection and (incremental) tracking

Tracking and detection are complementary approaches.

After successful detection, the target is tracked incrementally.

If the target is lost, detection is activated again.

40 of 49

Incremental tracking

  • Local search: match interest points in a small window around their prior positions
  • Direct matching: compare the image patch around the tracked point with patches in the following image; no descriptor creation is needed

Incremental tracking relies on good prior information on camera pose.

Active search: the initial camera estimate is extrapolated from the last known pose using a motion model (e.g. a first-order, constant-velocity model; see the sketch below).

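A minimal sketch of such a first-order motion model for the camera position (orientation would be extrapolated analogously, e.g. with quaternions); it only supplies the prior for the local search.

import numpy as np

def predict_position(prev_t, last_t):
    # constant-velocity extrapolation of the last two known camera positions
    return last_t + (last_t - prev_t)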
41 of 49

Patch tracking

Target detection

Find features in reference image

Target tracking

  • Take the previous pose and apply the motion model to get an estimate of where to look
  • Create affinely warped patches of the reference features so they closely resemble how the features should appear in the camera image
  • Project the patches into the camera image and match them (normalized cross correlation)

42 of 49

Normalized cross correlation

Find the patch in the image

Which similarity measure to use?

  • Correlation
  • Zero-mean correlation
  • Sum of squared differences
  • Normalized correlation

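In OpenCV terms, zero-mean normalized cross correlation corresponds to the TM_CCOEFF_NORMED score, which is insensitive to brightness and contrast changes between the patch and the image; the frame and patch below are synthetic stand-ins.

import numpy as np
import cv2

image = np.random.randint(0, 255, (480, 640)).astype(np.uint8)    # stand-in camera frame
patch = image[100:120, 200:220].copy()                             # stand-in warped reference patch

score = cv2.matchTemplate(image, patch, cv2.TM_CCOEFF_NORMED)
_, best, _, loc = cv2.minMaxLoc(score)                             # best NCC score and its (x, y) position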
43 of 49

Motion model

Active search in 2D vs 3D

44 of 49

Patch tracking

45 of 49

Hierarchical matching

46 of 49

Patch tracking

Robust to strong affine transformations (tilt close to 90°)

NCC tolerates severe lighting changes

47 of 49

Tracking approaches

Model first:

Marker tracking

Natural features tracking

Build a model while tracking:

SLAM (Simultaneous localization and mapping)

PTAM (Parallel Tracking and Mapping)

KinectFusion

48 of 49

PTAM

Keyframe SLAM

Keyframes are added when the baseline is sufficiently large

Check all FAST corners

49 of 49

References

Bimber, O., & Raskar, R. (2005). Spatial augmented reality: merging real and virtual worlds. AK Peters/CRC Press.

Hainich, R. R., & Bimber, O. (2016). Displays: fundamentals & applications. AK Peters/CRC Press.

Schmalstieg, D., & Hollerer, T. (2016). Augmented reality: principles and practice. Addison-Wesley Professional.