1 of 49

Visual Coherence

Virtual and Augmented reality

Zuzana Berger Haladová

2 of 49

VR rendering

Computer graphics completely models the scene

Geometry accessible for calculations

Occlusions

Lighting

Shadows

Camera parameters

3 of 49

AR rendering

Less information available

Final image through compositing

Digital compositing in video see-through

Physical compositing in optical see-through and projection

Approximate the real world to simulate interaction:

  • geometry, appearance
  • post-processing

4 of 49

Depth cues

5 of 49

Depth cues

Relative size

Relative height

Perspective

Surface detail

Atmospheric attenuation

Occlusion

Shading

Shadows

6 of 49

Occlusion

  • Virtual in front of real: Draw augmentation on top of video background
  • Virtual behind real: Need strategy to distinguish visible from occluded augmentations

7 of 49

Phantom rendering

Render registered virtual representations (Phantoms) of real objects

Occlusions handled by graphics hardware

  • Draw the video background
  • Disable writing to the color buffer
  • (glColorMask, or glBlendFunc(GL_ZERO, GL_ONE))
  • Render phantoms of the real scene ➔ sets the depth buffer
  • Enable writing to the color buffer
  • Draw the virtual objects (the depth test rejects fragments occluded by phantoms; see the sketch below)

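A minimal NumPy sketch (not the GL pipeline itself) of what the depth test achieves: the phantom pass fills only the depth buffer, so virtual pixels that lie behind real geometry are discarded during compositing. All images and depth values below are illustrative placeholders.

import numpy as np

H, W = 480, 640
video = np.zeros((H, W, 3), np.uint8)           # camera image used as background
phantom_depth = np.full((H, W), np.inf)         # depth written by the phantom pass
virtual_rgb = np.zeros((H, W, 3), np.uint8)     # rendered virtual objects
virtual_depth = np.full((H, W), np.inf)

# illustrative content: a real object covers the left half, a virtual object overlaps it
phantom_depth[:, :W // 2] = 2.0
virtual_rgb[200:300, 250:400] = (0, 255, 0)
virtual_depth[200:300, 250:400] = 3.0           # the virtual object is farther away

# depth test: a virtual pixel survives only where it is closer than the phantom
visible = virtual_depth < phantom_depth
composite = video.copy()
composite[visible] = virtual_rgb[visible]       # the occluded part of the virtual object is dropped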
8 of 49

Phantom rendering

Requires accurate:

  • Model
  • Tracking data
  • Registration

9 of 49

Edge occlusion

A possible approach for occlusion refinement can be performed purely on the GPU. First, edges are detected in the video image and matched with edges of the virtual models. The corrected edges are then superimposed with alpha blending on top of the polygon from which they were derived.

10 of 49

Edge occlusion

Search near the projected edge of a phantom object for the true edge of the corresponding real object to obtain accurate occlusion boundaries (see the sketch below)

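A hedged sketch of such a refinement step: sample points on the projected phantom edge are moved to the strongest image gradient found along the edge normal. The synthetic frame, the sample point and the normal are assumptions for illustration.

import numpy as np
import cv2

# synthetic video frame with a vertical intensity edge at x = 300
gray = np.zeros((480, 640), np.float32)
gray[:, 300:] = 255.0
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
grad = np.hypot(gx, gy)                         # image gradient magnitude

def refine_edge_point(p, n, search=8):
    # move a projected phantom edge point p to the strongest gradient within +-search px along the normal n
    best, best_q = -1.0, np.round(p).astype(int)
    for s in range(-search, search + 1):
        x, y = np.round(p + s * n).astype(int)
        if 0 <= y < grad.shape[0] and 0 <= x < grad.shape[1] and grad[y, x] > best:
            best, best_q = grad[y, x], np.array([x, y])
    return best_q

print(refine_edge_point(np.array([296.0, 240.0]), np.array([1.0, 0.0])))  # snaps to the true edge near x = 300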
11 of 49

Model-free occlusion

  • Instead of a tracked and registered phantom model
  • Construct a depth map from video using computer vision: stereo, shading, structured light, etc.
  • Consider the performance cost (see the stereo sketch below)

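For the stereo option, OpenCV's block matcher gives a dense disparity (and hence depth) map that can replace the phantom depth buffer; the image pair below is a synthetic stand-in for a rectified stereo pair.

import numpy as np
import cv2

left = np.random.randint(0, 255, (480, 640)).astype(np.uint8)       # stand-in rectified left image
right = np.roll(left, -16, axis=1)                                   # fake right image, ~16 px disparity

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0    # fixed point -> pixels
# depth = focal_length * baseline / disparity; use it like a phantom depth buffer for occlusion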
12 of 49

Lighting of virtual objects

  • More than 1 light source
    • Complex light sources - area, inter-reflection, color
    • Global illumination through indirect light
  • How to get lighting of the real scene?
    • Shiny spheres
    • Omnidirectional camera
    • Fisheye camera

Environment map: an efficient representation of the illumination an object receives from its surroundings

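A sketch of how a photographed shiny (mirrored) sphere turns into an environment map: each probe pixel corresponds to one incoming light direction via the reflection of the viewing ray. The orthographic-view assumption and the image layout are simplifications.

import numpy as np

def probe_directions(size):
    # incoming light direction for each pixel of a mirrored-sphere (light probe) image,
    # assuming an orthographic camera looking along -z and the sphere filling the image
    ys, xs = np.mgrid[0:size, 0:size]
    x = (xs + 0.5) / size * 2 - 1
    y = 1 - (ys + 0.5) / size * 2
    r2 = x ** 2 + y ** 2
    inside = r2 <= 1.0                            # pixels that actually hit the sphere
    z = np.sqrt(np.clip(1.0 - r2, 0.0, None))
    n = np.dstack([x, y, z])                      # sphere normal at the pixel
    v = np.array([0.0, 0.0, -1.0])                # viewing ray direction
    d = v - 2.0 * (n @ v)[..., None] * n          # reflected ray = direction the light arrives from
    return d, inside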
13 of 49

Differential rendering

  • Real lighting on virtual objects
  • Virtual representation of real scene
  • Virtual effects on real scene
  • Calculate difference and apply

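The usual way to realize this is differential rendering: render the modelled local scene twice under the estimated real lighting, once with and once without the virtual objects, and add the difference to the camera image. The placeholder arrays below stand in for those renders.

import numpy as np

camera         = np.zeros((480, 640, 3), np.float32)   # real camera image, values in [0, 1]
render_with    = np.zeros((480, 640, 3), np.float32)   # local scene model + virtual objects
render_without = np.zeros((480, 640, 3), np.float32)   # local scene model only
virtual_mask   = np.zeros((480, 640), bool)            # pixels covered by virtual objects

# real pixels receive only the change caused by the virtual objects (shadows, bounced light);
# pixels of the virtual objects themselves take the full rendered value
out = np.clip(camera + (render_with - render_without), 0.0, 1.0)
out[virtual_mask] = render_with[virtual_mask]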
14 of 49

Differential Path tracing

Real-time path tracing enables realistic global illumination effects, as demonstrated in the comparison of local (left) and global (right) illumination rendering for augmented reality.

15 of 49

Reflections, Refractions...

16 of 49

Diminished reality

For marker removal: texture synthesis from the area around the marker (a simple inpainting stand-in is sketched below)

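A simple stand-in for that texture synthesis is image inpainting; the sketch below removes a hypothetical marker region with OpenCV's inpainting (the frame and marker rectangle are synthetic).

import numpy as np
import cv2

frame = np.full((480, 640, 3), 180, np.uint8)                   # stand-in camera frame
cv2.rectangle(frame, (240, 200), (400, 320), (0, 0, 0), -1)      # 'marker' region to be removed

mask = np.zeros(frame.shape[:2], np.uint8)                       # 255 where content should be synthesized
cv2.rectangle(mask, (240, 200), (400, 320), 255, -1)

clean = cv2.inpaint(frame, mask, 5, cv2.INPAINT_TELEA)           # fill the region from its surroundings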
17 of 49

Fooling in VR

18 of 49

Low quality rendering

How to distract the user so that they do not perceive the low-quality rendering

  • Let the user concentrate on a different part of the image
  • Use other senses
    • Smell (of freshly cut grass)
    • Audio
    • Haptics

19 of 49

Infinite walking

How to achieve infinite walking in limited space?

  • Treadmills
  • Fooling the brain

20 of 49

Infinite walking

Fool the brain

21 of 49

Infinite walking

Overlapping rooms/corridors

Generated on the fly

22 of 49

Infinite walking

  • Dynamic saccadic redirection
  • Change the virtual camera during saccades (illustrative sketch below)

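An illustrative (not authoritative) sketch of the idea: while the eye moves fast enough to be in a saccade, a small extra rotation is injected into the virtual camera. The threshold and per-frame gain below are assumed values, not the published ones.

SACCADE_SPEED_DEG_S = 180.0    # assumed angular-speed threshold for detecting a saccade
GAIN_DEG_PER_FRAME = 0.1       # assumed extra yaw injected per frame during a saccade

def redirect(yaw_offset_deg, eye_speed_deg_s, direction):
    # direction is +1 or -1, towards the rotation the redirection controller requests
    if eye_speed_deg_s > SACCADE_SPEED_DEG_S:
        yaw_offset_deg += direction * GAIN_DEG_PER_FRAME
    return yaw_offset_deg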
23 of 49

Visual tracking

24 of 49

Natural Features

  • Detect interest points in image
    • Invariant to viewpoint changes
    • Requires textured surfaces
  • Create descriptors of interest points
    • Compact
    • Easy to compare
  • Match interest points to tracking model database
  • Individual natural interest points too easily confused
  • Probabilistic search methods used to deal with errors
    • Vocabulary trees
    • Random sampling consensus

25 of 49

Marker tracking

  • Capture an image with a calibrated camera
  • Search for quadrilaterals
  • Estimate the pose from a homography
  • Refine the pose
  • Minimize the nonlinear reprojection error
  • Use the final pose

26 of 49

Marker tracking

  • Threshold image (adaptively)
  • Find edges (black pixel after white) on every n-th line
  • Find quads
  • Determine orientation from the black corner at s_i = (p_i + m)/2

27 of 49

Marker tracking

Grayscale

Threshold image (adaptively)

Determine threshold locally (e.g. for 4x4 neighbourhood) based on the gradient of the logarithm of image intensities

Interpolate linearly over the image

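OpenCV's adaptive threshold is a convenient stand-in for the locally determined threshold described above (it thresholds each pixel against the mean of its neighbourhood rather than using the log-intensity gradient); the input frame here is synthetic.

import numpy as np
import cv2

gray = np.random.randint(0, 255, (480, 640)).astype(np.uint8)     # stand-in grayscale frame

binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 31, 7)           # 31x31 window, offset 7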
28 of 49

Marker tracking

  • Find edges (black pixel after white) on every n-th line
  • Follow the edge in the 4-connected neighborhood until the loop is closed or the image border is hit
  • Start at a, walk the contour, and find p1 at maximum distance from a
  • Compute centroid m
  • Find corners p2, p3 on either side of the line d(p1, m), giving the second diagonal d(p2, p3)
  • Find farthest point p4
  • Determine orientation from the black corner at s_i

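A compact stand-in for this line-scan and contour-walking procedure, using OpenCV contours and polygon approximation to keep only convex four-corner candidates; the binary image is synthetic.

import numpy as np
import cv2

binary = np.full((480, 640), 255, np.uint8)                # synthetic thresholded frame
cv2.rectangle(binary, (200, 150), (400, 350), 0, -1)        # one dark, marker-like square

contours, _ = cv2.findContours(255 - binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
quads = []
for c in contours:
    approx = cv2.approxPolyDP(c, 0.03 * cv2.arcLength(c, True), True)
    if len(approx) == 4 and cv2.isContourConvex(approx):
        quads.append(approx.reshape(4, 2))                  # candidate marker corners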
29 of 49

Marker tracking

  • Marker corners lie in a plane (z = 0)
  • Express the 3D point q as the homogeneous point q' = [q_x, q_y, 1]
  • The mapping is a homography: p = H q'
  • Estimate H using the direct linear transformation (DLT)
  • Recover the pose R, t from H = K [r_1 | r_2 | t], where r_1, r_2 are the first two columns of R (see the sketch below)
  • K is the camera calibration matrix

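A NumPy sketch of this decomposition (H can come e.g. from cv2.findHomography of the four marker corners); the SVD step re-orthonormalizes the rotation because the estimated H is noisy.

import numpy as np

def pose_from_homography(H, K):
    # recover R, t of a calibrated camera from a homography of the marker plane z = 0
    M = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(M[:, 0])                   # scale so the rotation columns have unit length
    r1, r2, t = lam * M[:, 0], lam * M[:, 1], lam * M[:, 2]
    if t[2] < 0:                                          # marker must lie in front of the camera
        r1, r2, t = -r1, -r2, -t
    r3 = np.cross(r1, r2)
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)                           # nearest proper rotation matrix
    return U @ Vt, t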
30 of 49

Camera calibration

  • Pinhole camera
  • Internal parameters
  • External parameters
  • Lens distortion

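A standard OpenCV calibration sketch using a chessboard target; the image file names and board size are assumptions.

import numpy as np
import cv2

pattern = (9, 6)                                           # inner corners of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts, size = [], [], None
for name in ["calib_0.png", "calib_1.png", "calib_2.png"]:  # assumed calibration images
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)
        size = gray.shape[::-1]

# K: internal parameters, dist: lens distortion, rvecs/tvecs: external parameters per view
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)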
31 of 49

Lens distortion

  • Radial, tangential

32 of 49

Lens distortion

  • Radial, tangential

Figure: example distortion patterns, with r = radial distortion, t = tangential distortion, p = thin prism parameters. Panels: A: r; B: t, p; C: t; D: r, t, p.

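For reference, a sketch of the commonly used Brown-Conrady model (radial terms k1..k3 and tangential terms p1, p2; the thin prism terms are omitted here), applied to normalized image coordinates:

import numpy as np

def distort(xn, yn, k1, k2, k3, p1, p2):
    # radial + tangential distortion of normalized coordinates (OpenCV's convention)
    r2 = xn ** 2 + yn ** 2
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = xn * radial + 2 * p1 * xn * yn + p2 * (r2 + 2 * xn ** 2)
    yd = yn * radial + p1 * (r2 + 2 * yn ** 2) + 2 * p2 * xn * yn
    return xd, yd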
33 of 49

Multiple-Camera Infrared Tracking

Blob detection in all images (centroid of connected regions)

Establish point correspondences between blobs using epipolar geometry

Triangulation of 3D points from multiple 2D points

Matching of 3D candidate points to target points

Compute target pose (absolute orientation)

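The triangulation step can be sketched with OpenCV; the projection matrices and the matched blob coordinates below are placeholder values for two calibrated cameras.

import numpy as np
import cv2

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], np.float64)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                    # first camera at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])    # second camera 0.2 m to the right

pts1 = np.array([[320.0], [240.0]])      # matched blob centroid in camera 1 (2 x N)
pts2 = np.array([[300.0], [240.0]])      # corresponding centroid in camera 2

X = cv2.triangulatePoints(P1, P2, pts1, pts2)    # 4 x N homogeneous 3D points
X = (X[:3] / X[3]).T                             # Euclidean 3D candidate points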
34 of 49

Sparse vs. Dense tracking

  • Sparse tracking
    • Correspondence for a small but sufficient number of salient points
    • Easier to produce
    • Compact, can be efficiently stored and processed
    • Points are handled independently (partial occlusion or an illumination change does not disturb the tracker)
    • Resilience against background clutter

  • Dense tracking
    • Correspondence for every pixel in image
    • Stable even for untextured or repetitive structures and reflective materials
    • Accommodates image noise better
    • Higher computation cost

35 of 49

Tracking by detection

Targets are detected every frame

Detection and pose estimation are solved simultaneously

  • Interest point detection
  • Descriptor creation
  • Descriptor matching
  • Perspective-n-Point camera pose determination
  • Robust pose estimation

36 of 49

Tracking by detection

  • Interest point detection
    • FAST, Harris, SIFT, SURF
  • Descriptor creation
    • SIFT, SURF, BRIEF, ORB, BRISK

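A minimal OpenCV sketch of the first two steps with one of the combinations listed above (FAST-based ORB keypoints with binary ORB descriptors); the frame is a synthetic stand-in.

import numpy as np
import cv2

frame = np.random.randint(0, 255, (480, 640)).astype(np.uint8)    # stand-in grayscale frame

orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(frame, None)        # interest points + descriptors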
37 of 49

Tracking by detection

  • Descriptor matching
    • Feature database stored as k-d tree for sub-linear search time
  • Perspective-n-Point camera pose determination
    • P3P computes the distance d_i from the camera center c to each 3D point q_i
    • Minimum 3 2D-3D correspondences
    • 6 DOF pose of calibrated camera
  • Robust pose estimation
    • E.g. RANSAC

Read more in https://www.robots.ox.ac.uk/~vgg/hzbook/

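A hedged sketch of the matching and pose steps: brute-force Hamming matching stands in for the k-d tree search, and RANSAC-based PnP rejects wrong matches. The arguments (frame keypoints and descriptors, the model's 3D points and descriptors, K and dist) are assumed to come from the previous steps.

import numpy as np
import cv2

def pose_from_matches(kps_img, desc_img, pts3d_db, desc_db, K, dist):
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(desc_img, desc_db)                     # descriptor matching
    obj = np.float32([pts3d_db[m.trainIdx] for m in matches])      # matched 3D model points
    img = np.float32([kps_img[m.queryIdx].pt for m in matches])    # matched 2D image points
    # PnP inside RANSAC: robust 6 DOF pose of the calibrated camera
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, dist, reprojectionError=4.0)
    return ok, rvec, tvec, inliers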
38 of 49

Vuforia features


39 of 49

Detection and (incremental) tracking

Tracking and detection are complementary approaches.

After successful detection, the target is tracked incrementally.

If the target is lost, detection is activated again.

40 of 49

Incremental tracking

  • Local search: match interest points in a small window around their prior positions
  • Direct matching: compare the image patch around the tracked point with patches in the following image; no descriptor creation is needed

Incremental tracking relies on good prior information on camera pose.

Active search: the initial camera estimate is extrapolated from the last known pose using a motion model (e.g. a first-order, constant-velocity model; see the sketch below).

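A minimal sketch of such a first-order motion model for the camera position (orientation would be extrapolated analogously, e.g. with quaternions); it only supplies the prior for the local search.

import numpy as np

def predict_position(prev_t, last_t):
    # constant-velocity extrapolation of the last two known camera positions
    return last_t + (last_t - prev_t)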
41 of 49

Patch tracking

Target detection

Find features in reference image

Target tracking

  • Take the previous pose and apply the motion model to get an estimate of where to look
  • Create affinely warped patches of the reference features so they closely resemble how the features should appear in the camera image
  • Project the patches into the camera image and match them (normalized cross correlation)

42 of 49

Normalized cross correlation

Find the patch in the image

Which similarity measure to use?

  • Correlation
  • Zero-mean correlation
  • Sum of squared differences
  • Normalized correlation

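In OpenCV terms, zero-mean normalized cross correlation corresponds to the TM_CCOEFF_NORMED score, which is insensitive to brightness and contrast changes between the patch and the image; the frame and patch below are synthetic stand-ins.

import numpy as np
import cv2

image = np.random.randint(0, 255, (480, 640)).astype(np.uint8)    # stand-in camera frame
patch = image[100:120, 200:220].copy()                             # stand-in warped reference patch

score = cv2.matchTemplate(image, patch, cv2.TM_CCOEFF_NORMED)
_, best, _, loc = cv2.minMaxLoc(score)                             # best NCC score and its (x, y) position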
43 of 49

Motion model

Active search in 2D vs 3D

44 of 49

Patch tracking

45 of 49

Hierarchical matching

46 of 49

Patch tracking

Robust to strong affine transformations (tilt close to 90°)

NCC tolerates severe lighting changes

47 of 49

Tracking approaches

Model first:

Marker tracking

Natural features tracking

Build a model while tracking:

SLAM (Simultaneous localization and mapping)

PTAM (Parallel Tracking and Mapping)

KinectFusion

48 of 49

PTAM

Keyframe SLAM

Keyframes are added when the baseline is sufficiently large

Check all FAST corners

49 of 49

References

Bimber, O., & Raskar, R. (2005). Spatial augmented reality: merging real and virtual worlds. AK Peters/CRC Press.

Hainich, R. R., & Bimber, O. (2016). Displays: fundamentals & applications. AK Peters/CRC Press.

Schmalstieg, D., & Hollerer, T. (2016). Augmented reality: principles and practice. Addison-Wesley Professional.