Interactive Graphics for a
Universally Accessible Metaverse
Ruofei Du
Senior Research Scientist
Google Labs, San Francisco
Twitter: @DuRuofei
me@duruofei.com
Self Intro
Ruofei Du (杜若飞)
www.duruofei.com
Human-Computer Interaction
Geollery
CHI '19, Web3D '19, VR '19
Social Street View
Web3D '16 Best Paper Award
VideoFields
Web3D '16
SketchyScene
TOG (SIGGRAPH Asia) '19, ECCV '18
Montage4D
I3D '18
JCGT '19
DepthLab UIST '20
13K Installs
Kernel Foveated Rendering
I3D '18, VR '20, TVCG '20
CollaboVR ISMAR '20
LogRectilinear
IEEE VR '21 (TVCG)
TVCG Honorable Mention
GazeChat
UIST '21
Computer Graphics
MDIF
ICCV '21
HumanGPS
CVPR '21
HandSight
ECCVW '14
TACCESS '15
Ad hoc UI
CHI EA '22
ProtoSound
CHI '22
PRIF
ECCV '22
Computer Vision
Rapsai
CHI '23
Visual Captions
ThingShare
CHI '23
Self Intro
Ruofei Du (杜若飞)
Interaction and Communication
Geollery
CHI '19, Web3D '19, VR '19
Social Street View Web3D '16
Best Paper Award
VideoFields
Web3D '16
SketchyScene
TOG (SIGGRAPH Asia) '19, ECCV '18
Montage4D
I3D '18
JCGT '19
DepthLab UIST '20
13K Installs
Kernel Foveated Rendering
I3D '18, VR '20, TVCG '20
CollaboVR ISMAR '20
LogRectilinear
IEEE VR '21 (TVCG)
TVCG Honorable Mention
GazeChat
UIST '21
Digital World
Digital Human
HumanGPS
CVPR '21
HandSight
ECCVW '14
TACCESS '15
ProtoSound
CHI '22
Ad hoc UI
CHI EA '22
OmniSyn
IEEE VR '22
SlurpAR DIS '22
Visual Captions
ThingShare
CHI '23
Rapsai
CHI '23
Interactive Graphics for a
Universally Accessible Metaverse
Metaverse
How is the Metaverse defined by
academia and industry?
Neal Stephenson, 1992.
Metaverse
Metaverse
Future of Internet?
Internet of Things?
Virtual Reality?
Augmented Reality?
Decentralization?
Blockchain + NFT?
Mirrored World?
Digital Twin?
VR OS?
Web 3.0?
The Future of Internet
Internet of Things
Virtual Reality
Augmented Reality
Decentralization
Blockchain
NFT
Mirrored World
Metaverse
Digital Twin
VR OS
Web 3.0
Extended Reality (XR)
Accessibility
Avatars
Co-presence
Economics
Gaming
Wearable
AI
Privacy
Security
Vision
Neural
How do I define Metaverse?
More importantly, what research
directions shall we devote to Metaverse?
Metaverse
The Metaverse envisions a persistent digital world where people are fully connected through virtual representations.
As a teenager, my dream was to live in a metaverse...
However, today I wish for the Metaverse to be merely a tool that makes information more useful and accessible, and helps people live better physical lives.
Interactive Graphics for a Universally Accessible Metaverse
Chapter One · Mirrored World & Real-time Rendering
Chapter Two · Computational Interaction: Algorithm & Systems
Chapter Three · Digital Human & Augmented Communication
Interactive Graphics for a Universally Accessible Metaverse
Chapter One · Mirrored World & Real-time Rendering
Geollery
CHI '19, Web3D '19, VRW '19
Social Street View Web3D '16
Best Paper Award
Kernel Foveated Rendering
I3D '18, VR '20, TVCG '20
LogRectilinear, OmniSyn
IEEE VR '21 (TVCG), VRW '22
TVCG Honorable Mention
How about dating back a little?
Project Geollery.com & Social Street View: Reconstructing a Live Mirrored World With Geotagged Social Media
Ruofei Du†, David Li†, and Amitabh Varshney
{ruofei, dli7319, varshney}@umiacs.umd.edu | www.Geollery.com | ACM CHI 2019 & Web3D 2016 (Best Paper Award) & Web3D 2019
UMIACS
THE AUGMENTARIUM
VIRTUAL AND AUGMENTED REALITY LAB
AT THE UNIVERSITY OF MARYLAND
COMPUTER SCIENCE
UNIVERSITY OF MARYLAND, COLLEGE PARK
Introduction
Social Media
26
image courtesy: plannedparenthood.org
Introduction
Social Media + Topics
27
image courtesy: huffingtonpost.com
Motivation
Social Media + XR
28
Motivation
Social Media + XR
29
image courtesy:
instagram.com,
facebook.com,
twitter.com
Motivation
2D layout
30
image courtesy:
pinterest.com
Motivation
Immersive Mixed Reality?
31
image courtesy:
viralized.com
Motivation
Pros and cons of the classic
32
Motivation
Pros and cons of the classic
33
Related Work
Social Street View, Du and Varshney
Web3D 2016 Best Paper Award
34
Technical Challenges?
Related Work
Social Street View, Du and Varshney
Web3D 2016 Best Paper Award
36
Related Work
Social Street View, Du and Varshney
Web3D 2016 Best Paper Award
37
Related Work
3D Visual Popularity
Bulbul and Dahyot, 2017
38
Related Work
Virtual Oulu, Kukka et al.
CSCW 2017
39
Related Work
Immersive Trip Reports
Brejcha et al. UIST 2018
40
Related Work
High Fidelity, Inc.
41
Related Work
Facebook Spaces, 2017
42
What's Next?
Research Question 1/3
43
What may a social media platform look like in mixed reality?
What's Next?
Research Question 2/3
44
What if we could allow social media sharing in a live mirrored world?
What's Next?
Research Question 3/3
45
Which use cases can benefit from a social media platform in XR?
Geollery.com
A Mixed-Reality Social Media Platform
46
Geollery.com
A Mixed-Reality Social Media Platform
47
48
1
Conception, architecting & implementation
Geollery
A mixed reality system that can depict geotagged social media and online avatars with 3D textured buildings.
49
2
Extending the design space of
3D Social Media Platform
Progressive streaming, aggregation approaches, virtual representation of social media, co-presence with virtual avatars, and collaboration modes.
50
3
Conducting a user study of
Geollery vs. Social Street View
by discussing their benefits, limitations, and potential impacts on future 3D social media platforms.
System Overview
Geollery Workflow
51
System Overview
Geollery Workflow
52
Geollery.com
v2: a major leap
53
System Overview
Geollery Workflow
54
System Overview
2D Map Data
55
System Overview
2D Map Data
56
System Overview
+Avatar +Trees +Clouds
57
System Overview
+Avatar +Trees +Clouds +Night
58
System Overview
Street View Panoramas
59
System Overview
Street View Panoramas
60
System Overview
Street View Panoramas
61
System Overview
Geollery Workflow
62
All data we used is publicly and widely available on the Internet.
Rendering Pipeline
Close-view Rendering
63
Rendering Pipeline
Initial spherical geometries
64
Rendering Pipeline
Depth correction
65
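To make the depth-correction step concrete, here is a minimal sketch (a hypothetical helper, not Geollery's actual WebGL code) of pushing a unit sphere's vertices out to the depths sampled from a street-view depth map:

```python
import numpy as np

def depth_correct_sphere(directions: np.ndarray, depths: np.ndarray) -> np.ndarray:
    """Displace unit-sphere vertices to the observed scene surface.

    directions: (N, 3) unit vectors from the panorama center to each vertex.
    depths: (N,) per-vertex distances sampled from the street-view depth map.
    Returns (N, 3) corrected vertex positions forming the local proxy geometry.
    """
    return directions * depths[:, None]
```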
Rendering Pipeline
Intersection removal
66
Rendering Pipeline
Texturing individual geometry
67
Rendering Pipeline
Texturing with alpha blending
68
Rendering Pipeline
Rendering result in the fine detail
69
Rendering Pipeline
Rendering result in the fine detail
70
Rendering Pipeline
Rendering result in the fine detail
71
User Study
Social Street View vs. Geollery
72
User Study
Quantitative Evaluation
73
User Study
Quantitative Evaluation
74
75
I would like to use it for the food in different restaurants. I am always hesitating of different restaurants. It will be very easy to see all restaurants with street views. In Yelp, I can only see one restaurant at a time.
P6 / F
76
[I will use it for] exploring new places. If I am going on vacation somewhere, I could immerse myself into the location. If there are avatars around that area, I could ask questions.
P1 / M
77
I think it (Geollery) will be useful for families. I just taught my grandpa how to use Facetime last week and it would great if I could teleport to their house and meet with them, then we could chat and share photos with our avatars.
P2 / F
78
if there is a way to unify the interaction between them, there will be more realistic buildings [and] you could have more roof structures. Terrains will be interesting to add on.
P18 / M
Rendering Pipeline
Experimental Features
79
Landing Impact
Demos at ACM CHI 2019
80
Landing Impact
Demos at ACM CHI 2019
81
Landing Impact
Demos at ACM CHI 2019
82
Instant Panoramic Texture Mapping with Semantic Object Matching for Large-Scale Urban Scene Reproduction
TVCG 2021, Jinwoo Park, Ik-beom Jeon, Student Members, Sung-eui Yoon, and Woontack Woo
A more applicable method for constructing walk-through experiences in urban streets was employed by Geollery [16], which adopted an efficient transformation of a dense spherical mesh to construct a local proxy geometry based on the depth maps from Google Street View
Freeman et al. ACM PHCI 2022
He et al. ISMAR 2020
Park et al. Virtual Reality 2022
Yeom et al. IEEE VR 2021
What's Next?
Video Fields: Fusing Multiple Surveillance Videos into a Dynamic Virtual Environment
Ruofei Du, Sujal Bista, Amitabh Varshney
The Augmentarium | UMIACS | University of Maryland, College Park
{ruofei, varshney} @ cs.umd.edu
www.Video-Fields.com
image courtesy: university of maryland, college park
Introduction
Surveillance Videos
Architecture
Video Fields Flowchart
OmniSyn: Intermediate View Synthesis Between Wide-Baseline Panoramas
David Li, Yinda Zhang, Christian Häne, Danhang Tang, Amitabh Varshney, and Ruofei Du, VR 2022
input 1
input 2
input 3
How can we further accelerate the real-time rendering procedure?
96
Kernel Foveated Rendering
Xiaoxu Meng, Ruofei Du, Matthias Zwicker and Amitabh Varshney
Augmentarium | UMIACS
University of Maryland, College Park
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games 2018
97
Original Frame
Buffer
Screen
Sample Map
Introduction
Related Work
Our Approach
User Study
Experiments
Conclusion
98
Kernel Log-polar Mapping
99
Can we further accelerate it?
Eye-Dominance-Guided Foveated Rendering
Xiaoxu Meng, Ruofei Du, and Amitabh Varshney
IEEE Transactions on Visualization and Computer Graphics (TVCG)
[Figure: per-eye foveal regions]
more foveation for the non-dominant eye
What if we apply it to 3D volumetric data formats?
3D-Kernel Foveated Rendering for Light Fields
Xiaoxu Meng, Ruofei Du, Joseph JaJa, and Amitabh Varshney
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2020
UMIACS
How about 360° video streaming?
A Log-Rectilinear Transformation for Foveated 360-Degree Video Streaming
David Li†, Ruofei Du‡, Adharsh Babu†, Camelia Brumar†, Amitabh Varshney†
† University of Maryland, College Park ‡ Google Research
UMIACS
TVCG Honorable Mentions Award
With recent advances in neural networks,
how can we further
compress existing graphics?
Sandwiched Image Compression
Wrapping Neural Networks Around a Standard Codec
Increasing the Resolution and Dynamic Range of Standard Codecs
Onur Guleryuz, Philip Chou, Hugues Hoppe, Danhang Tang, Ruofei Du, Philip Davidson, and Sean Fanello
2021 IEEE International Conference on Image Processing (ICIP)
2022 Picture Coding Symposium (PCS) Best Paper Finalist
What about compressing 3D volumes?
What are levels of details?
Multiresolution Deep Implicit Functions for 3D Shape Representation
Zhang Chen, Yinda Zhang, Kyle Genova, Thomas Funkhouser, Sean Fanello, Sofien Bouaziz, Christian Häne, Ruofei Du, Cem Keskin, and Danhang Tang
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Interactive Graphics for a Universally Accessible Metaverse
Chapter Two · Computational Interaction: Algorithm & Systems
Ad hoc UI
CHI EA '22
DepthLab
UIST '20
13K installs & deployed in TikTok, Snap, TeamViewer, etc.
SlurpAR
DIS '22
Rapsai
CHI '23
DepthLab: Real-time 3D Interaction with Depth Maps for Mobile Augmented Reality
Ruofei Du, Eric Turner, Maksym Dzitsiuk, Luca Prasso, Ivo Duarte,
Jason Dourgarian, Joao Afonso, Jose Pascoal, Josh Gladstone, Nuno Cruces,
Shahram Izadi, Adarsh Kowdle, Konstantine Tsotsos, David Kim
Google | ACM UIST 2020
Introduction
Mobile Augmented Reality
Introduction
Google's ARCore
Introduction
Google's ARCore
Introduction
Mobile Augmented Reality
Introduction
Motivation
Is the current generation of object placement sufficient for realistic AR experiences?
Introduction
Depth Lab
Not always!
Introduction
Depth Lab
Virtual content looks like it’s “pasted on the screen” rather than “in the world”!
Introduction
Motivation
Introduction
Depth Lab
How can we bring these advanced
features to mobile AR experiences WITHOUT relying on dedicated sensors or the need for computationally expensive surface reconstruction?
Introduction
Depth Map
Introduction
Depth Lab
Google | Pixel 2, Pixel 2 XL, Pixel 3, Pixel 3 XL, Pixel 3a, Pixel 3a XL, Pixel 4, Pixel 4 XL
Huawei | Honor 10, Honor V20, Mate 20 Lite, Mate 20, Mate 20 X, Nova 3, Nova 4, P20, P30, P30 Pro
LG | G8X ThinQ, V35 ThinQ, V50S ThinQ, V60 ThinQ 5G
OnePlus | OnePlus 6, OnePlus 6T, OnePlus 7, OnePlus 7 Pro, OnePlus 7 Pro 5G, OnePlus 7T, OnePlus 7T Pro
Oppo | Reno Ace
Samsung | Galaxy A80, Galaxy Note8, Galaxy Note9, Galaxy Note10, Galaxy Note10 5G, Galaxy Note10+, Galaxy Note10+ 5G, Galaxy S8, Galaxy S8+, Galaxy S9, Galaxy S9+, Galaxy S10e, Galaxy S10, Galaxy S10+, Galaxy S10 5G, Galaxy S20, Galaxy S20+ 5G, Galaxy S20 Ultra 5G
Sony | Xperia XZ2, Xperia XZ2 Compact, Xperia XZ2 Premium, Xperia XZ3
Xiaomi | Pocophone F1
Introduction
Depth Lab
Is there more to realism than occlusion?
Introduction
Depth Lab
Surface interaction?
Introduction
Depth Lab
Realistic Physics?
Introduction
Depth Lab
Path Planning?
Introduction
Depth Lab
Related Work
Valentin et al.
Depth Maps
Depth from Motion
Depth From a Single Camera
Best Practices
Depth From a Single Camera
Use depth-certified ARCore devices
Minimal movement in the scene
Encourage users to move the device
Depth from 0 to 8 meters
Best accuracy 0.5 to 5 meters
Enhancing Depth
Optimized to give you the best depth
Depth from Motion is fused with state-of-the-art Machine Learning
Depth leverages specialized hardware like a Time-of-Flight sensor when available
Introduction
Depth Lab
Introduction
Depth Lab
Introduction
Depth Generation
Introduction
Depth Lab
Related Work
Valentin et al.
Introduction
Depth Lab
Introduction
Depth Lab
Up to 8 meters, with
the best accuracy within 0.5 m to 5 m
Motivation
Gap from raw depth to applications
Introduction
Depth Lab
ARCore
Depth API
DepthLab
Mobile AR developers
Design Process
3 Brainstorming Sessions
3 brainstorming sessions
18 participants
39 aggregated ideas
Design Process
3 Brainstorming Sessions
System
Architecture overview
Data Structure
Depth Array
2D array (160x120 and above) of 16-bit integers
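As a minimal sketch of this data structure (assuming, as is common, that each 16-bit value stores millimeters), sampling the depth array at a normalized screen coordinate might look like:

```python
import numpy as np

# Hypothetical 160x120 depth image of 16-bit integers (millimeters);
# real values come from the ARCore Depth API.
depth_mm = np.zeros((120, 160), dtype=np.uint16)

def depth_at(u: float, v: float) -> float:
    """Return depth in meters at normalized screen coordinates (u, v) in [0, 1]."""
    h, w = depth_mm.shape
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return float(depth_mm[y, x]) / 1000.0  # millimeters -> meters
```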
Data Structure
Depth Mesh
Data Structure
Depth Texture
System
Architecture
Localized Depth
Coordinate System Conversion
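A sketch of the screen-to-camera-space conversion, assuming a standard pinhole model (fx, fy, cx, cy are hypothetical intrinsics; DepthLab additionally applies the device pose to reach world space):

```python
import numpy as np

def back_project(x: int, y: int, depth_m: float,
                 fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Back-project pixel (x, y) with metric depth into a camera-space point."""
    px = (x - cx) * depth_m / fx
    py = (y - cy) * depth_m / fy
    return np.array([px, py, depth_m])
```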
Localized Depth
Normal Estimation
Localized Depth
Normal Estimation
Localized Depth
Normal Estimation
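One common way to estimate a normal from the depth map (a simplified sketch; the paper's version averages several neighborhood samples for robustness) uses central differences scaled by the metric size of a pixel at that depth:

```python
import numpy as np

def estimate_normal(depth: np.ndarray, x: int, y: int,
                    fx: float, fy: float) -> np.ndarray:
    """Estimate a unit surface normal at an interior pixel (x, y).

    depth: (H, W) float32 array of meters; fx, fy: focal lengths in pixels.
    """
    d = depth[y, x]
    pixel_size_x = d / fx  # metric width of one pixel at this depth
    pixel_size_y = d / fy
    dzdx = (depth[y, x + 1] - depth[y, x - 1]) / (2.0 * pixel_size_x)
    dzdy = (depth[y + 1, x] - depth[y - 1, x]) / (2.0 * pixel_size_y)
    n = np.array([-dzdx, -dzdy, 1.0])
    return n / np.linalg.norm(n)
```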
Localized Depth
Avatar Path Planning
Localized Depth
Rain and Snow
Surface Depth
Use Cases
Surface Depth
Physics collider
Physics with depth mesh.
Surface Depth
Texture decals
Texture decals with depth mesh.
Surface Depth
3D Photo
Projection mapping with depth mesh.
Dense Depth
Depth Texture - Antialiasing
Dense Depth
Real-time relighting
[Diagram: angle θ between the surface normal N and light direction L]
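The core of the relighting term is plain Lambertian shading; a per-pixel sketch (the production version is a GPU shader with falloff and multiple lights, which this toy version omits):

```python
import numpy as np

def lambertian(normal: np.ndarray, light_dir: np.ndarray,
               albedo: np.ndarray, intensity: float = 1.0) -> np.ndarray:
    """Diffuse term: albedo * intensity * max(cos(theta), 0), where theta is
    the angle between the surface normal N and the light direction L.
    Both vectors are assumed normalized."""
    cos_theta = max(float(np.dot(normal, light_dir)), 0.0)
    return albedo * intensity * cos_theta
```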
Dense Depth
Why doesn't a normal map work?
Dense Depth
Real-time relighting
Dense Depth
Real-time relighting
Dense Depth
Real-time relighting
go/realtime-relighting, go/relit
Dense Depth
Wide-aperture effect
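A sketch of how depth can drive the wide-aperture effect: pixels far from the focus depth get a larger blur radius (a toy thin-lens-style falloff, not the shipped shader):

```python
import numpy as np

def blur_radius(depth_m: np.ndarray, focus_m: float,
                max_radius_px: float = 12.0) -> np.ndarray:
    """Per-pixel blur radius: zero at the focus depth, growing with
    the difference in inverse depth."""
    coc = np.abs(1.0 / depth_m - 1.0 / focus_m) * focus_m * max_radius_px
    return np.clip(coc, 0.0, max_radius_px)
```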
Dense Depth
Occlusion-based rendering
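Occlusion-based rendering reduces to a per-pixel depth test against the real scene; a sketch (real shaders feather this boundary to hide depth noise, approximated here by a simple tolerance):

```python
def is_occluded(real_depth_m: float, virtual_depth_m: float,
                tolerance_m: float = 0.02) -> bool:
    """Hide a virtual fragment when observed real geometry is closer."""
    return real_depth_m < virtual_depth_m - tolerance_m
```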
Experiments
DepthLab minimum viable application
Experiments
General Profiling of MVP
Experiments
Relighting
Experiments
Aperture effects
Impact
Deployment with partners
Impact
Deployment with partners
Impact
Deployment with partners
AR Realism
In TikTok
AR Realism
Built into Lens Studio for Snapchat Lenses
Kevaid
Saving Chelon
Quixotical
The Seed: World of Anthrotopia
Snap
Dancing Hotdog
Camera Image
3D Point Cloud
Provides a more detailed representation of the geometry of the objects in the scene.
Raw Depth API
New depth capabilities
Camera Image
Raw Depth Image
Depth Image
Confidence Image
New depth capabilities
Raw Depth API
Provides a more detailed representation of the geometry of the objects in the scene.
Try it yourself!
TeamViewer
LifeAR App
ARCore
Depth Lab App
Depth Hit Test
New depth capabilities
ARCore
Depth Lab App
Depth API
Codelab
Raw Depth API
Codelab
Limitations
Design space of dynamic depth
Dynamic Depth? HoloDesk, HyperDepth, Digits, Holoportation for mobile AR?
Envision
Design space of dynamic depth
GitHub
Please feel free to fork!
Play Store
Try it yourself!
Impact
Significant Media Coverage
Impact
Significant Media Coverage
More Links
Significant Media Coverage
WebXR + ARCore Depth: https://storage.googleapis.com/chromium-webxr-test/r991081/proposals/index.html
Hugging Face Depth: https://huggingface.co/spaces/Detomo/Depth-Estimation
ARCore Depth Lab Play Store App: https://play.google.com/store/apps/details?id=com.google.ar.unity.arcore_depth_lab
DepthLab: Real-time 3D Interaction with Depth Maps for Mobile Augmented Reality
Ruofei Du, Eric Turner, Maksym Dzitsiuk, Luca Prasso, Ivo Duarte,
Jason Dourgarian, Joao Afonso, Jose Pascoal, Josh Gladstone, Nuno Cruces,
Shahram Izadi, Adarsh Kowdle, Konstantine Tsotsos, David Kim
Google | ACM UIST 2020
Thank you!
DepthLab | UIST 2020
Demo
DepthLab | UIST 2020
DepthLab: Real-time 3D Interaction with Depth Maps for Mobile Augmented Reality
Ruofei Du, Eric Turner, Maksym Dzitsiuk, Luca Prasso, Ivo Duarte,
Jason Dourgarian, Joao Afonso, Jose Pascoal, Josh Gladstone, Nuno Cruces,
Shahram Izadi, Adarsh Kowdle, Konstantine Tsotsos, David Kim
Google | ACM UIST 2020
After exploring interaction with the environment,
how shall we interact with everyday objects?
Ad hoc UI: On-the-fly Transformation of Everyday Objects
into Tangible 6DOF Interfaces for AR
Ruofei Du, Alex Olwal, Mathieu Le Goc, Shengzhi Wu, Danhang Tang,
Yinda Zhang, Jun Zhang, David Joseph Tan, Federico Tombari, David Kim
Google | CHI 2022 Interactivity
Applications
Can we learn from history to
interact with everyday objects?
“Slurp” Revisited: Using Software Reconstruction to Reflect on Spatial Interactivity and Locative Media
Shengzhi Wu, Daragh Byrne, Ruofei Du, and Molly Steenson
ACM DIS 2022
RetroSphere: Self-Contained Passive 3D Controller Tracking for Augmented Reality
Ananta Narayanan Balaji, Clayton Kimber, David Li, Shengzhi Wu, Ruofei Du, David Kim
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) 2022
With recent advances of on-device ML models,
how can we accelerate the prototyping efforts?
What if we could build applications as if assembling Lego bricks?
Rapsai
Accelerating Machine Learning Prototyping of Multimedia Applications through Visual Programming
Ruofei Du, Na Li, Jing Jin, Michelle Carney, Scott Miles, Maria Kleiner, Xiuxiu Yuan, Yinda Zhang,
Anuva Kulkarni, Xingyu "Bruce" Liu, Ahmed Sabie, Sergio Escolano, Abhishek Kar,
Ping Yu, Ram Iyengar, Adarsh Kowdle, and Alex Olwal
Interactive Graphics for a Universally Accessible Metaverse
Chapter Three · Digital Human & Augmented Communication
HumanGPS
CVPR '21
Montage4D
I3D '18
JCGT '19
GazeChat & CollaboVR
UIST '21 & ISMAR '20
Visual Captions & ThingShare
CHI '23
What is Avatar?
Avatar
History & Definition
Avatar is a term used in Hinduism for a material manifestation of a deity: "descent of a deity from heaven"
Avatar
History & Definition
In computing, an avatar is a graphical representation of a user or the user's character or persona.
Avatar
Taxonomy
What is the oldest avatar in computer history?
Avatar
History & Definition
Avatar
History & Definition
Guo, Kaiwen, Peter Lincoln, Philip Davidson, Jay Busch, Xueming Yu, Matt Whalen, Geoff Harvey et al. "The relightables: Volumetric performance capture of humans with realistic relighting." ACM Transactions on Graphics (ToG) 38, no. 6 (2019): 1-19.
Dating back to real-time digital humans / avatars…
228
What is the state-of-the-art since then?
Related Work
Fusing Multiple Dynamic Videos
ACM Trans. Graph., Vol. 40, No. 4, Article 1. SIGGRAPH 2021
Photorealistic Characters
The Relightables
Kaiwen Guo, Peter Lincoln, Philip Davidson, Jay Busch, Xueming Yu, Matt Whalen, Geoff Harvey, Sergio Orts-Escolano, Rohit Pandey, Jason Dourgarian, Danhang Tang, Anastasia Tkach, Adarsh Kowdle, Emily Cooper, Mingsong Dou, Sean Fanello, Graham Fyffe, Christoph Rhemann, Jonathan Taylor, Paul Debevec, and Shahram Izadi. 2019. The Relightables: Volumetric Performance Capture of Humans With Realistic Relighting. ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3355089.3356571
ACM Trans. Graph., Vol. 40, No. 4, Article 1. SIGGRAPH 2021
Photorealistic Characters
Rocketbox
Mar Gonzalez-Franco, Eyal Ofek, Ye Pan, Angus Antley, Anthony Steed, Bernhard Spanlang, Antonella Maselli, Domna Banakou, Nuria Pelechano, Sergio Orts Escolano, Veronica Orvahlo, Laura Trutoiu, Markus Wojcik, Maria V. Sanchez-Vives, Jeremy Bailenson, Mel Slater, and Jaron Lanier "The Rocketbox library and the utility of freely available rigged avatars." Frontiers in Virtual Reality DOI: 10.3389/frvir.2020.561558
Photorealistic Characters
From phone scan
Chen Cao, Tomas Simon, Jin Kyu Kim, Gabe Schwartz, Michael Zollhoefer, Shun-Suke Saito, Stephen Lombardi, Shih-En Wei, Danielle Belko, Shoou-I Yu, Yaser Sheikh, and Jason Saragih. 2022. Authentic Volumetric Avatars From a Phone Scan. ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3528223.3530143
How can we build dynamic dense correspondence
within the same subject and
among different subjects?
How can we leverage real-time Avatars today?
GazeChat
Enhancing Virtual Conferences With
Gaze-Aware 3D Photos
Zhenyi He†, Keru Wang†, Brandon Yushan Feng‡, Ruofei Du⸸, Ken Perlin†
† New York University ‡ University of Maryland, College Park ⸸ Google
Introduction
VR headset & video streaming
258
Related Work
Gaze-2 (2003)
259
Related Work
MultiView (2005)
260
Related Work
MMSpace (2016)
261
Our Work
GazeChat (UIST 2021)
262
Gaze Awareness
Definition
263
Gaze awareness is defined here as knowing what someone is looking at.
Gaze Awareness
Definition
264
gaze correction
gaze redirection
raw input image
GazeChat
Gaze Correction
Definition
265
Gaze Redirection
Definition
266
eye contact
who is looking at whom
Pipeline
System
267
Eye Tracking
WebGazer.js
268
Neural Rendering
Eye movement
269
Neural Rendering
Eye movement
270
3D Photo Rendering
3D photos
271
3D Photo Rendering
3D photos
272
Layouts
UI
273
Networking
WebRTC
274
How can we work in XR as stylized avatars?
Zhenyi He* Ruofei Du† Ken Perlin*
*Future Reality Lab, New York University †Google LLC
CollaboVR: A Reconfigurable Framework for
Creative Collaboration in Virtual Reality
How can we further augment communication,
in videoconferencing, AR, and XR in the future?
Visual Captions
Augmenting Verbal Communication
With On-the-Fly Visuals
Xingyu "Bruce" Liu, Vladimir Kirilyuk, Xiuxiu Yuan, Peggy Chi,
Xiang "Anthony" Chen, Alex Olwal, and Ruofei Du
ThingShare
Ad-Hoc Digital Copies of Physical Objects for Sharing Things in Video Meetings
Erzhen Hu, Jens Emil Grønbæk, Wen Ying, Ruofei Du, and Seongkook Heo
How can AI benefit a broader, more inclusive community?
ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users
ACM CHI 2022 · Dhruv Jain, Khoa Nguyen, Steven Goodman, Rachel Grossman-Kahn, Hung Ngo, Aditya Kusupati, Ruofei Du, Alex Olwal, Leah Findlater, and Jon Froehlich
How can AI + Metaverse improve our life?
SketchyScene: Richly-Annotated Scene Sketches
Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, and Hao Zhang (ECCV 2018)
Language-based Colorization of Scene Sketches
Changqing Zou, Haoran Mo, Chengying Gao, Ruofei Du, and Hongbo Fu (ACM Transactions on Graphics, SIGGRAPH Asia 2019)
Future Directions
The Ultimate XR Platform
292
Wearable Subtitles
Augmenting Spoken Communication with
Lightweight Eyewear for All-day Captioning
Future Directions
The Ultimate XR Platform
294
Future Directions
Fuses Past Events
295
Future Directions
With the present
296
Future Directions
And look into the future
297
Future Directions
Change the way we communicate in 3D and consume the information
298
Future Directions
Consume the information throughout the world
299
Interactive Graphics for a
Universally Accessible Metaverse
Thank you!
Interactive Perception & Graphics for a
Universally Accessible Metaverse
306
Kernel Foveated Rendering
Xiaoxu Meng, Ruofei Du, Matthias Zwicker and Amitabh Varshney
Augmentarium | UMIACS
University of Maryland, College Park
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games 2018
307
Introduction
Related Work
Our Approach
User Study
Experiments
Conclusion
Application | Resolution | Frame rate | MPixels / sec |
Desktop game | 1920 x 1080 x 1 | 60 | 124 |
308
Application | Resolution | Frame rate | MPixels / sec |
Desktop game | 1920 x 1080 x 1 | 60 | 124 |
2018 VR (HTC Vive PRO) | 1440 x 1600 x 2 | 90 | 414 |
309
* Data from SIGGRAPH Asia 2016, prediction by Michael Abrash, October 2016
Application | Resolution | Frame rate | MPixels / sec |
Desktop game | 1920 x 1080 x 1 | 60 | 124 |
2018 VR (HTC Vive PRO) | 1440 x 1600 x 2 | 90 | 414 |
2020 VR * | 4000 x 4000 x 2 | 90 | 2,880 |
310
fovea:
the center of the retina, corresponding to the center of the visual field
312
foveal region:
the human eye detects significant detail
peripheral region:
the human eye detects little high-fidelity detail
313
[Figure: foveal region vs. peripheral region of the visual field]
314
[Chart: the foveal region covers 96%, 27%, and 4% of pixels as displays grow wider]
* Data from SIGGRAPH 2017, by Anjul Patney, August 2017
315
316
Foveated Rendering
317
318
Related Work
319
Full Resolution
Multi-Pass Foveated Rendering [Guenter et al. 2012]
320
Rasterizer
Early Z
Generate Coarse Quad
Shade
Evaluate Coarse Pixel Size
Input primitives
Coarse Pixel Shading (CPS) [Vaidyanathan et al. 2014]
321
CPS with TAA & Contrast Preservation [Patney et al. 2016]
322
Can we change the resolution gradually?
323
Perceptual Foveated Rendering [Stengel et al. 2016]
324
Is there a foveated rendering approach
without
the expensive pixel interpolation?
325
Log-polar mapping [Araujo and Dias 1996]
Log-polar Mapping
332
Log-polar Mapping for 2D Image [Antonelli et al. 2015]
334
Our Approach
335
Kernel Log-polar Mapping
range: [0,1]
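A simplified sketch of the kernel log-polar mapping, with the kernel chosen here as K(x) = x^alpha (the paper supports a family of kernel functions): a screen pixel is mapped into the reduced buffer so that radii near the fovea receive proportionally more buffer pixels.

```python
import numpy as np

def to_log_polar_buffer(px: float, py: float, fovea: tuple,
                        buf_w: int, buf_h: int, max_r: float,
                        alpha: float = 4.0) -> tuple:
    """Map screen pixel (px, py) into kernel log-polar buffer coordinates."""
    dx, dy = px - fovea[0], py - fovea[1]
    r = max(np.hypot(dx, dy), 1.0)
    theta = np.arctan2(dy, dx) % (2.0 * np.pi)
    lr = np.clip(np.log(r) / np.log(max_r), 0.0, 1.0)  # normalized log radius
    u = (lr ** (1.0 / alpha)) * buf_w  # inverse kernel warps the radius axis
    v = theta / (2.0 * np.pi) * buf_h
    return u, v
```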
336
Log-polar Mapping
337
Kernel Log-polar Mapping
Kernel Foveated Rendering
338
339
Kernel log-polar Mapping
341
342
Original Frame
Buffer
Screen
Sample Map
345
[Figure: sample maps for three fovea positions]
346
350
Eye-Dominance-Guided Foveated Rendering
Xiaoxu Meng, Ruofei Du, and Amitabh Varshney
IEEE Transactions on Visualization and Computer Graphics (TVCG)
355
356
Ocular Dominance: the tendency to prefer scene perception from one eye over the other.
Advantage of the Dominant Eye Over the Non-dominant Eye
357
358
Application | Resolution | Frame rate | MPixels / sec |
Desktop game | 1920 x 1080 x 1 | 60 | 124 |
2018 VR (HTC Vive PRO) | 1440 x 1600 x 2 | 90 | 414 |
2020 VR (Varjo) | 1920 x 1080 x 2 + 1440 x 1600 x 2 | 90 | 788 |
Foveated Rendering
359
Foveated Rendering
360
[Chart: the foveal region covers 96%, 27%, and 4% of pixels as displays grow wider]
* Data from SIGGRAPH 2017, by Anjul Patney, August 2017
[Figure: foveal regions of both eyes]
Can we do better?
[Figure: a smaller foveal region for the non-dominant eye]
more foveation for the non-dominant eye
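In code form, the idea is just a per-eye foveation strength (the constants below are illustrative only; the work calibrates the non-dominant eye's parameter per user so the extra foveation stays imperceptible):

```python
def foveation_strength(is_dominant_eye: bool,
                       sigma_dominant: float = 1.8,
                       sigma_non_dominant: float = 2.4) -> float:
    """Return the foveation parameter for one eye; the non-dominant
    eye tolerates stronger foveation, saving extra shading work."""
    return sigma_dominant if is_dominant_eye else sigma_non_dominant
```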
A Log-Rectilinear Transformation for Foveated 360-Degree Video Streaming
David Li†, Ruofei Du‡, Adharsh Babu†, Camelia Brumar†, Amitabh Varshney†
† University of Maryland, College Park ‡ Google
UMIACS
TVCG Honorable Mentions Award
Introduction
VR headset & video streaming
367
Introduction
VR + eye tracking
368
HTC Vive Eye
Varjo VR-3
Fove
Introduction
360 videos
369
360° Field of Regard
Scene
Captured 360 Video
Introduction
360 videos
370
Captured 360 Video
Projection to Field of View
Introduction
360 videos
371
Tiling Illustration
Image from (Liu et al., 2017)
Introduction
Foveated rendering
372
Image Credit: Tobii
Log-polar Transformation,
Image from (Meng et al., 2018)
Introduction
Log-Polar Foveated Streaming
373
374
1
Research Question
Can foveation techniques from rendering be used to optimize 360 video streaming?
375
2
Research Question
How can we reduce foveation artifacts by leveraging the full original video frame?
Log-Polar Foveated Streaming
Original Frame
Subsampled Pixel
Log-Polar Foveated Streaming
Summed-Area Tables
Log-Rectilinear Transformation
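Summed-area tables are what make the log-rectilinear transform cheap: any axis-aligned block of the full frame can be averaged with four lookups. A minimal CPU sketch (the paper builds the SATs on the GPU):

```python
import numpy as np

def summed_area_table(img: np.ndarray) -> np.ndarray:
    """Each entry holds the sum of all pixels above and to its left (inclusive)."""
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def box_mean(sat: np.ndarray, x0: int, y0: int, x1: int, y1: int) -> float:
    """Mean over the half-open box [x0, x1) x [y0, y1) in O(1)."""
    total = sat[y1 - 1, x1 - 1]
    if x0 > 0:
        total -= sat[y1 - 1, x0 - 1]
    if y0 > 0:
        total -= sat[y0 - 1, x1 - 1]
    if x0 > 0 and y0 > 0:
        total += sat[y0 - 1, x0 - 1]
    return float(total) / ((x1 - x0) * (y1 - y0))
```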
Foveated Streaming
Decoding 360° Video
GPU-driven Summed-Area Table Generation
Computing the Log-Rectilinear Buffer
Encoding the Log-Rectilinear Video Stream
Updating the Foveal Position
Decoding the
Log-Rectilinear Video Stream
Transforming into a Full-resolution Video Frame
[Architecture: the video streaming server and client communicate over sockets; FFmpeg encodes and decodes the stream, and OpenCL kernels perform the transformations]
Qualitative Results
Quantitative Results
We perform quantitative evaluations comparing the log-rectilinear transformation and the log-polar transformation in 360° video streaming.
Quantitative Results
Quantitative Results
Conclusion
Foveation
Summed-Area Tables
Standard Video Codecs
Foveated 360° Video Streaming
Zhenyi He* Ruofei Du† Ken Perlin*
*Future Reality Lab, New York University †Google LLC
CollaboVR: A Reconfigurable Framework for
Creative Collaboration in Virtual Reality
Research Questions:
The best layout and interaction mode?
CollaboVR: A Reconfigurable Framework for
Creative Collaboration in Virtual Reality
CollaboVR
Chalktalk (Cloud App)
Audio Communication
Layout Reconfiguration
Layout Reconfiguration
User Arrangements
(1) side-by-side
(2) face-to-face
(3) hybrid
Input Modes
(1) direct
(2) projection
Layout Reconfiguration
User Arrangements
(1) side-by-side
[Figure: side-by-side arrangement; interactive boards placed within each user's tracking range]
Layout Reconfiguration
User Arrangements
(1) side-by-side
(2) face-to-face
[Figure: face-to-face arrangement; user 2 as observed by user 1, with left and right hands mirrored]
Layout Reconfiguration
User Arrangements
(1) side-by-side
(2) face-to-face
[Figure: arrangements scaled to four users]
Layout Reconfiguration
User Arrangements
(1) side-by-side
(2) face-to-face
(3) hybrid
[Figure: hybrid arrangement with a teacher and three users]
Layout Reconfiguration
Input Modes
(1) direct
(2) projection
C1: Integrated Layout
C2: Mirrored Layout
C3: Projective Layout
Evaluation
Overview of subjective feedback on CollaboVR
Takeaways
more live demos...
Zhenyi He* Ruofei Du† Ken Perlin*
*Future Reality Lab, New York University †Google LLC
CollaboVR: A Reconfigurable Framework for
Creative Collaboration in Virtual Reality
Fusing Physical and Virtual Worlds into
An Interactive Metaverse
Introduction
Depth Map
Introduction
Depth Map
Introduction
Depth Lab
Thank you!
www.duruofei.com
Introduction
Depth Lab
Occlusion is a critical component for AR realism!
Correct occlusion helps ground content in reality, and makes virtual objects feel as if they are actually in your space.
Introduction
Motivation
Depth Mesh
Generation
Localized Depth
Avatar Path Planning
Dense Depth
Depth Texture
Introduction
Depth Map
Taxonomy
Depth Usage
Introduction
Depth Map
Introduction
Depth Map
OmniSyn: Synthesizing 360 Videos with Wide-baseline Panoramas
David Li, Yinda Zhang, Christian Häne, Danhang Tang, Amitabh Varshney, Ruofei Du
433
Problem
434
≥ 5 meters
baseline
OmniSyn
360° Wide-baseline
View Synthesis
Related Work: Monocular Neural Image Based Rendering with Continuous View Control (ICCV 2019)
435
Related Work: SynSin (CVPR 2020)
436
Research Goal
437
Method
438
[Illustration: a coordinate channel whose rows increase uniformly from 0 to 1]
CoordConv
Spherical Cost Volume
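CoordConv simply appends a coordinate channel to the feature map, matching the 0-to-1 rows illustrated above; a sketch:

```python
import numpy as np

def add_coord_channel(feat: np.ndarray) -> np.ndarray:
    """Append a CoordConv-style channel of normalized row coordinates.

    feat: (H, W, C) feature map. The extra channel lets convolutions
    condition on vertical position (latitude), which matters when
    operating on equirectangular panoramas.
    """
    h, w, _ = feat.shape
    rows = np.repeat(np.linspace(0.0, 1.0, h)[:, None], w, axis=1)
    return np.concatenate([feat, rows[..., None]], axis=-1)
```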
Pipeline
439
[Pipeline: panoramas 0 and 1 pass through shared depth predictors; mesh renderers warp each panorama with poses [R0|t0] and [R1|t1] into RGB + visibility maps, which a fusion network merges into the target panorama]
Stereo Depth with Cost Volume
440
[Figure: cost volume slices formed by subtracting shifted images from the reference]
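A naive cost volume in one dimension, to show the idea behind the figure above (OmniSyn's actual volume sweeps inverse-depth hypotheses on the sphere; the wrap-around shift here suits 360° panoramas):

```python
import numpy as np

def cost_volume(ref: np.ndarray, other: np.ndarray, num_shifts: int) -> np.ndarray:
    """Stack per-pixel L1 differences between a reference image and the
    other image shifted by each candidate disparity.

    ref, other: (H, W) float grayscale images; returns (num_shifts, H, W).
    """
    vol = np.empty((num_shifts,) + ref.shape, dtype=np.float32)
    for d in range(num_shifts):
        shifted = np.roll(other, d, axis=1)  # horizontal wrap-around shift
        vol[d] = np.abs(ref - shifted)
    return vol
```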
Stereo 360 Depth with Cost Volume
441
Mesh Rendering
443
Point Cloud Render
Mesh Render
[Figure: renders at 1 m, 2 m, 3 m, and 4 m baselines]
OmniSyn (Mesh)
GT Visibility
OmniSyn (Point Cloud)
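A sketch of depth-map meshing with triangle removal (the threshold is a hypothetical parameter; dropping quads across depth edges is also why thin structures can disappear, as noted in the Limitations):

```python
import numpy as np

def grid_mesh(depth: np.ndarray, max_jump_m: float = 0.5) -> np.ndarray:
    """Triangulate a depth map into a grid mesh, dropping quads that span
    a large depth discontinuity (likely stretched across a silhouette).

    depth: (H, W) float meters; returns (T, 3) vertex-index triangles
    over vertices laid out row-major as y * W + x.
    """
    h, w = depth.shape
    tris = []
    for y in range(h - 1):
        for x in range(w - 1):
            corners = [(y, x), (y, x + 1), (y + 1, x), (y + 1, x + 1)]
            d = [depth[i, j] for i, j in corners]
            if max(d) - min(d) > max_jump_m:
                continue  # remove triangles crossing a depth edge
            a, b, c, e = [i * w + j for i, j in corners]
            tris += [(a, b, c), (b, e, c)]
    return np.asarray(tris, dtype=np.int64).reshape(-1, 3)
```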
CoordConv and Circular CNN
445
[Illustration: coordinate channel rows from 0 to 1]
CoordConv
Circular CNN
Experiments
446
Results
447
Generalization to Real Street View Panoramas
448
[Figure: panoramas synthesized between ground-truth captures spaced 0 m to 10.1 m apart]
Limitations
449
Input 0
Input 1
Synthesized
Fusion network does not generalize well to unseen colors.
Depth prediction struggles with tall buildings.
Triangle removal may eliminate thin structures.
Conclusion
450