Visualizing ML Embeddings in Physical Space
Understanding high-dimensional data using passthrough AR
The Challenge: Understanding High-Dimensional Data
Machine learning models generate embeddings in hundreds or thousands of dimensions. Traditional visualization flattens these into 2D plots using PCA, t-SNE, or UMAP, which loses critical spatial relationships and structural information that exist in the original high-dimensional space.
(Figure: traditional 2D projection of an embedding)
Why This Matters
• Cluster boundaries are ambiguous in 2D
• Distance relationships are distorted (the sketch after this list makes that concrete)
• Outliers may appear grouped with a cluster, and distinct clusters may appear merged
• Cannot walk around data to see structure from different angles
• Hard to build intuition about the embedding space
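To make the distortion concrete, here is a minimal sketch (our own illustration, assuming scikit-learn and SciPy are available) that projects synthetic high-dimensional clusters to 2D with PCA and measures how well pairwise distances survive the flattening:

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Synthetic stand-in for real embeddings: three Gaussian clusters in 512-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(100, 512))
               for c in (0.0, 0.5, 1.0)])

X2 = PCA(n_components=2).fit_transform(X)  # the usual 2-D flattening

# Rank correlation between original and projected pairwise distances;
# values well below 1.0 quantify the distortion listed above.
rho, _ = spearmanr(pdist(X), pdist(X2))
print(f"distance rank correlation after 2-D PCA: {rho:.2f}")
```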
The Solution: Passthrough AR for 3D Embeddings
Passthrough AR provides metric grounding in physical space. Users can walk through 3D embeddings, use their body as a reference frame, and interrogate spatial structure naturally. We hypothesize this improves understanding over 2D plots; the comparative study below is designed to test that claim.
(Figure: current 3D embedding visualization)
AR Advantages Over 2D
• Physical Grounding: room scale provides natural distance metrics
• Natural Navigation: walk, turn, and crouch to explore from any angle
• Spatial Memory: physical location aids recall of data structure
• True 3D Perception: depth perception reveals structure lost in 2D
Schedule and Milestones
1. 2/10-2/19: Generate Real ML Embeddings
Use a real ML model (a CNN for image embeddings or a transformer for sentence embeddings) to generate high-dimensional vectors from public datasets. Validate embedding quality and select representative samples.
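A minimal sketch of the image branch of this step, assuming torchvision; the `embed_images` helper and its file-path inputs are placeholder names of ours, not part of any deliverable yet:

```python
import torch
from torchvision import models
from PIL import Image

# Pre-trained ResNet-50 with the classifier head removed, so the forward
# pass returns the 2048-D penultimate features as the embedding.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.fc = torch.nn.Identity()
model.eval()

preprocess = weights.transforms()  # the matching resize/crop/normalize

@torch.no_grad()
def embed_images(paths):
    """Return an (N, 2048) tensor of embeddings for a list of image paths."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    return model(batch)
```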
2. 2/19-3/03: Project & Render in Passthrough AR
Apply dimensionality reduction (PCA/UMAP/custom projection) to map embeddings to 3D coordinates. Render in Unity with Meta XR SDK passthrough, allowing users to see embeddings overlaid on their physical environment.
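A sketch of the projection-and-handoff half of this step, assuming the umap-learn package; the JSON schema and the `room_size` scaling are our own assumptions about how the Unity client will consume the points (the C# loader is not shown):

```python
import json
import umap  # umap-learn package

def export_for_unity(embeddings, labels, room_size=2.5, path="embeddings_3d.json"):
    """Reduce (N, D) embeddings to 3-D and rescale into a room_size-metre cube."""
    coords = umap.UMAP(n_components=3, random_state=42).fit_transform(embeddings)
    coords = coords - coords.min(axis=0)                # shift into positive octant
    coords = coords * (room_size / coords.max(axis=0))  # metres, per axis
    points = [{"x": float(x), "y": float(y), "z": float(z), "label": int(l)}
              for (x, y, z), l in zip(coords, labels)]
    with open(path, "w") as f:
        json.dump(points, f)
```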
3. 3/03-3/05: Comparative User Study
Run a controlled study comparing 2D plots vs passthrough AR on tasks like cluster identification and similarity judgment. Measure accuracy, time, and subjective understanding.
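As a sketch of how the quantitative comparison could be scored, the snippet below runs a paired t-test on per-participant accuracies. The numbers are placeholder values for illustration only, not results, and SciPy is an assumed dependency:

```python
import numpy as np
from scipy.stats import ttest_rel

# Placeholder per-participant cluster-identification accuracies; the real
# values would come from the study logs for the 2D and AR conditions.
acc_2d = np.array([0.60, 0.55, 0.70, 0.65, 0.50])
acc_ar = np.array([0.75, 0.70, 0.80, 0.72, 0.68])

# Paired test: the same participant performs both conditions.
t_stat, p_value = ttest_rel(acc_ar, acc_2d)
print(f"2D mean={acc_2d.mean():.2f}  AR mean={acc_ar.mean():.2f}  "
      f"t={t_stat:.2f}  p={p_value:.3f}")
```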
In-Class Activity
Comparative Visualization Experiment
Students will attempt to identify clusters in the same embedding dataset using two different visualization techniques, then compare their accuracy and experience.
• 2D Scatter Plot: traditional screen-based visualization
• Passthrough AR: embeddings rendered in physical space
Measured Outcomes
• Cluster identification accuracy (scored as sketched after this list)
• Time to complete task
• Subjective preference ratings
• Understanding of spatial relationships
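One way to make the accuracy outcome concrete is to score a participant's cluster assignment against the dataset's known labels with the adjusted Rand index. A sketch assuming scikit-learn; both label arrays are hypothetical values of ours for illustration:

```python
from sklearn.metrics import adjusted_rand_score

# Ground-truth cluster labels for the plotted points, and the clusters a
# participant assigned to the same points during the activity.
true_labels        = [0, 0, 0, 1, 1, 1, 2, 2, 2]
participant_labels = [0, 0, 1, 1, 1, 1, 2, 2, 0]

# 1.0 = perfect agreement with the true clustering, ~0.0 = chance level.
score = adjusted_rand_score(true_labels, participant_labels)
print(f"cluster identification accuracy (ARI): {score:.2f}")
```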
Wiki Contributions & Deliverables
Wiki Location: VR Visualization Software → Scientific Visualization → ML Embeddings in AR
Tutorial Document
• Step-by-step guide for visualizing ML embeddings in passthrough AR
• Unity setup with Meta XR SDK integration
• Code examples for embedding generation and 3D projection
Comparative Analysis Table
• Quantitative results: accuracy, completion time, user ratings
• Qualitative insights on spatial understanding
• Recommendations for when each method is most effective
Design Guidelines Page
• When AR helps vs when dimensionality overwhelms
• Best practices for embedding visualization in physical space
• Limitations and future directions
Technology Stack
• PyTorch (ML framework): generate embeddings from pre-trained models (ResNet, BERT, etc.)
• NumPy / Scikit-learn (data processing): array handling and dimensionality reduction (PCA, UMAP)
• Unity (AR platform): 3D rendering engine for the AR visualization
• Meta XR SDK (AR SDK): passthrough AR capabilities for Quest headsets