1 of 10

Using Vector Embeddings to Predict Bike Races

2 of 10

Outline

  1. What are vector embeddings?
  2. Application to road cycling

3 of 10

What are vector embeddings?

4 of 10

Intro to Vector Embeddings

  • We often deal with non-numeric data
    • E.g. words, movies, songs
  • How do we represent these so we can model them?
    • Need to convert them to numbers
  • Solution: give each item a vector of k numbers (an embedding)
  • Similar vectors imply similar items
    • Similar = high dot product (see the sketch below)
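
A minimal sketch of the idea in Python; the items and vector values below are invented purely for illustration:

    import numpy as np

    # Toy 3-dimensional embeddings (k = 3); items and values are made up.
    embeddings = {
        "mountain_bike": np.array([0.9, 0.1, 0.3]),
        "road_bike":     np.array([0.8, 0.2, 0.4]),
        "toaster":       np.array([0.1, 0.9, 0.0]),
    }

    def similarity(a, b):
        # Similarity between two items = dot product of their embeddings.
        return float(np.dot(embeddings[a], embeddings[b]))

    print(similarity("mountain_bike", "road_bike"))  # high: similar items
    print(similarity("mountain_bike", "toaster"))    # low: dissimilar items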

5 of 10

Example 1: Word2Vec

  • Q: How do we use text in a model?
  • A: Convert words to vectors
  • How do we define similarity?
  • Make the probability of two words appearing together in a sentence increase with their dot product (a softmax over dot products; see the sketch below)
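
A stripped-down sketch of how a skip-gram style objective turns dot products into co-occurrence probabilities; the toy vocabulary and random vectors are assumptions, and real implementations train these tables on large corpora (typically with negative sampling):

    import numpy as np

    vocab = ["bike", "race", "mountain", "sprint"]
    k = 8                                   # embedding dimension
    rng = np.random.default_rng(0)

    # Skip-gram keeps two tables: one for center words, one for context words.
    center = rng.normal(size=(len(vocab), k))
    context = rng.normal(size=(len(vocab), k))

    def p_context_given_center(center_word, context_word):
        # Softmax over dot products: P(context | center) ∝ exp(u_context · v_center)
        v = center[vocab.index(center_word)]
        scores = context @ v                # dot product with every context vector
        probs = np.exp(scores) / np.exp(scores).sum()
        return probs[vocab.index(context_word)]

    print(p_context_given_center("bike", "race"))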

6 of 10

Example 2: Netflix

  • Q: Which movie should I recommend to a user?
  • A: Convert movies and users into vectors
  • Make the probability of a user watching a given movie increase with the user-movie dot product (see the sketch below)
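
A rough sketch of the recommendation step, assuming user and movie embeddings have already been learned from viewing history (the random vectors here only make the example runnable):

    import numpy as np

    k = 16
    rng = np.random.default_rng(0)

    # In a real system these would be learned from watch history; random here.
    user_vecs = rng.normal(size=(100, k))    # one embedding per user
    movie_vecs = rng.normal(size=(50, k))    # one embedding per movie

    def recommend(user_id, top_n=5):
        # Score every movie by its dot product with the user's embedding.
        scores = movie_vecs @ user_vecs[user_id]
        return np.argsort(scores)[::-1][:top_n]  # highest-scoring movie ids first

    print(recommend(user_id=0))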

7 of 10

Example 3: CLIP

  • Q: How do we connect images and their captions?
  • A: Convert images and captions into vectors
  • Train so that real image-caption pairs have a high dot product (see the sketch below)
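
A rough numpy sketch of the symmetric contrastive objective behind CLIP, assuming the image and caption encoders have already produced a batch of embeddings (the temperature value here is an assumption):

    import numpy as np

    def clip_style_loss(image_embs, caption_embs, temperature=0.07):
        # Matching image-caption pairs sit on the diagonal and should get
        # the highest dot products in their row and column.
        image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
        caption_embs = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)

        logits = image_embs @ caption_embs.T / temperature  # (batch, batch) similarities
        labels = np.arange(len(logits))                      # image i matches caption i

        def cross_entropy(l):
            log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
            return -log_probs[labels, labels].mean()

        # Symmetric: images -> captions and captions -> images.
        return (cross_entropy(logits) + cross_entropy(logits.T)) / 2

    rng = np.random.default_rng(0)
    print(clip_style_loss(rng.normal(size=(4, 32)), rng.normal(size=(4, 32))))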

8 of 10

Vector embeddings for cycling race prediction

9 of 10

Predicting Cycling Races

  • Different races suit different riders
    • E.g. flat vs. mountainous terrain suits different riders’ strengths
  • The current literature hand-codes features of races that are known to be similar
    • Problem: narrow scope, hard to scale

10 of 10

My Idea

  • What if each rider and each race had their own embedding?
  • If a rider does well at a race, their embeddings should have a high dot product
  • Scraped historical race results from CQRanking.com
  • Minimize the MSE between the rider-race dot product and the number of ranking points the rider earned in that race (see the training sketch below)
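
A minimal training sketch of this idea; the (rider, race, points) triples, embedding dimension, and learning rate below are assumptions chosen to make the example runnable, not the actual pipeline or scraped data:

    import numpy as np

    k = 32                                  # embedding dimension (assumed)
    lr = 0.001                              # learning rate (assumed)
    rng = np.random.default_rng(0)

    # Assumed format of the scraped results: (rider_id, race_id, ranking_points).
    results = [(0, 0, 120.0), (1, 0, 80.0), (0, 1, 10.0), (1, 1, 200.0)]  # toy data
    n_riders = 1 + max(r for r, _, _ in results)
    n_races = 1 + max(c for _, c, _ in results)

    riders = rng.normal(scale=0.1, size=(n_riders, k))  # one embedding per rider
    races = rng.normal(scale=0.1, size=(n_races, k))    # one embedding per race

    # SGD on the MSE between the rider-race dot product and the points earned.
    for epoch in range(2000):
        for rider, race, points in results:
            err = riders[rider] @ races[race] - points
            grad_rider = err * races[race]
            grad_race = err * riders[rider]
            riders[rider] -= lr * grad_rider
            races[race] -= lr * grad_race

    print(riders[0] @ races[0])  # should end up close to 120 after training

Under this setup, predicting a future race would amount to ranking riders by their dot product with that race's embedding.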