1 of 44

Identifying Routes in the NFL

Dani Chu

M.Sc. Student - Statistics, Simon Fraser University

Graduate Intern - Basketball Strategy & Analytics, NBA

2 of 44

Collaborators

Lucas Wu

PhD Student

Simon Fraser University

yifan_wu@sfu.ca

@lucaswu123

James Thomson

MSc Student

Simon Fraser University

james_thomson_2@sfu.ca

@stat_jamest

Dani Chu

MSc Student

Simon Fraser University

dani_chu@sfu.ca

@chuurveg

Matthew Reyers

MSc Student

Simon Fraser University

matthew_reyers@sfu.ca

@Stats_By_Matt

3 of 44

NFL Big Data Bowl

4 of 44

NFL Big Data Bowl

Inaugural Event designed to:

Generate new football ideas
Promote public analytic talent

5 of 44

The Data

NFL player tracking data from the first 6 weeks of the 2017 regular season

10 observations per second
(x, y) location of each player and the ball
6,963 passing plays with 34,698 routes

6 of 44

Example - Eligible Receivers Only

7 of 44

Example - Eligible Receivers Only

8 of 44

Receiver Performance

9 of 44

Receiving Statistics

Result: Completions
Result: Receiving Yards
Process: Targets
Process: Air Yards

How can we account for usage?

10 of 44

Route Identification

11 of 44

Why is it Useful?

Build Player Route Trees
Enhance Receiver and QB statistics
Tag Film

12 of 44

In the NBA

13 of 44

What is a Route?

*Star Tribune

14 of 44

Example - Eligible Receivers Only

15 of 44

Functional Data

16 of 44

Functional Data

17 of 44

Functional Data

18 of 44

Bezier Curves

19 of 44

Bezier Curve

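Defined on $t \in [0, 1]$ with control points $P_0, \dots, P_n$:

$$B(t) = \sum_{i=0}^{n} b_{i,n}(t)\, P_i$$

Where

$$b_{i,n}(t) = \binom{n}{i}\, t^{i} (1 - t)^{n - i}$$

In 2 dimensions and degree 9 (10 control points)

$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \sum_{i=0}^{9} b_{i,9}(t) \begin{pmatrix} P_i^{x} \\ P_i^{y} \end{pmatrix}$$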
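A minimal sketch of evaluating a degree-9 Bezier curve in 2 dimensions with the standard Bernstein basis. The function names and the control points are made up for illustration; they are not the template curves from the talk.

```python
from math import comb

def bernstein(i, n, t):
    """Bernstein basis polynomial b_{i,n}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)

def bezier_point(control_points, t):
    """Evaluate a 2-D Bezier curve at t from a list of (x, y) control points."""
    n = len(control_points) - 1  # degree = number of control points - 1
    x = sum(bernstein(i, n, t) * px for i, (px, _) in enumerate(control_points))
    y = sum(bernstein(i, n, t) * py for i, (_, py) in enumerate(control_points))
    return x, y

# Hypothetical degree-9 control points (10 points): run upfield, then break sideways.
control_points = [(0, 0), (0, 2), (0, 4), (0, 6), (0, 8),
                  (0, 10), (2, 10), (4, 10), (6, 10), (8, 10)]

# Sample the curve at 21 evenly spaced values of t in [0, 1].
curve = [bezier_point(control_points, k / 20) for k in range(21)]
```

The curve starts at the first control point and ends at the last one; the interior control points pull the path toward themselves without the curve necessarily passing through them.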

20 of 44

Curve Clustering

21 of 44

Route Tree Reminder

*Star Tribune

22 of 44

Pre-processing: Data

23 of 44

Curve Clustering

Assume

There are K route types
Player trajectories are realizations of one of the K routes
Route types are specified by a Bezier Curve

Then

Use the EM algorithm to

Learn the template curves
Estimate label probabilities for each observed trajectory
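One common way to write these assumptions is as a finite mixture of regressions on the Bernstein basis. The notation below ($Y_j$, $B_j$, $\theta_k$, $\sigma_k^2$, $\alpha_k$) and the Gaussian noise model are assumptions made here for illustration; the slides do not state the likelihood explicitly.

```latex
\begin{aligned}
P(z_{jk} = 1) &= \alpha_k, \qquad \sum_{k=1}^{K} \alpha_k = 1, \\
Y_j \mid z_{jk} = 1 &= B_j \theta_k + \varepsilon_j, \qquad
\varepsilon_{j,i} \sim \mathcal{N}\!\left(0, \sigma_k^2 I_2\right)
\end{aligned}
```

Here $Y_j$ is the $n_j \times 2$ matrix of (x, y) positions for trajectory $j$, $B_j$ is the $n_j \times (n+1)$ matrix of Bernstein basis values at that trajectory's time points, $\theta_k$ is the $(n+1) \times 2$ matrix of control points for route type $k$, and $z_{jk} = 1$ indicates that trajectory $j$ was generated by route type $k$.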

24 of 44

Expectation Step

In our case:

Observations are 2-dimensional curves of different lengths

Parameters are the control points of each Bezier curve and a variance parameter

Let z_jk = 1 if observation j is from cluster k, and 0 otherwise

Update the expected cluster membership for each observation
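A sketch of what this update could look like, assuming isotropic Gaussian noise around each template curve, mixing weights `alphas`, one variance per cluster, and time rescaled to t = i / (m - 1) so that trajectories of different lengths can be compared. All of these choices and names are assumptions for illustration, not details confirmed by the slides.

```python
from math import comb, exp, log, pi

def bezier_point(ctrl, t):
    """Evaluate a 2-D Bezier curve with control points `ctrl` at t in [0, 1]."""
    n = len(ctrl) - 1
    w = [comb(n, i) * t**i * (1 - t) ** (n - i) for i in range(n + 1)]
    return (sum(wi * p[0] for wi, p in zip(w, ctrl)),
            sum(wi * p[1] for wi, p in zip(w, ctrl)))

def log_likelihood(traj, ctrl, sigma2):
    """Log-density of a trajectory under one template (assumed isotropic Gaussian)."""
    m = len(traj)
    total = 0.0
    for i, (x, y) in enumerate(traj):
        mx, my = bezier_point(ctrl, i / (m - 1))  # assumed time rescaling
        total += -log(2 * pi * sigma2) - ((x - mx) ** 2 + (y - my) ** 2) / (2 * sigma2)
    return total

def e_step(trajs, templates, alphas, sigma2s):
    """Return resp[j][k], the expected value of z_jk given current parameters."""
    resp = []
    for traj in trajs:
        logp = [log(a) + log_likelihood(traj, c, s2)
                for a, c, s2 in zip(alphas, templates, sigma2s)]
        top = max(logp)                      # log-sum-exp for numerical stability
        w = [exp(lp - top) for lp in logp]
        total = sum(w)
        resp.append([wi / total for wi in w])
    return resp

# Toy example: two made-up degree-2 templates and two made-up trajectories.
templates = [[(0, 0), (0, 5), (0, 10)],      # straight upfield
             [(0, 0), (0, 10), (10, 10)]]    # upfield, then bending sideways
trajs = [[(0, 0), (0.1, 3.3), (0.0, 6.6), (0, 10)],               # 4 points
         [(0, 0), (0.5, 4), (2.5, 7.5), (5.5, 9.5), (10, 10)]]    # 5 points
resp = e_step(trajs, templates, [0.5, 0.5], [1.0, 1.0])
```

Each row of `resp` sums to 1, and in this toy case the first trajectory is assigned almost entirely to the straight template and the second to the bending one.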

25 of 44

Maximization Step

Update parameters for cluster k through

Weighted Least Squares
No intercept term
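A sketch of the control-point update for a single cluster, assuming the weights are the expected memberships from the Expectation Step and the regressors are Bernstein basis values with no added intercept column. The helper names, the small linear solver, and the time rescaling t = i / (m - 1) are assumptions for illustration; the variance update is omitted.

```python
from math import comb

def basis_row(t, degree):
    """Bernstein basis values at t; these are the only regressors (no intercept)."""
    return [comb(degree, i) * t**i * (1 - t) ** (degree - i) for i in range(degree + 1)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def m_step_cluster(trajs, resp_k, degree):
    """Weighted least squares: solve (B'WB) theta = B'WY for the x and y columns."""
    p = degree + 1
    BtWB = [[0.0] * p for _ in range(p)]
    BtWx, BtWy = [0.0] * p, [0.0] * p
    for traj, w in zip(trajs, resp_k):       # w = expected membership of this trajectory
        m = len(traj)
        for i, (x, y) in enumerate(traj):
            b = basis_row(i / (m - 1), degree)
            for a in range(p):
                BtWx[a] += w * b[a] * x
                BtWy[a] += w * b[a] * y
                for c in range(p):
                    BtWB[a][c] += w * b[a] * b[c]
    return list(zip(solve(BtWB, BtWx), solve(BtWB, BtWy)))

# Toy check: trajectories sampled exactly from a made-up degree-2 curve.
true_ctrl = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

def sample(m):
    rows = [basis_row(i / (m - 1), 2) for i in range(m)]
    return [(sum(bi * c[0] for bi, c in zip(b, true_ctrl)),
             sum(bi * c[1] for bi, c in zip(b, true_ctrl))) for b in rows]

trajs = [sample(4), sample(5)]
est = m_step_cluster(trajs, [1.0, 0.5], degree=2)
```

Because the toy trajectories lie exactly on the curve, `est` recovers `true_ctrl` up to floating-point error regardless of the weights.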

26 of 44

Cluster Means

27 of 44

Route Labelling

28 of 44

Cluster 20 - Post

29 of 44

WRs with most Routes per Team

30 of 44

Full Process

31 of 44

Route Profiles

32 of 44

Position Comparisons

33 of 44

WR Usage

34 of 44

Air Yards Over Expectation

35 of 44

Air Yards Over Expectation (>100 Routes)

36 of 44

Who’s the Best at Running Routes?

37 of 44

Play Database

38 of 44

Play Database

Can identify play description, results and other information based on:

Team
Players
Routes Run
Location

For example, we can pull film for:

Plays with both an Out route and a Go route
Michael Thomas and Alvin Kamara lined up on the same side of the field
DeAndre Hopkins running a comeback
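The three example queries above could be sketched as simple filters over labelled plays. The record layout, field names, and the rows themselves are invented for illustration; they are not the actual database schema or real plays.

```python
# Hypothetical play records: each play lists the labelled route for every receiver.
plays = [
    {"play_id": 1, "team": "NO", "routes": [
        {"player": "Michael Thomas", "route": "out", "side": "left"},
        {"player": "Alvin Kamara", "route": "go", "side": "left"}]},
    {"play_id": 2, "team": "HOU", "routes": [
        {"player": "DeAndre Hopkins", "route": "comeback", "side": "right"}]},
    {"play_id": 3, "team": "HOU", "routes": [
        {"player": "DeAndre Hopkins", "route": "go", "side": "left"}]},
]

def plays_with_routes(plays, wanted):
    """IDs of plays whose route labels include every label in `wanted`."""
    return [p["play_id"] for p in plays
            if set(wanted) <= {r["route"] for r in p["routes"]}]

def plays_with_player_route(plays, player, route):
    """IDs of plays where `player` ran the given route."""
    return [p["play_id"] for p in plays
            if any(r["player"] == player and r["route"] == route for r in p["routes"])]

def plays_same_side(plays, player_a, player_b):
    """IDs of plays where both players lined up on the same side of the field."""
    ids = []
    for p in plays:
        sides = {r["player"]: r["side"] for r in p["routes"]}
        if player_a in sides and player_b in sides and sides[player_a] == sides[player_b]:
            ids.append(p["play_id"])
    return ids

out_and_go = plays_with_routes(plays, ["out", "go"])
thomas_kamara = plays_same_side(plays, "Michael Thomas", "Alvin Kamara")
hopkins_comeback = plays_with_player_route(plays, "DeAndre Hopkins", "comeback")
```

The returned play IDs would then be used to look up the matching film clips.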

39 of 44

Thank You!

Any Questions?

Dani Chu

Email: dani_chu@sfu.ca

Twitter: @chuurveg

Matthew Reyers

Email: matthew_reyers@sfu.ca

Twitter: @Stats_By_Matt

James Thomson

Email: james_thomson_2@sfu.ca

Twitter: @stat_jamest

Lucas Wu

Email: yifan_wu@sfu.ca

Twitter: @lucaswu123

40 of 44

41 of 44

RB Usage

42 of 44

Targets Over Expectation

43 of 44

Targets Over Expectation (> 100 Routes)

44 of 44

Who’s the Best at Running Routes?
