1 of 44

Identifying Routes in the NFL

Dani Chu

M.Sc. Student - Statistics, Simon Fraser University

Graduate Intern - Basketball Strategy & Analytics, NBA

2 of 44

Collaborators

Lucas Wu

PhD Student

Simon Fraser University

yifan_wu@sfu.ca

@lucaswu123

James Thomson

MSc Student

Simon Fraser University

james_thomson_2@sfu.ca

@stat_jamest

Dani Chu

MSc Student

Simon Fraser University

dani_chu@sfu.ca

@chuurveg

Matthew Reyers

MSc Student

Simon Fraser University

matthew_reyers@sfu.ca

@Stats_By_Matt

3 of 44

NFL Big Data Bowl

4 of 44

NFL Big Data Bowl

  • Inaugural Event designed to:
    1. Generate new football ideas
    2. Promote public analytic talent

5 of 44

The Data

  • NFL player tracking data from the first 6 weeks of the 2017 regular season
    • 10 observations per second
    • (x, y) location of each player and the ball
    • 6,963 passing plays with 34,698 routes

6 of 44

Example - Eligible Receivers Only

7 of 44

Example - Eligible Receivers Only

8 of 44

Receiver Performance

9 of 44

Receiving Statistics

  • Result: Completions
  • Result: Receiving Yards
  • Process: Targets
  • Process: Air Yards

How can we account for usage?

10 of 44

Route Identification

11 of 44

Why is it Useful?

  • Build Player Route Trees
  • Enhance Receiver and QB statistics
  • Tag Film

12 of 44

In the NBA

13 of 44

What is a Route?

*Star Tribune

14 of 44

Example - Eligible Receivers Only

15 of 44

Functional Data

16 of 44

Functional Data

17 of 44

Functional Data

18 of 44

Bezier Curves

19 of 44

Bezier Curve

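Defined on $t \in [0, 1]$ with control points $P_0, \dots, P_n$:

$$B(t) = \sum_{i=0}^{n} b_{i,n}(t)\, P_i$$

Where

$$b_{i,n}(t) = \binom{n}{i} t^{i} (1 - t)^{n - i}$$

In 2 dimensions and degree 9

$$B(t) = \sum_{i=0}^{9} \binom{9}{i} t^{i} (1 - t)^{9 - i}\, P_i, \qquad P_i \in \mathbb{R}^2$$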
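
As a rough illustration of the definition on this slide, the sketch below evaluates a Bezier curve through the Bernstein basis. The function names and the control points are made up for the example and are not taken from the talk; a degree-9 curve in the plane simply needs 10 two-dimensional control points.

```python
from math import comb

import numpy as np


def bernstein_basis(t, degree):
    """Return the (len(t), degree + 1) matrix of Bernstein polynomials at t."""
    t = np.asarray(t, dtype=float)[:, None]   # column of times in [0, 1]
    i = np.arange(degree + 1)[None, :]        # row of basis indices 0..degree
    coeffs = np.array([comb(degree, k) for k in range(degree + 1)])
    return coeffs * t**i * (1.0 - t) ** (degree - i)


def bezier_curve(control_points, t):
    """Evaluate the Bezier curve defined by control_points at times t."""
    control_points = np.asarray(control_points, dtype=float)
    degree = control_points.shape[0] - 1
    return bernstein_basis(t, degree) @ control_points


# Hypothetical control points: 10 points in the plane give a degree-9 curve.
control_points = np.column_stack(
    [np.linspace(0, 9, 10), np.linspace(0, 9, 10) ** 0.5]
)
curve = bezier_curve(control_points, np.linspace(0, 1, 50))
print(curve.shape)
```

The curve starts at the first control point and ends at the last one, and each row of the basis matrix sums to 1, so every point on the curve is a weighted average of the control points.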

20 of 44

Curve Clustering

21 of 44

Route Tree Reminder

*Star Tribune

22 of 44

Pre-processing: Data

23 of 44

Curve Clustering

Assume

  • There are K route types
  • Player trajectories are realizations of one of the K routes
  • Route types are specified by a Bezier Curve

Then

  • Use the EM algorithm to
    • Learn the template curves
    • Label probabilities for each observed trajectory
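
One way to write these assumptions down is the usual regression-mixture form of curve clustering. The notation below is an assumption made for illustration and does not come from the slides: trajectory j has $m_j$ points observed at times rescaled to $[0, 1]$, $P^{(k)}_p$ are the control points of route type k, $\sigma_k^2$ is its noise variance, and $\pi_k$ is its mixing weight.

```latex
% Given that trajectory j comes from route type k (z_{jk} = 1),
% each observed point is a point on the template Bezier curve plus noise:
y_{ji} = \sum_{p=0}^{n} b_{p,n}(t_{ji})\, P^{(k)}_{p} + \varepsilon_{ji},
\qquad \varepsilon_{ji} \sim N\!\left(0,\; \sigma_k^2 I_2\right),
\qquad i = 1, \dots, m_j

% Marginally, each trajectory follows a K-component mixture:
f(y_j) = \sum_{k=1}^{K} \pi_k \prod_{i=1}^{m_j}
  N\!\left( y_{ji} \,\middle|\, \sum_{p=0}^{n} b_{p,n}(t_{ji})\, P^{(k)}_{p},\; \sigma_k^2 I_2 \right)
```

Under this form, learning the template curves means estimating the $P^{(k)}_p$, and the label probabilities are the posterior probabilities of $z_{jk} = 1$.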

24 of 44

Expectation Step

In our case:

Observations are 2-dimensional curves of different lengths

Parameters are the control points of each Bezier curve and a variance parameter

Let $z_{jk} = 1$ when observation j is from cluster k, and 0 otherwise

Update the expected cluster membership for each observation
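
A minimal sketch of this update, under assumptions the slide does not state: isotropic Gaussian noise around each template curve, mixing weights `weights`, and time rescaled to [0, 1] within each trajectory. The function names, the cubic templates, and the toy trajectories are all hypothetical.

```python
from math import comb

import numpy as np


def bernstein_basis(t, degree):
    """Return the (len(t), degree + 1) matrix of Bernstein polynomials at t."""
    t = np.asarray(t, dtype=float)[:, None]
    i = np.arange(degree + 1)[None, :]
    coeffs = np.array([comb(degree, k) for k in range(degree + 1)])
    return coeffs * t**i * (1.0 - t) ** (degree - i)


def e_step(trajectories, control_points, sigma2, weights):
    """Return an (n_trajectories, K) matrix of expected cluster memberships."""
    K, n_ctrl, _ = control_points.shape
    log_resp = np.zeros((len(trajectories), K))
    for j, y in enumerate(trajectories):
        # Trajectories may have different lengths; each gets its own time grid.
        X = bernstein_basis(np.linspace(0.0, 1.0, len(y)), n_ctrl - 1)
        for k in range(K):
            resid = y - X @ control_points[k]
            # log(weight) + log-density of independent Gaussian noise on
            # every x and y coordinate of the trajectory
            log_resp[j, k] = (
                np.log(weights[k])
                - 0.5 * resid.size * np.log(2 * np.pi * sigma2[k])
                - 0.5 * np.sum(resid**2) / sigma2[k]
            )
    # Subtract the row maximum before exponentiating for numerical stability.
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    return resp / resp.sum(axis=1, keepdims=True)


# Toy templates (made up): a straight route and one that breaks sideways.
templates = np.array(
    [
        [[0, 0], [0, 5], [0, 10], [0, 15]],
        [[0, 0], [0, 5], [0, 10], [8, 10]],
    ],
    dtype=float,
)
rng = np.random.default_rng(0)
trajectories = [
    bernstein_basis(np.linspace(0, 1, m), 3) @ templates[k]
    + rng.normal(0, 0.3, (m, 2))
    for k, m in [(0, 25), (1, 40)]
]
resp = e_step(
    trajectories, templates, sigma2=np.array([0.1, 0.1]), weights=np.array([0.5, 0.5])
)
print(resp.round(3))
```

Each row of `resp` sums to 1 and plays the role of the expected value of $z_{jk}$ for one trajectory.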

25 of 44

Maximization Step

Update parameters for cluster k through

  • Weighted Least Squares
  • No intercept term
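
The update can be sketched as a weighted least-squares fit per cluster, with the expected memberships as weights and the Bernstein basis as the full design matrix (so no intercept column is added). This is an illustrative sketch: the function names, the variance and mixing-weight updates, and the toy data are assumptions, not details from the talk.

```python
from math import comb

import numpy as np


def bernstein_basis(t, degree):
    """Return the (len(t), degree + 1) matrix of Bernstein polynomials at t."""
    t = np.asarray(t, dtype=float)[:, None]
    i = np.arange(degree + 1)[None, :]
    coeffs = np.array([comb(degree, k) for k in range(degree + 1)])
    return coeffs * t**i * (1.0 - t) ** (degree - i)


def m_step(trajectories, resp, degree):
    """Refit control points, variances, and mixing weights given memberships."""
    K = resp.shape[1]
    designs = [
        bernstein_basis(np.linspace(0.0, 1.0, len(y)), degree) for y in trajectories
    ]
    control_points = np.empty((K, degree + 1, 2))
    sigma2 = np.empty(K)
    for k in range(K):
        r_k = resp[:, k]
        # Weighted normal equations: every point of trajectory j gets weight r_jk.
        xtwx = sum(r * (X.T @ X) for r, X in zip(r_k, designs))
        xtwy = sum(r * (X.T @ y) for r, X, y in zip(r_k, designs, trajectories))
        control_points[k] = np.linalg.solve(xtwx, xtwy)
        # Weighted residual sum of squares divided by the weighted count
        # of coordinates (two per observed point).
        sse = sum(
            r * np.sum((y - X @ control_points[k]) ** 2)
            for r, X, y in zip(r_k, designs, trajectories)
        )
        sigma2[k] = sse / sum(r * y.size for r, y in zip(r_k, trajectories))
    weights = resp.mean(axis=0)
    return control_points, sigma2, weights


# Toy data (made up): two noisy trajectories from each of two cubic templates.
templates = np.array(
    [
        [[0, 0], [0, 5], [0, 10], [0, 15]],
        [[0, 0], [0, 5], [0, 10], [8, 10]],
    ],
    dtype=float,
)
rng = np.random.default_rng(1)
labels = [0, 0, 1, 1]
lengths = [30, 45, 30, 45]
trajectories = [
    bernstein_basis(np.linspace(0, 1, m), 3) @ templates[k]
    + rng.normal(0, 0.2, (m, 2))
    for k, m in zip(labels, lengths)
]
resp = np.eye(2)[labels]   # hard memberships, used here only for the demo
control_points, sigma2, weights = m_step(trajectories, resp, degree=3)
print(control_points.round(1))
```

In a full EM loop this step would alternate with the expectation step until the memberships stop changing appreciably.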

26 of 44

Cluster Means

27 of 44

Route Labelling

28 of 44

Cluster 20 - Post

29 of 44

WRs with most Routes per Team

30 of 44

Full Process

31 of 44

Route Profiles

32 of 44

Position Comparisons

33 of 44

WR Usage

34 of 44

Air Yards Over Expectation

35 of 44

Air Yards Over Expectation (>100 Routes)

36 of 44

Who’s the Best at Running Routes?

37 of 44

Play Database

38 of 44

Play Database

We can look up a play's description, result, and other information based on:

  • Team
  • Players
  • Routes Run
  • Location

For example, we can pull film for:

  • Plays with an Out route and a Go route
  • Michael Thomas and Alvin Kamara lined up on the same side of the field
  • DeAndre Hopkins running a comeback
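
Two of the lookups above could be sketched as simple filters over labelled routes. The schema (one record per play and receiver, with the route label from the clustering step) and every row below are hypothetical toy values, not real plays or the authors' database.

```python
# Hypothetical schema: one record per (play, receiver). All rows are made up.
routes = [
    {"play_id": 1, "team": "NO", "player": "Michael Thomas", "route": "out"},
    {"play_id": 1, "team": "NO", "player": "Alvin Kamara", "route": "go"},
    {"play_id": 2, "team": "HOU", "player": "DeAndre Hopkins", "route": "comeback"},
    {"play_id": 3, "team": "HOU", "player": "DeAndre Hopkins", "route": "go"},
]


def plays_with_routes(routes, wanted):
    """Return ids of plays on which every route label in `wanted` was run."""
    labels_by_play = {}
    for row in routes:
        labels_by_play.setdefault(row["play_id"], set()).add(row["route"])
    return sorted(
        play_id
        for play_id, labels in labels_by_play.items()
        if set(wanted) <= labels
    )


def plays_for_player_route(routes, player, route):
    """Return ids of plays on which `player` ran the given route."""
    return sorted(
        {
            row["play_id"]
            for row in routes
            if row["player"] == player and row["route"] == route
        }
    )


print(plays_with_routes(routes, ["out", "go"]))
print(plays_for_player_route(routes, "DeAndre Hopkins", "comeback"))
```

The returned play ids would then be used to find the matching film clips.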

39 of 44

Thank You!

Any Questions?

Matthew Reyers

Email: matthew_reyers@sfu.ca

Twitter: @Stats_By_Matt

40 of 44

41 of 44

RB Usage

42 of 44

Targets Over Expectation

43 of 44

Targets Over Expectation (> 100 Routes)

44 of 44

Who’s the Best at Running Routes?
