Expected Goals in Hockey: A Review (2004-2020)

Josh & Luke Younggren

Evolving-Hockey.com

@EvolvingWild

xG Review | CBJHAC '20

Overview

Expected Goals: What is it?

Is Goal   Shot Distance   Shot Angle   Rush   Rebound
1         20              20           0      0
0         30              0            1      0
0         10              5            0      1
1         5               14           0      1
1         60              8            0      0
0         3               90           0      0
0         10              3            1      0
0         9               7            1      0
0         22              2            0      1
0         220             29           0      0
0         9               3            0      1
1         3               32           1      1

What:

  • Binary Classification problem
  • Predict the probability that a shot will become a goal
  • NHL’s RTSS (play-by-play) data

Use:

  • Player & Team evaluation
  • Predictive ability
  • Skater defense (remove the goalie)
  • Shooting ability
  • Bedrock for numerous systems for player evaluation

A Brief History

Shot Quality Models

  • 2004: Alan Ryder (Shot Quality)
  • 2009-10: Ken Krzywicki (Observer bias, Shot angles and rebounds)
  • 2011: Michael Schuckers (Goalie evaluation, Shot Quality)

Focus:

  • Historic goal expectancy
  • Evaluation of Shooting % from locations
  • RTSS bias / adjustments
  • Early team and player evaluation

Expected Goals

  • 2012: Brian Macdonald (Expected Goals)
  • 2015: Dawson Sprigings & Asmae Toumi
  • 2016: Emmanuel Perry (Corsica)
  • 2016: Peter Tanner (MoneyPuck)
  • 2016+: Cole Anderson, Matt Barlowe
  • 2018: Harry Shomer
  • 2018: EvolvingWild

Focus:

  • Statistical Modeling
  • Machine learning / Optimization
  • Feature selection
  • Shooting/Goalie work
  • Team/Player evaluation

The Data

NHL’s Real Time Scoring System (RTSS) Data

  • All data acquired using the Evolving-Hockey scraper: github.com/evolvingwild/evolving-hockey
  • Fenwick Shots (Goals, Missed Shots, Saved Shots)
    • Shot location not available for blocked shots
  • Other events:
    • Blocked Shots, Takeaways, Giveaways, Faceoffs, Hits
  • Shot attributes:
    • Shot Type (Wrist, Slap, Snap, Wrap-Around, Tip-In, Deflection)

Cleaning/Modifying the Data

  • Shots without coordinates removed
  • Distance (feet from the center of the goal line) / Angle (degrees off the center line that runs perpendicular to the goal line)
    • Assume all shots taken below center line
    • Recalculate if recorded zone is the defensive zone
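
As a rough illustration of the distance and angle calculation described above, the sketch below assumes RTSS-style coordinates with center ice at (0, 0) and the goal line 89 feet from center ice. The function name `shot_distance_angle`, the `zone` labels, and the handling of defensive-zone shots (measured to the far net) are assumptions made for this example, not the scraper's actual code.

```python
import math

# Assumption: x runs along the length of the ice with center ice at 0,
# and the goal line sits 89 feet from center ice.
GOAL_LINE_X = 89.0


def shot_distance_angle(x, y, zone="Off"):
    """Return (distance in feet, angle in degrees) for one shot.

    Hypothetical helper: `zone` is "Off", "Neu", or "Def" from the
    shooting team's point of view.
    """
    if zone == "Def":
        # Shot recorded in the defensive zone: measure to the far net.
        dx = GOAL_LINE_X + abs(x)
    else:
        # Otherwise assume the shot was taken in the attacking half.
        dx = GOAL_LINE_X - abs(x)
    distance = math.hypot(dx, y)
    # 0 degrees is straight out from the net; 90 is along the goal line;
    # values above 90 are shots from behind the goal line.
    angle = abs(math.degrees(math.atan2(y, dx)))
    return distance, angle


print(shot_distance_angle(69, 0))            # straight on, in close
print(shot_distance_angle(-40, 10, "Def"))   # long shot from the defensive zone
```

Using `atan2` rather than `atan` keeps the angle defined for shots taken from behind the goal line, where the offset along the ice becomes negative.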

Available Predictor Variables

From RTSS:

  • Shot distance
  • Shot angle
  • Shot type (deflected/tip/slap/backhand/snap/wrap/wrist)
  • Score state (goal differential: down / up relative to 0)
  • Strength state (if applicable)
  • Game time (period / seconds)
  • Home/Away team
  • Seconds since last event
  • Distance from last event
  • Prior event same team (all event types)
  • Prior event opposing team (all event types)
  • Coordinates (if applicable)
  • Prior coordinates (if applicable)

Created:

  • Rebound
    • Any shot that occurred within 2 seconds of a prior shot
  • Rush
    • Any shot that occurred within 6 seconds of either:
      • any event in the defensive zone or
      • any giveaway or takeaway
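
The two created features can be sketched as simple flags on each shot. This is a minimal sketch that reads the thresholds as "within 2 seconds" and "within 6 seconds"; the event labels (`SHOT`, `MISS`, `GOAL`, `GIVE`, `TAKE`), the zone labels, and the function name are hypothetical, and it does not check which team recorded the prior event.

```python
# Hypothetical event labels; the real play-by-play codes may differ.
SHOT_EVENTS = {"SHOT", "MISS", "GOAL"}
TURNOVER_EVENTS = {"GIVE", "TAKE"}


def flag_rebound_rush(seconds_since_last, prior_event, prior_zone):
    """Return (is_rebound, is_rush) as 0/1 flags for a single shot.

    prior_zone is the zone of the prior event from the shooting team's
    point of view ("Off", "Neu", or "Def").
    """
    # Rebound: the prior event was a shot no more than 2 seconds earlier.
    is_rebound = prior_event in SHOT_EVENTS and seconds_since_last <= 2
    # Rush: no more than 6 seconds after a defensive-zone event
    # or after any giveaway or takeaway.
    is_rush = seconds_since_last <= 6 and (
        prior_zone == "Def" or prior_event in TURNOVER_EVENTS
    )
    return int(is_rebound), int(is_rush)


print(flag_rebound_rush(1, "SHOT", "Off"))   # quick second shot
print(flag_rebound_rush(4, "TAKE", "Neu"))   # shot soon after a takeaway
```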


Modeling

The Models

Baseline (Model 0)

  • Logistic regression
  • Shot distance only

Alan Ryder (Model 1)

  • Logistic regression
  • Shot distance
  • Score state (down 3 through up 3)
  • Game period / Game Seconds
  • Shot type

glm Base (Model 2)

  • Logistic regression
  • [Prior features]
  • Polynomial transformation for distance and angle
  • Rush
  • Rebound

glmnet (Model 3)

  • Regularized logistic regression
  • [Prior features]
  • Prior events
  • Prior teams

XGBoost (Model 4)

  • Gradient boosting model
  • [Prior features]
  • Removed rush/rebound features
  • Coordinates
  • Prior coordinates
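
To make the simplest entry in this list concrete, here is a minimal sketch of a distance-only logistic regression in the spirit of Model 0. It is fit by plain gradient descent on the twelve illustrative rows shown earlier in the deck, not on the real 5v5 data, and the function names and optimizer settings are assumptions for this example. The models in the talk were presumably fit with standard library routines.

```python
import math

# (is_goal, shot_distance) pairs copied from the illustrative table in this deck.
SHOTS = [(1, 20), (0, 30), (0, 10), (1, 5), (1, 60), (0, 3),
         (0, 10), (0, 9), (0, 22), (0, 220), (0, 9), (1, 3)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_distance_model(shots, lr=0.5, n_iter=5000):
    """Fit P(goal) = sigmoid(b0 + b1 * standardized distance)."""
    n = len(shots)
    mean = sum(d for _, d in shots) / n
    sd = math.sqrt(sum((d - mean) ** 2 for _, d in shots) / n)
    xs = [(d - mean) / sd for _, d in shots]
    ys = [y for y, _ in shots]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        preds = [sigmoid(b0 + b1 * x) for x in xs]
        # Gradient of the mean log loss with respect to b0 and b1.
        g0 = sum(p - y for p, y in zip(preds, ys)) / n
        g1 = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1, mean, sd


def predict_xg(distance, b0, b1, mean, sd):
    """Expected-goal probability for a shot from `distance` feet."""
    return sigmoid(b0 + b1 * (distance - mean) / sd)


b0, b1, mean, sd = fit_distance_model(SHOTS)
for dist in (5, 30, 100):
    print(dist, round(predict_xg(dist, b0, b1, mean, sd), 3))
```

Each later model adds features or flexibility on top of this: more predictors, polynomial terms, regularization, and finally gradient boosting.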

Evaluation

Timeframe

  • 3 seasons, regular season only
    • ’15-16 / ’16-17 / ’17-18
  • 5v5
  • 247,373 total Fenwick shots

Metrics

  • Area Under the ROC Curve (AUC)
    • Higher is better (range 0 to 1)
  • Log loss
    • Lower is better (range 0 to ∞)
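
Both metrics can be computed directly from the 0/1 goal outcomes and the predicted probabilities. The sketch below is a plain-Python illustration with hypothetical function names. The pairwise AUC loop is easy to read but far too slow for a few hundred thousand shots, where a rank-based or library implementation would be used instead.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() finite at 0 and 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random goal is scored higher than a random non-goal."""
    goals = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_goals = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pg in goals:
        for pn in non_goals:
            if pg > pn:
                wins += 1.0
            elif pg == pn:
                wins += 0.5  # ties count as half a win
    return wins / (len(goals) * len(non_goals))


y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y, p))              # 0.75
print(round(log_loss(y, p), 4))
```

AUC only depends on how the shots are ranked, while log loss also penalizes probabilities that are badly scaled, which is why the two are reported together.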

Methods:

  • In-sample cross-validation
  • Future seasons (‘18-19 / ‘19-20)
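
The two evaluation methods amount to two ways of splitting the shots. The sketch below is illustrative only: the number of folds, the random seed, the `season` key, and the season labels are assumptions, since the deck does not state how the folds were built.

```python
import random


def k_fold_indices(n_rows, k=5, seed=42):
    """Yield (train_idx, test_idx) pairs; every row is tested exactly once."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def split_by_season(shots, train_seasons, future_seasons):
    """Separate the training seasons from the held-out future seasons."""
    train = [s for s in shots if s["season"] in train_seasons]
    future = [s for s in shots if s["season"] in future_seasons]
    return train, future


# Hypothetical shot records carrying only a season label.
shots = [{"season": s} for s in ("2015-16", "2016-17", "2017-18",
                                 "2018-19", "2019-20")]
train, future = split_by_season(
    shots, {"2015-16", "2016-17", "2017-18"}, {"2018-19", "2019-20"})
print(len(train), len(future))
for train_idx, test_idx in k_fold_indices(10, k=5):
    print(sorted(test_idx))
```

Cross-validation measures fit within the training seasons, while the future-season split checks whether that fit carries over to shots the model has never seen.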

Shot Distance

AUC: 0.7294 | LL: 0.1954

Ryder Model

AUC: 0.7409 | LL: 0.1943

glm Base

AUC: 0.7669 | LL: 0.1878

glmnet

AUC: 0.7675 | LL: 0.1882

XGBoost 3 year

AUC: 0.778 | LL: 0.1847

CV Model Comparison

Model           AUC      LL
Shot Distance   0.7294   0.1954
Ryder           0.7409   0.1943
glm Base        0.767    0.1878
glmnet          0.7675   0.1882
XGBoost 3       0.778    0.1847

Future Season Comparison (‘18-20)

Model           AUC      LL
Shot Distance   0.7319   0.2047
Ryder           0.7441   0.2028
glm Base        0.7656   0.1979
glmnet          0.7654   0.1983
XGBoost 3       0.7764   0.1947

Comparison

Correlation Matrix: 5v5 xG Comparison, ‘18-20
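
A correlation matrix like this one can be built by computing the Pearson correlation between every pair of models' xG values on the same set of shots. The sketch below assumes one list of values per model, aligned shot by shot; the function names are hypothetical and the numbers are made up for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(xg_by_model):
    """Nested dict: matrix[a][b] is the correlation between models a and b."""
    names = list(xg_by_model)
    return {a: {b: pearson(xg_by_model[a], xg_by_model[b]) for b in names}
            for a in names}


# Made-up per-shot xG values for three models, aligned by shot.
xg_by_model = {
    "Shot Distance": [0.04, 0.10, 0.02, 0.08, 0.15],
    "glm Base":      [0.03, 0.12, 0.02, 0.06, 0.20],
    "XGBoost 3":     [0.02, 0.14, 0.01, 0.05, 0.25],
}
for name, row in correlation_matrix(xg_by_model).items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```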

Correlation Matrix - Skater On-Ice Measure:

xGF per 60 / xGA per 60, ‘18-20, Min. 200 Minutes

Correlation Matrix - Skater On-Ice Measure:

xGF per 100FF / xGA per 100FA, ‘18-20, Min. 200 Minutes
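
The two skater on-ice measures differ only in the denominator: time on ice for the per-60 rates, and Fenwick shots for or against for the per-100 rates. A minimal sketch with hypothetical dictionary keys, applying the 200-minute minimum from the slide titles:

```python
MIN_TOI_MINUTES = 200  # minimum time on ice used in the comparison


def on_ice_rates(skater):
    """Rate stats for one skater, or None if below the minutes cutoff.

    `skater` is a dict with hypothetical keys: toi (minutes), xgf, xga,
    ff (Fenwick for), fa (Fenwick against).
    """
    if skater["toi"] < MIN_TOI_MINUTES:
        return None
    return {
        "xgf_per_60": skater["xgf"] / skater["toi"] * 60,
        "xga_per_60": skater["xga"] / skater["toi"] * 60,
        "xgf_per_100ff": skater["xgf"] / skater["ff"] * 100,
        "xga_per_100fa": skater["xga"] / skater["fa"] * 100,
    }


# Made-up on-ice totals for one skater.
print(on_ice_rates({"toi": 600, "xgf": 25.0, "xga": 20.0, "ff": 500, "fa": 400}))
```

The per-60 rates mix shot volume and shot quality, while the per-100-shot rates hold volume fixed, so the second view isolates how each model values the shots themselves.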

Team xGF%: ‘18-19, 5v5

Team xG +/- per Fenwick Shot: ‘18-19, 5v5

Magnitude: All Models

Calibration and Heatmaps

Models 0-1: Shot Distance / Ryder

Models 2-3: glm base / glmnet

Models 4+: XGBoost
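
Calibration is commonly checked by grouping shots into bins of predicted probability and comparing the average prediction in each bin with the observed goal rate. The sketch below shows that generic approach; the bin count and function name are assumptions, and the plots in this deck may have been built differently.

```python
def calibration_table(y_true, y_prob, n_bins=10):
    """Per-bin shot count, mean predicted xG, and observed goal rate."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[i].append((y, p))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        n = len(members)
        rows.append({
            "bin": (i / n_bins, (i + 1) / n_bins),
            "n": n,
            "mean_xg": sum(p for _, p in members) / n,
            "goal_rate": sum(y for y, _ in members) / n,
        })
    return rows


# Made-up outcomes and predictions.
y = [0, 0, 1, 0, 1, 1]
p = [0.05, 0.08, 0.07, 0.55, 0.52, 0.58]
for row in calibration_table(y, p):
    print(row)
```

A well-calibrated model has `mean_xg` close to `goal_rate` in every bin that contains enough shots to be meaningful.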

Wrapping Up

Biases

1. Seconds since last event

2. Distance / Rink bias

3. Shots from behind the goal line

Goals   Fenwick   % of All
68      825       0.59%

Model         xG
0: Distance   79.06
1: Ryder      69.30
2: glm Base   48.45
3: glmnet     40.86
4: XGB 3      63.32
4+: XGB 9     89.01
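
One way to read these figures is to compare each model's value with the 68 actual goals. The short calculation below assumes the per-model numbers are total xG summed over the 825 shots from behind the goal line, which the slide does not state explicitly. The label `4+: XGB 9` is a cleaned-up reading of the last row.

```python
GOALS = 68
FENWICK = 825

# Per-model values copied from the slide; assumed to be total xG on these shots.
MODEL_XG = {
    "0: Distance": 79.06,
    "1: Ryder": 69.30,
    "2: glm Base": 48.45,
    "3: glmnet": 40.86,
    "4: XGB 3": 63.32,
    "4+: XGB 9": 89.01,
}


def xg_minus_goals(model_xg, goals):
    """Positive values over-predict goals; negative values under-predict."""
    return {name: round(xg - goals, 2) for name, xg in model_xg.items()}


print("goal rate:", round(GOALS / FENWICK, 4))
for name, diff in xg_minus_goals(MODEL_XG, GOALS).items():
    print(name, diff)
```

Under that reading, the glm-based models under-predict these shots and the distance-only and nine-season XGBoost models over-predict them, with the Ryder model landing closest to the actual total.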

Takeaways

Objectives

  • What is the purpose of your model?
    • Interpretability / performance
  • Predictive vs. Inferential
  • Strength states

Data

  • Timeframe
  • Rolling / Multiple xG models
  • Have you identified all biases?

Time / Performance Tradeoff

  • Do you have a month?
  • More complicated model is generally “better”
  • Less complicated model is still pretty good

Future

  • Rink bias adjustment
  • Player tracking

References

  • “Shot Quality” – Alan Ryder (2004)

http://hockeyanalytics.com/Research_files/Shot_Quality.pdf

  • “Removing Observer Bias from Shot Distance” – Ken Krzywicki (2009)

http://hockeyanalytics.com/Research_files/SQ-DistAdj-RS0809-Krzywicki.pdf

  • “NHL Shot Quality 2009-10” – Ken Krzywicki (2010)

http://hockeyanalytics.com/Research_files/SQ-RS0910-Krzywicki.pdf

  • “DIGR: A Defense Independent Rating of NHL Goaltenders using Spatially Smoothed Save Percentage Maps” – Michael Schuckers (2011)

http://www.hockeyanalytics.com/Research_files/DIGR_Schuckers.pdf

  • “A Look at Shot Quality” – Michael Schuckers (2011)

https://www.arcticicehockey.com/2011/6/15/2224920/a-look-at-shot-quality

  • “An Expected Goals Model for Evaluating NHL Teams and Players” – Brian Macdonald (2012)

http://www.sloansportsconference.com/wp-content/uploads/2012/02/NHL-Expected-Goals-Brian-Macdonald.pdf

  • “Expected Goals are a better predictor of future scoring than Corsi, Goals” – Dawson Sprigings & Asmae Toumi (2015)

https://hockey-graphs.com/2015/10/01/expected-goals-are-a-better-predictor-of-future-scoring-than-corsi-goals/

  • “About” – Moneypuck/Peter Tanner

http://moneypuck.com/about.htm

  • “NHL Expected Goals Model” – Matt Barlowe (2017)

https://rstudio-pubs-static.s3.amazonaws.com/311470_f6e88d4842da46e9941cc6547405a051.html

  • “Expected Goals (xG), Uncertainty, and Bayesian Goalies” – Cole Anderson (2017)

http://www.crowdscoutsports.com/game-theory/expected-goal-xg-model/

  • “Expected Goals Model” – Harry Shomer (2018)

http://fooledbygrittiness.blogspot.com/2018/01/expected-goals-model.html

  • “Evaluating my Shooter xG model” – Harry Shomer (2018)

http://fooledbygrittiness.blogspot.com/2018/03/evaluating-my-shooter-xg-model.html

  • “A New Expected Goal Model for Predicting Goals in the NHL” – Josh & Luke Younggren (2018)

https://rpubs.com/evolvingwild/395136/

Thank You!