2 of 32

Table of Contents

  1. Background
  2. Variable Selection
  3. Model Selection
  4. Model Results
  5. Batter BABIP Changes
  6. Pitcher BABIP Changes
  7. Model Analysis
  8. Risk Assessment
  9. Future Model Improvements

3 of 32

xBABIP Background

  • Modeling of xBABIP goes back to FanGraphs in 2008
  • Models evolve with the introduction of new statistics
    • Batted ball profiles
    • Bat impact (Hard%)
    • Speed
    • Statcast launch speed and launch angle

4 of 32

Variable Choices

  • Data from Baseball Savant and FanGraphs
    • Statcast
    • Shifts
    • Situational
  • Weather Data
  • Biographical Data
  • Calculated Fields
    • Batted ball direction
    • Hang time
    • Home park factor
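
The calculated fields above can be approximated from public Statcast columns. The sketch below is only an illustration and is not necessarily how these fields were built for this model. It assumes the Baseball Savant hit coordinates `hc_x` and `hc_y`, a commonly used approximate home-plate origin of (125.42, 198.27) in those coordinates, a drag-free trajectory, and a hypothetical 3 ft contact height.

```python
import math

G_FT_PER_S2 = 32.174            # gravitational acceleration
MPH_TO_FT_PER_S = 5280 / 3600   # 1 mph is about 1.4667 ft/s

def spray_angle_deg(hc_x, hc_y, home_x=125.42, home_y=198.27):
    """Horizontal direction in degrees: 0 is straight up the middle,
    negative is toward left field, positive is toward right field.
    The home-plate origin is an assumed approximation."""
    return math.degrees(math.atan2(hc_x - home_x, home_y - hc_y))

def hang_time_s(launch_speed_mph, launch_angle_deg, contact_height_ft=3.0):
    """Drag-free time until the ball reaches ground level.
    Ignoring air resistance is a simplifying assumption."""
    v_up = (launch_speed_mph * MPH_TO_FT_PER_S
            * math.sin(math.radians(launch_angle_deg)))
    # Positive root of: contact_height + v_up*t - 0.5*g*t^2 = 0
    return (v_up + math.sqrt(v_up ** 2 + 2 * G_FT_PER_S2 * contact_height_ft)) / G_FT_PER_S2

up_the_middle = spray_angle_deg(125.42, 100.0)
line_drive_hang = hang_time_s(90.1, 10.2)
```

Because the quadratic includes the contact height, the hang-time function also returns a small positive value for ground balls hit at a negative launch angle.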

[Figure: Batted Ball Direction]

5 of 32

Variable Choices

  • Effect of Closed Stadiums - X
  • Batted Ball Type - X
    • Groundball, Line Drive, etc.
  • Batted Ball Directional - X
    • Pull%, Mid%, Oppo%

[Figure: Hang Time]

6 of 32

Model Ideas

  • Batter model
  • Pitcher model composed of four main pitch types
    • Expected BABIP for each pitch type is combined into a weighted average for each pitcher
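
A minimal sketch of the weighted-average step for pitchers follows. The slides do not state which weights were used, so weighting by pitch usage is an assumption here, and the usage shares and per-pitch expected BABIPs in the example are invented.

```python
def pitcher_xbabip(pitch_types):
    """Combine pitch-type expected BABIPs into one pitcher-level value.

    pitch_types maps a pitch code to (weight, expected BABIP).
    Weights are normalized, so usage rates or raw counts both work.
    """
    total_weight = sum(weight for weight, _ in pitch_types.values())
    return sum(weight * xbabip for weight, xbabip in pitch_types.values()) / total_weight

# Hypothetical pitcher: every number below is invented for illustration.
example = {
    "FA": (0.55, 0.300),
    "CH": (0.20, 0.280),
    "CU": (0.10, 0.310),
    "SL": (0.15, 0.290),
}
overall = pitcher_xbabip(example)
```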

7 of 32

Model Selection

  • Standardized variables
    • Scales variables to account for multicollinearity and unit discrepancies
  • Ran a probit regression
    • Models the probability of a binary outcome (hit = 1, out = 0)
    • Fit at the pitch level
  • Ran 2015 and 2016 data to test the required sample size
    • Batters require about 800 BIP to stabilize
    • Pitchers require about 2000 BIP to stabilize
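
The two modeling steps above, standardizing and then fitting a probit, can be sketched with the Python standard library alone. This is a one-variable illustration under stated assumptions, not the model from these slides: the data are invented, and the fit uses plain Fisher scoring instead of whatever statistical package was actually used.

```python
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1

def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def fit_probit(x, y, iterations=25):
    """Fit P(hit) = Phi(b0 + b1*x) by Fisher scoring; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            # Clamp the probability to avoid dividing by zero.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(eta)
            w = d / (p * (1 - p))
            g0 += w * (yi - p)          # score for the intercept
            g1 += w * (yi - p) * xi     # score for the slope
            i00 += w * d                # expected information terms
            i01 += w * d * xi
            i11 += w * d * xi * xi
        det = i00 * i11 - i01 * i01
        # Update: beta += inverse(information) @ score, for the 2x2 case
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1

# Invented batted balls: launch speed in mph and hit (1) or out (0).
launch_speed = [70, 75, 80, 85, 88, 90, 92, 95, 98, 100, 103, 105, 108, 110]
hit = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

z = standardize(launch_speed)
b0, b1 = fit_probit(z, hit)
p_hit = [STD_NORMAL.cdf(b0 + b1 * zi) for zi in z]
```

Each fitted value in `p_hit` is a hit probability for one batted ball. Averaging those probabilities over a player's balls in play is one way to arrive at an expected BABIP.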

8 of 32

Model Application

  • Testing 2015-16 model against 2017 data

[Figure: Batters]

[Figure: Pitchers]

9 of 32

Model Application

  • Grouped pitches by batter and season
    • Standardized variables
    • Applied model to grouped statistics
  • Grouped pitches by pitcher, season, and pitch type
    • Standardized variables
    • Applied model to grouped statistics
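
The group, standardize, and apply sequence above might look like the following for batters. The rows and coefficients are hypothetical. As a simplification, the sketch standardizes the grouped means within the sample being scored, which may differ from how the 2017 data were actually scaled.

```python
from collections import defaultdict
from statistics import NormalDist, mean, pstdev

# Hypothetical pitch-level rows: (batter, season, launch speed mph, launch angle deg)
rows = [
    ("A", 2017, 92.0, 12.0), ("A", 2017, 88.0, 8.0),
    ("B", 2017, 80.0, 4.0),  ("B", 2017, 84.0, 10.0),
    ("C", 2017, 86.0, 15.0), ("C", 2017, 90.0, 11.0),
]

# Hypothetical probit coefficients on standardized variables.
CONST, B_SPEED, B_ANGLE = -0.5, 0.3, 0.05

# 1. Group pitches by batter and season, then average each variable.
groups = defaultdict(list)
for batter, season, speed, angle in rows:
    groups[(batter, season)].append((speed, angle))
keys = list(groups)
speed_means = [mean(s for s, _ in groups[k]) for k in keys]
angle_means = [mean(a for _, a in groups[k]) for k in keys]

# 2. Standardize the grouped statistics.
def standardize(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

speed_z = standardize(speed_means)
angle_z = standardize(angle_means)

# 3. Apply the model: expected BABIP = Phi(linear index).
xbabip = {
    key: NormalDist().cdf(CONST + B_SPEED * sz + B_ANGLE * az)
    for key, sz, az in zip(keys, speed_z, angle_z)
}
```

The pitcher version would add pitch type to the grouping key and score each pitch type with its own coefficients.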

10 of 32

Park Factors (Batters)

  1. Radar Inconsistencies
  2. Park Dimensions
  3. Climate

[Figure: Park Factor T-Stats]

11 of 32

Batter Results (T-Stats)

  Variable                  T-Stat
  Handed                      5.11*
  Matchup                     0.46
  Runner on First             4.67*
  Runner on Second           -5.90*
  Runner on Third             0.05
  Batted Ball Direction      -6.80*
  Batted Ball Direction²      0.38
  Launch Speed              146.22*
  Launch Angle               19.37*
  Hang Time                 -35.47*
  Sprint Speed                9.91*
  Age                        -7.91*
  Shift                      -2.36*
  Temp                       -4.69*
  Humidity                    0.04*
  Wind Speed                  1.59
  Air Density                -5.57*
  Constant                   58.95*

*Significant at the 0.05 level
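
As a rough guide to reading the t-statistics, a two-sided p-value can be approximated from the standard normal distribution, so a coefficient clears the 0.05 level when its t-statistic exceeds about 1.96 in absolute value. The normal approximation is an assumption that is reasonable for large pitch-level samples. The slides do not say how significance was computed.

```python
from statistics import NormalDist

def two_sided_p(t_stat):
    """Approximate two-sided p-value using the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))

def significant(t_stat, alpha=0.05):
    """True when the approximate p-value falls below alpha."""
    return two_sided_p(t_stat) < alpha

# T-statistics taken from the batter results: Shift and Wind Speed.
shift_significant = significant(-2.36)
wind_significant = significant(1.59)
```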

12 of 32

Importance of Launch Speed and Angle

Variable T-Statistics

Launch speed: 146.22

Launch angle: 19.37

13 of 32

Pitcher Results (T-Stats)

  Variable                      FA       CH       CU       SL
  Handed                      0.08    -0.17    -0.26     0.68
  Matchup                    -3.39*   -0.19     0.79    -1.99*
  Runner on First             4.28*    1.93     2.08*    2.00*
  Runner on Second           -5.18*   -1.08    -2.03*   -2.48*
  Runner on Third             1.13     0.20     0.16    -0.69
  Velo                       -1.00    -1.66    -0.77    -1.24
  Batted Ball Direction       -4.6*   -2.51*   -2.61*   -2.33*
  Batted Ball Direction²      0.62    -0.15    -0.91     1.53
  Launch Speed              117.72*   48.54*   39.91*   52.24*
  Launch Angle                15.7*    3.89*    4.41*    6.47*
  Hang Time                 -28.96*   -8.54*   -6.99*  -13.57*

*Significant at the 0.05 level

14 of 32

Pitcher Results (T-Stats), continued

  Variable                      FA       CH       CU       SL
  X Movement                  0.10     0.03     0.11     0.75
  Z Movement                  -0.9    -1.92    -0.15    -0.25
  Spin Rate                   0.28     1.50    -0.24     1.29
  Age                        -0.31    -0.74    -0.47    -1.05
  Shift                      -0.29    -2.08*   -0.61     0.09
  GB%                        -6.09*   -2.22*   -0.68    -0.54
  Temp                       -2.91*   -1.89*    0.65    -3.54*
  Humidity                   -0.61     1.22    -1.18    -1.53
  Wind Speed                  0.55     2.13*    0.50    -0.25
  Air Density                -3.44*   -2.74*   -0.06    -4.09*
  Constant                   50.11*   18.41*   14.85*   22.13*

*Significant at the 0.05 level

15 of 32

Aging Curve

[Figure: Batters]

[Figure: Pitchers]

16 of 32

5 Batters Expected to Positively Regress

  1. Joc Pederson (Actual .241 Predicted .336)
  2. Maikel Franco (Actual .234 Predicted .326)
  3. Scott Schebler (Actual .248 Predicted .336)
  4. Jose Osuna (Actual .254 Predicted .337)
  5. Pablo Sandoval (Actual .240 Predicted .320)
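
The ranking above follows from the gap between predicted and actual BABIP. A short sketch using the five values listed on this slide:

```python
# (actual BABIP, predicted BABIP) as listed on this slide
batters = {
    "Joc Pederson": (0.241, 0.336),
    "Maikel Franco": (0.234, 0.326),
    "Scott Schebler": (0.248, 0.336),
    "Jose Osuna": (0.254, 0.337),
    "Pablo Sandoval": (0.240, 0.320),
}

# Positive gap: the model expects BABIP to rise toward the prediction.
gaps = {
    name: round(predicted - actual, 3)
    for name, (actual, predicted) in batters.items()
}

# Largest expected improvement first.
ranked = sorted(gaps, key=gaps.get, reverse=True)
```

Sorting by the gap reproduces the order of the list, with Pederson's +.095 the largest.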

17 of 32

Joc Pederson, 25

  • Actual BABIP of .241
  • Projected BABIP of .336 (+.095)
  • Launch speed of 90.1 mph
  • Launch angle of 10.2°
  • Left-Handed

18 of 32

Batters Positively Regressing

Maikel Franco

  • Actual .234
  • Predicted .326
  • Batted ball direction: -6.8°

Scott Schebler

  • Actual .248
  • Predicted .336
  • Sprint speed: 28.4 mph

Jose Osuna

  • Actual .254
  • Predicted .337
  • Shift rate: 5.9%

Pablo Sandoval

  • Actual .240
  • Predicted .320
  • Launch speed: 89.4 mph

19 of 32

5 Batters Expected to Negatively Regress

  1. Delino DeShields (Actual .358 Predicted .213)
  2. Ezequiel Carrera (Actual .358 Predicted .238)
  3. Dee Gordon (Actual .354 Predicted .260)
  4. Ender Inciarte (Actual .339 Predicted .250)
  5. Joey Rickard (Actual .303 Predicted .220)

20 of 32

Delino DeShields, 24

  • Actual BABIP of .358
  • Projected BABIP of .213 (-.145)
  • Launch speed of 79.2 mph
  • Hang time of 2.0 seconds
  • Shift rate of 22.6%

21 of 32

Batters Negatively Regressing

Ezequiel Carrera

  • Actual .358
  • Predicted .238
  • Launch speed: 81.9 mph

Dee Gordon

  • Actual .354
  • Predicted .260
  • Launch speed: 79.3 mph

Ender Inciarte

  • Actual .339
  • Predicted .250
  • Launch speed: 81.2 mph

Joey Rickard

  • Actual .303
  • Predicted .220
  • Hang time: 2.43 seconds

22 of 32

5 Pitchers Expected to Positively Regress

  1. Rafael Montero (Actual .366 Predicted .245)
  2. Jameson Taillon (Actual .352 Predicted .265)
  3. Anibal Sanchez (Actual .354 Predicted .271)
  4. Clayton Richard (Actual .351 Predicted .273)
  5. Daniel Norris (Actual .343 Predicted .273)

23 of 32

Rafael Montero, 26

  • Actual BABIP of .366
  • Projected BABIP of .245 (-.121)
  • Launch speed off fastball: 83.2 mph
  • Launch angle off fastball: 7.4°
  • 48.1% groundball rate
  • FA Rate: 55.3%

24 of 32

Pitchers Positively Regressing

Jameson Taillon

  • Actual .352
  • Predicted .265
  • CU Launch speed: 83.6 mph
  • CU Rate: 26.6%

Anibal Sanchez

  • Actual .354
  • Predicted .271
  • CH Hang time: 3.6 sec
  • CH Rate: 20.9%

Clayton Richard

  • Actual .351
  • Predicted .273
  • FA Launch angle: -1.30°
  • FA Rate: 71.57%

Daniel Norris

  • Actual .343
  • Predicted .273
  • FA Hang time: 2.84 sec
  • FA Rate: 55%

25 of 32

5 Pitchers Expected to Negatively Regress

  1. Jake Odorizzi (Actual .227 Predicted .296)
  2. Cole Hamels (Actual .251 Predicted .301)
  3. James Shields (Actual .270 Predicted .320)
  4. Mike Pelfrey (Actual .276 Predicted .317)
  5. Erasmo Ramirez (Actual .269 Predicted .310)

26 of 32

Jake Odorizzi, 27

  • Actual Curve and Slider BABIPs of .059 and .087 are expected to be much higher
  • Launch speed off Slider: 84.4 mph
  • Launch speed off Curve: 86.4 mph

27 of 32

Pitchers Negatively Regressing

Cole Hamels

  • Actual .251
  • Predicted .301
  • FA Launch angle: 11.17°
  • FA Rate: 65.5%

James Shields

  • Actual .270
  • Predicted .320
  • FA Launch speed: 89.98 mph
  • FA Rate: 66.74%

Mike Pelfrey

  • Actual .276
  • Predicted .319
  • FA Launch speed: 88.9 mph
  • FA Rate: 73.8%

Erasmo Ramirez

  • Actual .269
  • Predicted .310
  • CH Launch angle: 11.9°
  • CH Rate: 20.65%

28 of 32

Risk

  • Variability of year-to-year BABIP
  • Error bars applied to predicted player BABIP
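
The slides do not say how the error bars were built. One common approach, shown here purely as an assumption, treats each ball in play as a binomial trial and places a normal-approximation interval around the predicted BABIP. The 400 balls in play in the example is a hypothetical sample size.

```python
import math

def babip_interval(predicted, balls_in_play, z=1.96):
    """Approximate 95% interval from the binomial standard error.
    This is an assumed method, not necessarily the one used in the slides."""
    se = math.sqrt(predicted * (1 - predicted) / balls_in_play)
    return predicted - z * se, predicted + z * se

# Predicted BABIP of .336 with a hypothetical 400 balls in play.
low, high = babip_interval(0.336, 400)
```

Under this approach the interval narrows as balls in play accumulate.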

Positive Regression Pitcher Safety

  1. Richard
  2. Sanchez
  3. Taillon
  4. Norris

29 of 32

Risk

Positive Regression Batter Safety

  1. Sandoval
  2. Pederson
  3. Franco

Safety is unmeasured for players in their first qualified season because they lack historical confidence

30 of 32

Model Adjustments

  • Inclusion of non-public data
    • TrackMan's bearing variable
    • Per-pitch shift data and the position of each fielder
  • Inclusion of more years of data
    • More stable BABIP

31 of 32

References

  • https://www.fangraphs.com/blogs/michael-confortos-barreled-balls-werent-ideal/
    • Interpreted the importance of hit direction and pulled hits.
  • https://www.fangraphs.com/tht/hit-to-your-strengths/
    • Researched the value of “going with the pitch” and batters’ swing paths.
  • https://www.fangraphs.com/fantasy/gettin-shifty-with-it-introducing-the-new-xbabip/
    • Showed ways to quantify the shift into BABIP.
  • https://www.fangraphs.com/library/pitching/babip/
    • Defined BABIP.
  • https://www.fangraphs.com/tht/research-notebook-new-format-for-statcast-data-export-at-baseball-savant/
    • Described how to export our Statcast data most efficiently.
  • https://www.fangraphs.com/tht/using-statcast-data-to-predict-hits/
    • Displayed which variables are relevant when modeling BABIP.
  • https://www.baseballprospectus.com/news/article/29210/prospectus-feature-the-need-for-adjusted-exit-velocity/
    • Explained the need for adjusted exit velocity.
  • http://www.mattefay.com/imputing-missing-statcast-data
    • Explained how to use random forest models when dealing with missing Statcast data.

32 of 32

Questions?