Introduction
Knowing the probability that a shot will be made is essential in basketball. Many factors can affect that probability, such as shot distance, nearest-defender distance, shot type, time left in the quarter, and whether the game is home or away. In this paper, we predict the probability of a shot being successful for 133 players during the 2014-15 NBA regular season. After fitting classical machine learning models, we applied three other methods in an attempt to improve accuracy and compared the results. Because a subset of our predictors affects each player differently, we used mixed effects models: a generalized linear mixed effects model (Julia), a mixed effects random forest model (MERF Python package), and Bayesian modeling (RStan).
Materials and methods
= Vector of prediction coefficients for player j
= Predictor of the ith shot for player j
𝛽 = A vector of fixed effects coefficients
X = A vector of fixed effects variables
u = A vector of random effects variables
𝑍 = The random effects design matrix
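For reference, the symbols above fit the standard form of a generalized linear mixed effects model. The block below is a sketch of that textbook form, not necessarily the exact specification fitted here: the logit link (natural for a binary made/missed outcome), the normal distribution of the random effects, and the covariance symbol G are assumptions that this text does not state.

```latex
% Textbook GLMM form; the logit link and u ~ N(0, G) are assumptions.
\begin{align*}
  g\bigl(\mathbb{E}[y \mid u]\bigr) &= X\beta + Zu, \\
  g(p) &= \log\frac{p}{1 - p}, \\
  u &\sim N(0, G).
\end{align*}
```

Under this form, the fixed effects coefficients 𝛽 are shared by all players, while the random effects u shift the linear predictor for each player.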
$y_j = f(X_j) + Z_j b_j + e_j$
$f(\cdot)$ = Random forest regression function for fixed effects
$X_j$ = $(n_j \times p)$ matrix of fixed effects covariates for player $j$,
where $n_j$ is the number of shots taken by player $j$ and $p$ is the number of fixed effect covariates
$Z_j$ = $(n_j \times q)$ matrix of random effects covariates for player $j$, where
$q$ is the number of random effect covariates
$b_j$ = $(q \times 1)$ vector of random effect coefficients for player $j$, $b_j \sim N(0, \sigma_b^2)$
$e_j$ = $(n_j \times 1)$ vector of errors for player $j$, $e_j \sim N(0, \sigma_e^2)$
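To make the dimensions in the mixed effects random forest model concrete, the following minimal sketch simulates data for a single player from that model using only the Python standard library. It is an illustration under stated assumptions, not the fitting procedure of the MERF package: the function `f` is a hypothetical smooth stand-in for the random forest, the covariates are drawn from a standard normal, and the first column of the random effects matrix is assumed to be an intercept.

```python
import math
import random


def f(x):
    """Hypothetical nonlinear fixed-effects function (stand-in for a random forest)."""
    return sum(math.sin(v) for v in x)


def simulate_player(n_j, p, q, sigma_b, sigma_e, rng):
    """Simulate n_j observations for one player from y_j = f(X_j) + Z_j b_j + e_j."""
    # X_j: (n_j x p) matrix of fixed effects covariates.
    X_j = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n_j)]
    # Z_j: (n_j x q) matrix of random effects covariates.
    # Assumption: the first column is a per-player intercept.
    Z_j = [[1.0] + [rng.gauss(0, 1) for _ in range(q - 1)] for _ in range(n_j)]
    # b_j: (q x 1) random effect coefficients, each drawn from N(0, sigma_b^2).
    b_j = [rng.gauss(0, sigma_b) for _ in range(q)]
    # e_j: (n_j x 1) errors, each drawn from N(0, sigma_e^2).
    e_j = [rng.gauss(0, sigma_e) for _ in range(n_j)]
    # Response: fixed-effects part plus player-specific part plus noise.
    y_j = [
        f(x) + sum(z_k * b_k for z_k, b_k in zip(z, b_j)) + e
        for x, z, e in zip(X_j, Z_j, e_j)
    ]
    return X_j, Z_j, b_j, y_j


if __name__ == "__main__":
    rng = random.Random(0)
    X_j, Z_j, b_j, y_j = simulate_player(n_j=5, p=3, q=2, sigma_b=1.0, sigma_e=0.5, rng=rng)
    print(len(X_j), len(X_j[0]), len(Z_j[0]), len(b_j), len(y_j))
```

Each player gets their own draw of the random effect coefficients, which is what lets the same covariate (for example, shot type) affect different players differently.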
Results
Conclusions
Introducing random effects into models for shot classification is thought-provoking. All three hierarchical techniques yielded intriguing results, and the MERF package even showed improvement over traditional models. Not only are the results significant, but the models are also valuable because the random effects coefficients can be used to understand player strengths and weaknesses.
Potential Benefits
Areas for Improvement
Dan Barlow, James Bury, Spencer Siegel (University of North Carolina at Chapel Hill)
Hierarchical Modeling to Predict Shot Outcome in NBA Games
Acknowledgments
Special thanks to Dr. Richard L. Smith of University of North Carolina at Chapel Hill for providing guidance and support throughout the project.
Data were obtained from Kaggle, https://eightthirtyfour.com/data, and Basketball Reference.
Literature cited
computing. R Foundation for Statistical Computing, Vienna, Austria.
https://github.com/manifoldai/merf/tree/master/merf
Mixed Effects Random Forest (MERF)
Example plot of a random effect (points type: 2- or 3-pointer):
Players in far right bin: Steph Curry, Joe Johnson, Brandon Knight, Tim Duncan, Al Jefferson
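One way to read such a plot is to bin the per-player random effect coefficients into equal-width histogram bins and list the players who fall in the right-most bin. The sketch below illustrates that idea only. The helper `far_right_bin`, the bin count, and the coefficient values are all hypothetical, and the player names are placeholders rather than fitted results.

```python
def far_right_bin(coefs, n_bins=10):
    """Return names whose coefficient lies in the right-most of n_bins equal-width bins."""
    lo, hi = min(coefs.values()), max(coefs.values())
    if hi == lo:
        # All coefficients are identical, so every player shares one bin.
        return sorted(coefs)
    width = (hi - lo) / n_bins
    left_edge = hi - width  # lower boundary of the right-most bin
    return sorted(name for name, c in coefs.items() if c >= left_edge)


if __name__ == "__main__":
    # Hypothetical per-player random effect coefficients (not real results).
    example = {
        "player_a": -0.30,
        "player_b": 0.05,
        "player_c": 0.42,
        "player_d": 0.40,
        "player_e": -0.10,
    }
    print(far_right_bin(example))  # ['player_c', 'player_d']
```

Players in the right-most bin are those whose fitted coefficient for the chosen covariate is largest relative to the other players.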
Generalized Linear Mixed Effects Model