1 of 14

Hoops and Algorithms: Predicting NBA Winners

2 of 14

Big Picture

Expanding Accessibility & Fan Engagement

Feature Variables

  • Time left
  • Score of the game
  • Skill of the teams

Real-time predictions of an NBA game

Uses live data to calculate win probability

Assists in informed strategy decisions

Enhances live sports broadcasting

3 of 14

Historical Model Process

Clean Data

Training

4 of 14

Historical Model

5 of 14

Historical Model

6 of 14

Data Source

The input data arrives through an Airflow pipeline that pulls it from a public API

Key Features

Feature variables: the score difference, the time left in the game, and the pregame win probability

Model Training

Trained the model on past in-game data
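
As a rough illustration of this training step, the sketch below fits a logistic regression on the three features named above. The choice of logistic regression, the synthetic data, the feature scaling, and every function name here are assumptions made for the example; the slides do not specify the model type or the real data layout.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(weights, features):
    """Home-team win probability; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def train(rows, labels, lr=0.1, epochs=300):
    """Fit logistic regression with plain batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def make_synthetic_games(n=500, seed=0):
    """Synthetic stand-in for past in-game snapshots (not real NBA data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        score_diff = rng.uniform(-20, 20) / 10.0  # home minus away, in units of 10 points
        time_left = rng.random()                  # fraction of the game remaining
        pregame = rng.uniform(0.2, 0.8)           # pregame home win probability
        true_z = 1.5 * score_diff + 2.0 * (pregame - 0.5)
        labels.append(1 if rng.random() < sigmoid(true_z) else 0)
        rows.append((score_diff, time_left, pregame))
    return rows, labels


rows, labels = make_synthetic_games()
weights = train(rows, labels)

# Same moment of the game, evenly matched teams, home team up or down 10.
leading = predict(weights, (1.0, 0.25, 0.5))
trailing = predict(weights, (-1.0, 0.25, 0.5))
print(f"up 10: {leading:.2f}, down 10: {trailing:.2f}")
```

In practice a library such as scikit-learn would replace the hand-written gradient descent; it is spelled out here only to keep the example dependency-free.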

Visualization

Built a matplotlib plot that shows how the win probability changes with the time left in the game
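
A plot like this could be produced along the following lines. The coefficients, the sample timeline, and the helper names are all hypothetical placeholders, and the logistic form of the model is an assumption; only the three features and the use of matplotlib come from the slides.

```python
import math

# Hypothetical coefficients, chosen only so the example produces a readable curve.
COEFFS = {"intercept": -1.0, "score_diff": 0.15, "time_left": 0.0, "pregame_prob": 2.0}


def win_probability(score_diff, time_left, pregame_prob, coeffs=COEFFS):
    """Home win probability from a logistic model over the three features."""
    z = (
        coeffs["intercept"]
        + coeffs["score_diff"] * score_diff
        + coeffs["time_left"] * time_left
        + coeffs["pregame_prob"] * pregame_prob
    )
    return 1.0 / (1.0 + math.exp(-z))


def probability_curve(timeline, pregame_prob):
    """Turn (minutes_left, score_diff) snapshots into parallel x and y lists."""
    minutes = [m for m, _ in timeline]
    probs = [win_probability(diff, m, pregame_prob) for m, diff in timeline]
    return minutes, probs


def plot_curve(minutes, probs):
    """Draw win probability against minutes left and return the figure."""
    # Imported lazily so the helpers above work without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(minutes, probs, marker="o")
    ax.invert_xaxis()  # time left counts down, so the game reads left to right
    ax.axhline(0.5, linestyle="--", linewidth=1)
    ax.set_ylim(0, 1)
    ax.set_xlabel("Minutes left")
    ax.set_ylabel("Home win probability")
    return fig


# Made-up game: (minutes left, home score minus away score).
timeline = [(48, 0), (36, -4), (24, 3), (12, 8), (0, 5)]
minutes, probs = probability_curve(timeline, pregame_prob=0.5)
```

Calling `plot_curve(minutes, probs)` and then `savefig` on the returned figure would write the chart to disk.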

Building a Real-Time Model

7 of 14

Real-Time Analysis

8 of 14

Implementing an ELT Pipeline

9 of 14

Snowflake

10 of 14

dbt

11 of 14

DAGs and Cosmos DAG

12 of 14

Challenges

Data Types

Different Data Types

RAM

RAM limitations

Integrating DAGs

Integrating multiple DAGs into Cosmos' single DAG structure

13 of 14

Next Steps

  • We can use our models with scraped live game data to predict the winner of a game that is currently in progress

Limitations in our current model:

  • Computational constraints limit how much data we can process
  • It is hard to capture all relevant factors in a fast-paced game

Future Improvements:

  • Incorporate more advanced features, such as player matchup data and team-specific strategies
  • Create a daily DAG that combines the historical and live timed data and retrains the timed model, then use Airflow XComs to carry the new coefficients over to the model in the live game DAG
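
The coefficient handoff proposed under Future Improvements could be prototyped without Airflow first. In the sketch below, a JSON round trip stands in for the XCom push and pull, since XCom values generally need to be small and serializable. The payload layout, the function names, and the coefficient values are hypothetical, and the logistic form of the model is an assumption.

```python
import json
import math

FEATURES = ("score_diff", "time_left", "pregame_prob")


def export_coefficients(intercept, weights):
    """Pack fitted parameters into a plain dict of floats, small enough for an XCom."""
    return {
        "intercept": float(intercept),
        "weights": {name: float(w) for name, w in zip(FEATURES, weights)},
    }


def predict_from_coefficients(payload, game_state):
    """Rebuild the prediction in the live DAG from the coefficients alone."""
    z = payload["intercept"] + sum(
        payload["weights"][name] * game_state[name] for name in FEATURES
    )
    return 1.0 / (1.0 + math.exp(-z))


# The daily retraining task would return something like this (values are made up).
payload = export_coefficients(-1.0, [0.15, 0.0, 2.0])

wire = json.dumps(payload)   # stands in for the XCom push
restored = json.loads(wire)  # stands in for the XCom pull in the live game DAG

p = predict_from_coefficients(
    restored, {"score_diff": 6, "time_left": 10.0, "pregame_prob": 0.5}
)
print(f"home win probability: {p:.3f}")
```

Because XComs are normally read within a single DAG run, pulling a value written by a different DAG takes extra configuration; storing the coefficients in a Snowflake table or an Airflow Variable would be a possible alternative with the same payload.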

14 of 14

Thank You!