1 of 19

Imputing Arm Angle from Readily Available Metrics

Eno Sarris, Matt Dennewitz: Saber Seminar 2025

2 of 19

Agenda

  • What is Pitching+
  • Learning to impute arm angle
  • Broader implications

3 of 19

What is Pitching+

Attempts to evaluate a pitcher's effectiveness by separating their performance into two main components:

  1. Inherent quality of their pitches
  2. Ability to locate those pitches

Expressed via three stats: Pitching+, Stuff+, and Location+.

Designed by Max Bay, Eno Sarris, and Owen McGrattan. Matt Dennewitz on 1s and 0s.

4 of 19

Stuff+

What it measures: The physical characteristics of a pitch, independent of location.

Key components: Velocity, release point, spin rate, and vertical and horizontal movement. Stuff+ determines how "nasty" a pitch is based on these factors, using a decision-tree model to analyze non-linear relationships.

Interpretation: A high Stuff+ rating means a pitch has a high potential for success due to its physical properties, like a high-spin fastball with a lot of "ride" or a slider with significant sweep.

Why it matters: Stuff+ is "sticky," meaning it's a stable metric that tends to hold from year to year. It can be used to predict future performance and to guide player development and pitch design.

5 of 19

Location+

What it measures: A pitcher's ability to locate pitches effectively.

Key components: Location relative to the strike zone, batter handedness, and the current count. Location+ ignores the physical characteristics of the pitch and focuses solely on whether the pitcher is putting the ball in the right spot.

Interpretation: A high Location+ rating indicates a pitcher is consistently hitting their spots and commanding the strike zone.

Why it matters: Location+ provides insight into a pitcher's command, which is a key factor in preventing walks and limiting damage.

6 of 19

Pitching+

What it measures: The overall quality of a pitcher's process by combining both the quality of their pitches and their location.

Methodology: Accounts for physical characteristics, location, and the count for each pitch, as well as batter handedness.

Interpretation: Pitching+ provides a single, comprehensive number that reflects a pitcher's overall skill, removing the noise of on-field results like home runs or bloop singles.

Why it matters: It is a predictive tool. Before a season starts, Pitching+ is more predictive of a pitcher's future performance than many traditional projection systems.

7 of 19

Arm angle

  • Arm angle influences the direction and type of spin imparted. In turn, this affects how the ball moves as it travels toward the plate. Batters also see arm angle and create a model of expected movement, so ‘good stuff’ is often ‘surprising given that arm angle.’
    • It’s also a key ingredient to our Stuff recipe.
  • Arm angle data was not present in Savant feeds, our data source
    • Without arm angle, Stuff is just stuff

8 of 19

Life is pain <-> Wait, no, we’re good Loop

  1. Denial: Maybe we don’t need it? [wrong]
  2. Anger: Use rolling average until arm angle is published next-day [check]
    1. Synthesize arm angle to fill the hole
    2. Replace with real arm angle once available
    3. Fails for debuts, mostly okay for early season work. Bought us time.
  3. Depression: Give Eno’s kids protractors and iPads? [rejected]
  4. Bargaining: Appeal to authority (Mike Pet., Tango) [check]
    • Thanks, guys! Delay is now one day.
  5. Acceptance: Model our way out of this paper bag [check]
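The rolling-average stopgap from step 2 can be sketched roughly as follows. This is an illustration only, not the authors' code: the function name, the `window` size, and the shape of the per-pitcher history are all assumptions. It returns the published arm angle when one exists, falls back to the mean of the pitcher's most recent published angles, and returns `None` for a debut, which is where the slide says the approach fails.

```python
from statistics import mean


def fill_arm_angle(published, history, window=5):
    """Pick the arm angle to use for one pitcher-game.

    published: the real arm angle if it has been released, else None.
    history:   that pitcher's earlier published arm angles, oldest first.
    window:    number of recent games to average (hypothetical default).
    """
    if published is not None:
        return published  # real data replaces the synthesized value
    if not history:
        return None  # debut: no earlier games to average
    return mean(history[-window:])


# Synthesized value for a pitcher whose latest game is not yet published.
stopgap = fill_arm_angle(None, [40.0, 42.0, 44.0])
```

Returning `None` for debuts keeps the gap explicit, so downstream code cannot mistake a missing value for a measured one.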

9 of 19

Estimating arm angle

Basic arm angle formula:

degrees(arctan((release_ball_z - shoulder_z) / (release_ball_x - shoulder_x + fudge)))

We have release_* for every pitch in the daily game data.

No shoulder_*... can’t make a triangle without that.
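A minimal Python sketch of the formula above, for readers who want to try it once shoulder coordinates are in hand. The function name and the default `fudge` value are assumptions; the slides do not say what value the authors use for that term, only that it sits in the denominator.

```python
import math


def arm_angle_degrees(release_x, release_z, shoulder_x, shoulder_z, fudge=1e-6):
    """Angle of the shoulder-to-release segment, in degrees.

    fudge is a small constant added to the horizontal distance so the
    division is defined when release and shoulder share the same x
    (assumed default; not taken from the slides).
    """
    rise = release_z - shoulder_z
    run = release_x - shoulder_x + fudge
    return math.degrees(math.atan(rise / run))


# Release point one unit above and one unit outside the shoulder:
# rise and run are equal, so the angle is very close to 45 degrees.
angle = arm_angle_degrees(2.0, 6.0, 1.0, 5.0)
```

The sketch follows the slide's `arctan` of a ratio as written. It does not handle pitcher handedness or sign conventions for x, which a real pipeline would need to settle first.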

10 of 19

Estimating from only release coordinates

  • Can we cheat the hangman?
    • Release z correlates with arm angle; release x does not
  • Stacked models, LightGBM
    • Okay for 35 < angle < 65
    • Gibberish at extremes
  • Neural nets
    • Sticky overfitting issues
    • Tedious
    • Not the joy of this man’s desiring

No good. We must predict shoulder.

Gross!

Aw man, no!

:)

11 of 19

Savant to the rescue

Savant has shoulder vectors in its arm angle data CSV export!

Now we have all the data:

  1. Release x, z
  2. Shoulder x, z
  3. Arm angle

12 of 19

Predicting we get out of this alive

Our problem is now much easier to approach:

  1. Predict, individually, shoulder x and z
  2. Tune hyperparameters by “solving” the arm angle equation

GAMs succeed where linear regression fails because they relax the linearity requirement, letting the model capture complex, realistic feature-outcome relationships automatically.
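For reference, the contrast can be written out as below. The notation is standard textbook form rather than anything taken from the slides: the x_j are the input features, y is the target (here, a shoulder coordinate), and each f_j is a smooth function learned from the data, typically a penalized spline.

```latex
\begin{align*}
\text{Linear regression:} \quad & \mathbb{E}[y] = \beta_0 + \sum_{j} \beta_j x_j \\
\text{GAM:} \quad & \mathbb{E}[y] = \beta_0 + \sum_{j} f_j(x_j)
\end{align*}
```

The model stays additive, so each feature's learned curve can still be plotted and inspected on its own.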

Optuna made “solving” this very simple.
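The two steps above can be sketched in plain Python. To keep the example self-contained it swaps in stand-ins, and none of this is the authors' code: a one-feature ridge regression replaces the GAMs, a small grid search replaces Optuna, and the pitches are synthetic. The part it is meant to show is the objective: each shoulder coordinate gets its own model, but the hyperparameter is scored by how well the resulting arm angle matches the known one.

```python
import math
import random

FUDGE = 1e-6  # assumed small constant; the slides do not give its value


def arm_angle(release_x, release_z, shoulder_x, shoulder_z):
    run = release_x - shoulder_x + FUDGE
    return math.degrees(math.atan((release_z - shoulder_z) / run))


def fit_ridge_1d(xs, ys, lam):
    """Closed-form one-feature ridge fit (stand-in for a GAM)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return lambda x: my + slope * (x - mx)


def make_pitches(n, seed=0):
    """Synthetic rows: (release_x, release_z, shoulder_x, shoulder_z, angle)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rx, rz = rng.uniform(1.0, 3.0), rng.uniform(5.0, 6.5)
        sx = 0.4 * rx + 0.3 + rng.gauss(0, 0.05)  # made-up relationship
        sz = 0.6 * rz + 1.5 + rng.gauss(0, 0.05)  # made-up relationship
        rows.append((rx, rz, sx, sz, arm_angle(rx, rz, sx, sz)))
    return rows


def angle_rmse(lam, train, valid):
    """Fit shoulder x and z separately, then score on arm angle."""
    predict_sx = fit_ridge_1d([r[0] for r in train], [r[2] for r in train], lam)
    predict_sz = fit_ridge_1d([r[1] for r in train], [r[3] for r in train], lam)
    squared = [
        (arm_angle(rx, rz, predict_sx(rx), predict_sz(rz)) - angle) ** 2
        for rx, rz, _, _, angle in valid
    ]
    return math.sqrt(sum(squared) / len(squared))


rows = make_pitches(200)
train, valid = rows[:150], rows[150:]
CANDIDATES = [0.0, 0.1, 1.0, 10.0]  # hypothetical penalty values
scores = {lam: angle_rmse(lam, train, valid) for lam in CANDIDATES}
best_lam = min(scores, key=scores.get)
```

With Optuna, the grid would presumably give way to an objective function that returns this same RMSE for each trial's suggested hyperparameters.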

13 of 19

GAM performance

Relative Shoulder X:

  • RMSE = 0.22
  • R² = 0.3

Shoulder Z:

  • RMSE = 0.12
  • R² = 0.76

Ball Angle:

  • RMSE = 5.7
  • R² = 0.79
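For readers less familiar with the two metrics, here is a small sketch of how they are computed. The function names and sample numbers are illustrative and unrelated to the results above. RMSE is in the units of the target, while R² is the share of the target's variance the predictions explain.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the units of the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up arm angles in degrees, purely for illustration.
actual = [40.0, 50.0, 60.0]
predicted = [42.0, 50.0, 58.0]
error = rmse(actual, predicted)       # sqrt(8 / 3), about 1.63 degrees
fit = r_squared(actual, predicted)    # 1 - 8 / 200 = 0.96
```

A low R² on one coordinate does not by itself rule out a usable angle, since the angle depends on both coordinates together. That is consistent with the weak Relative Shoulder X fit sitting alongside the stronger angle fit in the figures above.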

[Figure: Visualization of GAM regression splines. Panel labels: X R² = 0.3, Z R² = 0.76]

14 of 19

Model results, 2020-24

15 of 19

Model results, 2020-24

16 of 19

DIY

All code is open source and available to implement, improve, and extend under the MIT license.

Code on GitHub

17 of 19

Caveats

  • Public data isn’t always “correct”
    • Won’t always pass the eye test
    • e.g., MLB and other system readouts can be very different
  • Extreme extension can be tough to navigate
    • Looking at you, Jonah Tong
    • Still below water on submariners
  • Data calibrations can come much later

18 of 19

What it allows us to do, now

  • Now we have an arm angle the morning after a debut
    • Important for quicker analysis
    • No gaps in model output
  • Minor league Stuff+!
    • Triple-A leaders:

19 of 19

What it can allow for, later

  • Bringing MLB-quality Stuff+ models to lower information environments
    • Other minor leagues, college, even training environments
    • Might provide roadmap for filling in other missing data points
  • Anything to learn here for bat path models?
    • There’s a gap between publicly available and team-level bat tracking statistics
    • Could we impute acceleration vectors early in the bat path using a similar approach?
  • Generally, a lesson in adaptation
    • Don’t give up because a couple data points are different or missing
    • Bridge the gap