1 of 19

Imputing Arm Angle from Readily Available Metrics

Eno Sarris, Matt Dennewitz: Saber Seminar 2025

2 of 19

Agenda

What is Pitching+
Learning to impute arm angle
Broader implications

3 of 19

What is Pitching+

Pitching+ attempts to evaluate a pitcher's effectiveness by separating performance into two main components:

Inherent quality of their pitches
Ability to locate those pitches

Expressed via three stats: Pitching+, Stuff+, and Location+.

4 of 19

Stuff+

What it measures: The physical characteristics of a pitch, independent of location.

Key components: Velocity, release point, spin rate, and vertical and horizontal movement. Stuff+ determines how "nasty" a pitch is based on these factors, using a decision-tree model to analyze non-linear relationships.

Interpretation: A high Stuff+ rating means a pitch has a high potential for success due to its physical properties, like a high-spin fastball with a lot of "ride" or a slider with significant sweep.

Why it matters: Stuff+ is "sticky," meaning it's a stable metric that tends to hold from year to year. It can be used to predict future performance and to guide player development and pitch design.

5 of 19

Location+

What it measures: A pitcher's ability to locate pitches effectively.

Key components: Location relative to the strike zone, batter handedness, and the current count. Location+ ignores the physical characteristics of the pitch and focuses solely on whether the pitcher is putting the ball in the right spot.

Interpretation: A high Location+ rating indicates a pitcher is consistently hitting their spots and commanding the strike zone.

Why it matters: Location+ provides insight into a pitcher's command, which is a key factor in preventing walks and limiting damage.

6 of 19

Pitching+

What it measures: The overall quality of a pitcher's process by combining both the quality of their pitches and their location.

Methodology: Accounts for physical characteristics, location, and the count for each pitch, as well as batter handedness.

Interpretation: Pitching+ provides a single, comprehensive number that reflects a pitcher's overall skill, removing the noise of on-field results like home runs or bloop singles.

Why it matters: It is a predictive tool. Before a season starts, Pitching+ is more predictive of a pitcher's future performance than many traditional projection systems.

7 of 19

Arm angle

Arm angle influences the direction and type of spin imparted. In turn, this affects how the ball moves as it travels toward the plate. Batters also see arm angle and create a model of expected movement, so ‘good stuff’ is often ‘surprising given that arm angle.’

It’s also a key ingredient to our Stuff recipe.

Arm angle data was not present in Savant feeds, our data source

Without arm angle, Stuff is just stuff

8 of 19

Life is pain <-> Wait, no, we’re good Loop

Denial: Maybe we don’t need it? [wrong]
Anger: Use rolling average until arm angle is published next-day [check]

Synthesize arm angle to fill the hole
Replace with real arm angle once available
Fails for debuts, mostly okay for early season work. Bought us time.

Depression: Give Eno’s kids protractors and iPads? [rejected]
Bargaining: Appeal to authority (Mike Pet., Tango) [check]

Thanks, guys! Delay is now one day.

Acceptance: Model our way out of this paper bag [check]
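
The rolling-average stopgap from the Anger stage could look roughly like the sketch below. It is a minimal illustration, not the authors' code: the `pitcher` and `arm_angle` keys, the 50-pitch window, and the `source` labels are all assumptions made for the example.

```python
from collections import defaultdict, deque


def fill_arm_angles(pitches, window=50):
    """Fill unpublished arm angles with each pitcher's trailing mean.

    `pitches` is a chronologically ordered list of dicts with the
    hypothetical keys 'pitcher' and 'arm_angle' (None when the real
    value has not been published yet).
    Returns a list of (angle, source) tuples.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    filled = []
    for pitch in pitches:
        recent = history[pitch["pitcher"]]
        angle = pitch["arm_angle"]
        if angle is not None:
            # Real value is available: use it and remember it.
            recent.append(angle)
            filled.append((angle, "published"))
        elif recent:
            # Synthesize from the pitcher's recent published angles.
            filled.append((sum(recent) / len(recent), "rolling"))
        else:
            # A debut has no history, so the stopgap has nothing to offer.
            filled.append((None, "missing"))
    return filled
```

The `missing` branch is the debut failure noted above: with no prior pitches there is nothing to average.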

9 of 19

Estimating arm angle

Basic arm angle formula:

degrees(arctan((release_ball_z - shoulder_z) / (release_ball_x - shoulder_x + fudge)))

We have release_* for every pitch in the daily game data.

No shoulder_*... can’t make a triangle without that.
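
The basic formula above translates almost directly into Python. This is a sketch under assumptions: the slides do not say what `fudge` is, so it defaults to 0.0 here, and the vertical-arm guard and the sample coordinates are illustrative only. Handedness and sign conventions are not handled.

```python
import math


def arm_angle_degrees(release_x, release_z, shoulder_x, shoulder_z, fudge=0.0):
    """Angle of the shoulder-to-release line, in degrees from horizontal."""
    rise = release_z - shoulder_z
    run = release_x - shoulder_x + fudge
    if run == 0:
        # Arm is straight up (or down): arctan is undefined, so return +/-90.
        return math.copysign(90.0, rise)
    return math.degrees(math.atan(rise / run))


# Hypothetical coordinates: release one unit out and one unit up from the shoulder.
example = arm_angle_degrees(release_x=2.0, release_z=6.0, shoulder_x=1.0, shoulder_z=5.0)
```

Both shoulder arguments are required, which is the point of the slide: with only release_* in the daily data, the function cannot be called.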

10 of 19

Estimating from only release coordinates

Can we cheat the hangman?

Release z correlates with arm angle; release x does not

Stacked models, LightGBM

Okay for 35 < angle < 65
Gibberish at extremes

Neural nets

Sticky overfitting issues
Tedious
Not the joy of this man’s desiring

No good. We must predict shoulder.

Gross!

Aw man, no!

11 of 19

Savant to the rescue

Savant has shoulder vectors in its arm angle data CSV export!

Now we have all the data:

Release x, z
Shoulder x, z
Arm angle

12 of 19

Predicting we get out of this alive

Our problem is now much easier to approach:

Predict, individually, shoulder x and z
Tune hyperparameters by “solving” the arm angle equation

GAMs succeed where linear regression fails because they relax the linearity requirement, letting the model capture complex, realistic feature-outcome relationships automatically.

Optuna made “solving” this very simple.
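
The idea of tuning by "solving" the arm angle equation can be shown with a toy. Everything here is a stand-in and an assumption: one-feature ridge regressions replace the GAMs, a small grid search replaces Optuna, and the data is synthetic. What carries over is the objective: score each hyperparameter pair by the arm angle error that results from plugging both predicted shoulder coordinates into the equation, not by shoulder error alone.

```python
import itertools
import math
import random


def arm_angle(rx, rz, sx, sz):
    # atan2 matches the slide's arctan(rise / run) whenever run > 0.
    return math.degrees(math.atan2(rz - sz, rx - sx))


def fit_ridge_1d(xs, ys, lam):
    """One-feature ridge fit (stand-in for a GAM); returns a predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return lambda x: my + slope * (x - mx)


def make_rows(n, rng):
    """Synthetic (release_x, release_z, shoulder_x, shoulder_z) rows."""
    rows = []
    for _ in range(n):
        theta = math.radians(rng.uniform(20, 70))
        sx, sz = rng.gauss(0.8, 0.2), rng.gauss(5.0, 0.15)
        rows.append((sx + 2.0 * math.cos(theta), sz + 2.0 * math.sin(theta), sx, sz))
    return rows


def angle_rmse(lam_x, lam_z, train, valid):
    """Fit shoulder x and z separately, then score on arm angle error."""
    fx = fit_ridge_1d([r[0] for r in train], [r[2] for r in train], lam_x)
    fz = fit_ridge_1d([r[1] for r in train], [r[3] for r in train], lam_z)
    errs = [
        arm_angle(rx, rz, fx(rx), fz(rz)) - arm_angle(rx, rz, sx, sz)
        for rx, rz, sx, sz in valid
    ]
    return math.sqrt(sum(e * e for e in errs) / len(errs))


rng = random.Random(0)
train, valid = make_rows(400, rng), make_rows(200, rng)
grid = [0.0, 1.0, 10.0, 100.0]  # candidate penalties for each model
scores = {
    (lx, lz): angle_rmse(lx, lz, train, valid)
    for lx, lz in itertools.product(grid, grid)
}
best_params = min(scores, key=scores.get)
best_rmse = scores[best_params]
```

With real tooling, `angle_rmse` would be the body of the objective handed to the optimizer, with the penalties suggested per trial instead of drawn from a fixed grid.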

13 of 19

GAM performance

Relative Shoulder X:

RMSE = 0.22
R² = 0.3

Shoulder Z:

RMSE = 0.12
R² = 0.76

Ball Angle:

RMSE = 5.7
R² = 0.79
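
For readers reproducing these numbers, RMSE and R² can be computed as below. This is a generic sketch with made-up sample values, not the evaluation code behind the slide. RMSE comes out in the target's own units and R² is unitless.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Share of variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical arm angles in degrees, purely for illustration.
actual = [40.0, 50.0, 60.0]
predicted = [42.0, 50.0, 58.0]
sample_rmse = rmse(actual, predicted)
sample_r2 = r_squared(actual, predicted)
```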

Figure: Visualization of GAM regression splines (panels: X, R² = 0.3; Z, R² = 0.76)

14 of 19

Model results, 2020-24

15 of 19

Model results, 2020-24

16 of 19

DIY

All code is open source and available to implement, improve, and extend under the MIT license.

Code on GitHub

17 of 19

Caveats

Public data isn’t always “correct”

Won’t always pass the eye test
e.g., MLB and other system readouts can be very different

Extreme extension can be tough to navigate

Looking at you, Jonah Tong
Still below water on submariners

Data calibrations can come much later

18 of 19

What it allows us to do, now

Now we have an arm angle the morning after a debut

Important for quicker analysis
No gaps in model output

Minor league Stuff+!

Triple-A leaders:

19 of 19

What it can allow for, later

Bringing MLB-quality Stuff+ models to lower information environments

Other minor leagues, college, even training environments
Might provide roadmap for filling in other missing data points

Anything to learn here for bat path models?

There’s a gap between publicly available and team-level bat tracking statistics
Could we impute acceleration vectors early in the bat path using a similar approach?

Generally, a lesson in adaptation

Don’t give up because a couple data points are different or missing
Bridge the gap