Imputing Arm Angle from Readily Available Metrics
Eno Sarris, Matt Dennewitz: Saber Seminar 2025
Agenda
What is Pitching+
Attempts to evaluate a pitcher's effectiveness by separating their performance into two main components: the physical quality of their pitches (stuff) and where they throw them (location).
Expressed via three stats: Pitching+, Stuff+, and Location+.
Designed by Max Bay, Eno Sarris, and Owen McGrattan. Matt Dennewitz on 1s and 0s.
Stuff+
What it measures: The physical characteristics of a pitch, independent of location.
Key components: Velocity, release point, spin rate, and vertical and horizontal movement. Stuff+ determines how "nasty" a pitch is based on these factors, using a decision-tree model to analyze non-linear relationships.
Interpretation: A high Stuff+ rating means a pitch has a high potential for success due to its physical properties, like a high-spin fastball with a lot of "ride" or a slider with significant sweep.
Why it matters: Stuff+ is "sticky," meaning it's a stable metric that tends to hold from year to year. It can be used to predict future performance and to guide player development and pitch design.
Location+
What it measures: A pitcher's ability to locate pitches effectively.
Key components: Location relative to the strike zone, batter handedness, and the current count. Location+ ignores the physical characteristics of the pitch and focuses solely on whether the pitcher is putting the ball in the right spot.
Interpretation: A high Location+ rating indicates a pitcher is consistently hitting their spots and commanding the strike zone.
Why it matters: Location+ provides insight into a pitcher's command, which is a key factor in preventing walks and limiting damage.
Pitching+
What it measures: The overall quality of a pitcher's process by combining both the quality of their pitches and their location.
Methodology: Accounts for physical characteristics, location, and the count for each pitch, as well as batter handedness.
Interpretation: Pitching+ provides a single, comprehensive number that reflects a pitcher's overall skill, removing the noise of on-field results like home runs or bloop singles.
Why it matters: It is a predictive tool. Before a season starts, Pitching+ is more predictive of a pitcher's future performance than many traditional projection systems.
Arm angle
Life is pain <-> Wait, no, we’re good Loop
Estimating arm angle
Basic arm angle formula:
degrees(arctan((release_ball_z - shoulder_z) / (release_ball_x - shoulder_x + fudge)))
We have release_* for every pitch in the daily game data.
No shoulder_*... can’t make a triangle without that.
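The formula above can be sketched in Python as a small helper. This is a minimal illustration, not the authors' implementation: the function name and argument order are hypothetical, and the default of 0.0 for fudge is an assumption, since the slides do not state its value.

```python
import math


def arm_angle_degrees(release_ball_x, release_ball_z,
                      shoulder_x, shoulder_z, fudge=0.0):
    """Arm angle in degrees from release and shoulder coordinates.

    Mirrors the slide formula term by term. The fudge default of 0.0
    is an assumption; the source does not give a value.
    """
    rise = release_ball_z - shoulder_z          # vertical leg of the triangle
    run = release_ball_x - shoulder_x + fudge   # horizontal leg, plus the fudge term
    # Like the slide formula, this divides directly, so run == 0 raises
    # ZeroDivisionError; math.atan2(rise, run) would be one way to avoid that.
    return math.degrees(math.atan(rise / run))


# Illustrative coordinates only, not real pitch data.
angle = arm_angle_degrees(release_ball_x=2.0, release_ball_z=6.0,
                          shoulder_x=1.0, shoulder_z=5.0)
```

Both legs of the triangle need a shoulder position, which is why the release coordinates alone are not enough.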
Estimating from only release coordinates
No good. We must predict shoulder.
Gross!
Aw man, no!
:)
Savant to the rescue
Savant has shoulder vectors in its arm angle data CSV export!
Now we have all the data:
Predicting we get out of this alive
Our problem is now much easier to approach:
GAMs succeed where linear regression fails because they relax the linearity requirement, letting the model capture complex, realistic feature-outcome relationships automatically.
Optuna made “solving” this very simple.
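To illustrate why relaxing linearity helps, here is a minimal sketch that fits the same synthetic data with a straight-line model and with an additive piecewise-linear model. It is not the authors' model and uses neither a GAM library nor Optuna: the hinge basis is a swapped-in stand-in for regression splines, there is no smoothness penalty, and the data, function names, and knot count are all assumptions made for the example.

```python
import numpy as np


def hinge_basis(x, knots):
    """Return the raw feature plus one hinge term max(0, x - k) per knot."""
    return np.column_stack([x] + [np.maximum(0.0, x - k) for k in knots])


def additive_design(X, n_knots=8):
    """Intercept plus a piecewise-linear basis for each feature column."""
    blocks = [np.ones(len(X))]
    for j in range(X.shape[1]):
        # Interior knots at evenly spaced quantiles of the feature.
        qs = np.linspace(0, 1, n_knots + 2)[1:-1]
        knots = np.quantile(X[:, j], qs)
        blocks.append(hinge_basis(X[:, j], knots))
    return np.column_stack(blocks)


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_predict(A, y):
    """Ordinary least squares fit, returning in-sample predictions."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


def compare_fits(seed=0, n=500):
    """In-sample R^2 for a linear fit and an additive fit on synthetic data."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([rng.uniform(-3, 3, n), rng.uniform(-2, 2, n)])
    # Synthetic target with curved per-feature effects (an assumption,
    # chosen only to show where a straight line struggles).
    y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, n)
    linear = np.column_stack([np.ones(n), X])
    r2_linear = r_squared(y, fit_predict(linear, y))
    r2_additive = r_squared(y, fit_predict(additive_design(X), y))
    return r2_linear, r2_additive


r2_linear, r2_additive = compare_fits()
```

On this synthetic data the additive fit recovers the curved effects that the straight-line fit misses. A real GAM adds a smoothing penalty on each term, and that penalty is the kind of hyperparameter a tuner such as Optuna can search over.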
GAM performance
Relative Shoulder X:
Shoulder Z:
Ball Angle:
Relative Shoulder X R²: 0.3
Shoulder Z R²: 0.76
Visualization of GAM regression splines
Model results, 2020-24
DIY
All code is open source and available to implement, improve, and extend under the MIT license.
Code on GitHub
Caveats
What it allows us to do, now
What it can allow for, later