Zach Britton is MLB’s best groundball pitcher. His groundball rates in each of the past two years—twice eclipsing 77 percent—are the highest figures on FanGraphs’ all-time leaderboard. If you don’t watch many Orioles games, you might assume Britton induces his grounders chiefly with sinkers at the bottom of the strike zone. Keeping the ball down, after all, is the sinkerballer’s credo, but the Orioles closer doesn’t fit into that mold.
Against righties, there are ample pitches at and below the knees, but the thicker clusters are in the heart of the strike zone. Similarly, most grounders coming off lefties’ bats came on middle-middle pitches. What gives? The platitude about keeping the ball down hardly applies to Britton’s grounders, and it might be overblown for other pitchers, too. Many other pitch and contextual factors could play a part in predicting grounders. Which matter most?
The plan here is to predict whether a batted ball will be a grounder using the roughly three dozen variables below.
|Pitch Attributes||Vertical location, horizontal location, vertical movement, horizontal movement, velocity, release height, break angle, spin rate, arm angle|
|Situational||Handedness, plate count, leverage index, is double play situation, is sac fly situation, is two out and nobody on, year|
|Batter/Pitcher Talent||Batter’s projected GB%, batter’s projected HR%, entropy of pitcher’s arsenal|
|Previous Pitch Attributes||Vertical location, horizontal location, vertical movement, horizontal movement, velocity, break angle, is fastball, is offspeed, is breaking ball, is hard fastball inside, is slow fastball inside, pitcher pace|
|Previously in the PA||Total number of hard fastballs thrown inside before that pitch (THIFB), total number of slow fastballs thrown inside before that pitch (TSIFB)|
Each of these inputs is its own individual question. Is a grounder more likely when the pitcher throws with a sharp downward plane? What if the pitcher mixes his pitches well (measured by entropy)? What if the situation calls for the batter to loft the ball into the outfield? To what extent does previously pitching inside matter? We’ll put these questions through a model that will answer each simultaneously: a decision tree model with boosting.
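To make the "mixes his pitches well" idea concrete, here is a minimal sketch of one way to score an arsenal's entropy. It assumes plain Shannon entropy over pitch-type frequencies; the function name and the sample pitch lists are made up for illustration, and the exact definition used in the models may differ.

```python
import math
from collections import Counter


def arsenal_entropy(pitch_types):
    """Shannon entropy (in bits) of a pitcher's pitch-type mix.

    Assumption: entropy is computed from raw pitch-type frequencies.
    0 means one pitch only; higher values mean a more even mix.
    """
    counts = Counter(pitch_types)
    total = sum(counts.values())
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical arsenals, not real pitch logs.
one_pitch = ["SI"] * 100
even_mix = ["FF"] * 25 + ["SL"] * 25 + ["CH"] * 25 + ["CB"] * 25

print(arsenal_entropy(one_pitch))  # a one-pitch pitcher has no uncertainty
print(arsenal_entropy(even_mix))   # four pitches thrown equally often
```

A pitcher who leans almost entirely on one offering scores near zero, while an even four-pitch mix scores two bits, the maximum for four options.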
You’ve likely come across a standard decision tree, which models an event by taking a data set and stratifying it into regions. Those trees are nicely straightforward and take a simple flow chart form. But a single decision tree suffers from high variance: it tends to overfit its training data and then does poorly in out-of-sample testing on new data.
Variance can be reduced by fitting the tree many times over, motivating the use of boosting. This adds power to the decision tree framework by fitting a new tree to each current tree’s set of residuals. Over many tree-fitting iterations, this “slow learning” process attacks areas where the model is underperforming and adapts to provide more spot-on predictions. Because the model allows for non-linear relationships and accounts for any potential interaction effects, it’s suitable for spatial data (like pitch locations). Its output won’t be a tidy coefficient formula, but we’ll get the percentage influence of each factor as part of the whole.
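The residual-fitting idea can be sketched in a few lines. This is a simplified illustration, not the models used here: it assumes squared-error residuals, one-split trees ("stumps"), a single input, and made-up toy data, whereas a real groundball classifier would use a boosting library with a loss suited to yes/no outcomes. All function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Pick the one-split tree (a 'stump') that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_value(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=200, learning_rate=0.1):
    """Fit each new stump to the residuals left by the stumps before it."""
    base = sum(ys) / len(ys)  # start from the overall groundball rate
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # A small learning rate is the "slow learning" step.
        preds = [p + learning_rate * stump_value(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)


# Made-up toy data: inches above the knees, and 1 if the batted ball was a grounder.
heights = [-6, -4, -2, 0, 2, 4, 6, 8, 10, 12]
grounder = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

model = boost(heights, grounder)
print(f"low pitch: {predict(model, -6):.2f}, high pitch: {predict(model, 12):.2f}")
```

Each pass nudges the predictions toward whatever the current ensemble is getting wrong, which is why the learning rate and the number of trees are the knobs usually tuned through cross-validation. Note that this toy version does not constrain predictions to lie between 0 and 1.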
I’ll run individual models for three pitch-type groups: fastball, offspeed, and breaking. The broad pitch-type categories (detailed in the appendix) will allow us to home in on the traits necessary to get a grounder out of a fastball, offspeed pitch, or breaking pitch. For instance, the fastball model can simultaneously address the difference between four-seamers with a bit of tail vs. sinkers with great movement. Pitches on 0-0 counts are excluded at this stage so the previous pitch attributes can be evaluated in full.
After fine-tuning through cross-validation, the models perform reasonably well. By area under the ROC curve (AUC), my predictions would be regarded as fairly good, as per this primer from the University of Nebraska.
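For readers unfamiliar with AUC, it can be read as the chance that a randomly chosen grounder received a higher predicted probability than a randomly chosen non-grounder. The sketch below computes it by brute force under that definition; the labels and scores are invented for illustration and are not output from the models.

```python
def auc(labels, scores):
    """Share of (grounder, non-grounder) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = grounder, 0 = anything else, with predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.90, 0.70, 0.60, 0.55, 0.40, 0.20]

print(round(auc(labels, scores), 3))  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 means the predictions rank batted balls no better than a coin flip, and 1.0 means every grounder was scored above every non-grounder.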
The influence scores reported by the boosted trees are normalized to sum to 100, showing the relative importance of each of the variables passed into the models. Pitch attributes will be our starting point, as they do the heavy lifting in explaining grounders. First we’ll put the sinkerballer’s credo to the test, considering vertical location’s influence results and its underlying marginal effects: modeled values of groundball percentage as the other factors are taken at their averages.
In this and several other charts, you’ll notice in the x-axes that PITCHf/x factors were rescaled so lefties and righties of various heights can be compared analogously. Here, a knee-high pitch takes a value of zero.
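The two summaries used throughout the rest of this piece, influence scores and marginal effects, can be sketched as follows. Everything here is a placeholder: the raw importance numbers, the `toy_model` formula, and the average values are invented so the example runs, and they do not come from the fitted models.

```python
def normalize_influence(raw):
    """Rescale raw importance scores so they sum to 100."""
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


def marginal_curve(model, averages, feature, grid):
    """Predictions as one input sweeps over `grid` while the rest sit at their averages."""
    curve = []
    for value in grid:
        row = dict(averages)   # copy so the averages are not modified
        row[feature] = value
        curve.append((value, model(row)))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model; the coefficients are made up."""
    return 0.60 - 0.01 * row["vertical_location"] - 0.02 * row["vertical_movement"]


# Invented raw scores and averages, for illustration only.
raw_scores = {"vertical_location": 4.0, "vertical_movement": 3.0, "velocity": 1.0}
averages = {"vertical_location": 10.0, "vertical_movement": 5.0}

print(normalize_influence(raw_scores))
for inches, gb_rate in marginal_curve(toy_model, averages, "vertical_location", [0, 6, 12]):
    print(inches, round(gb_rate, 3))
```

The influence figures say how much of the model's predictive work a variable does, while the marginal curve says in which direction, and how steeply, the predicted GB% moves as that one variable changes.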
Vertical location clearly is important. In terms of influence, it’s the most vital factor in offspeed and breaking-ball predictions and a close second in fastball predictions. Still, pitchers should recognize that it’s just ~20 percent of the equation.
What are the practical effects of a well-located pitch? With similarly interweaving curves, it’s clear that down is universally better. One notable difference here is that the curves for secondaries curl up on the right side—showing that as pitchers throw breaking balls and offspeed pitches higher and higher, GB% flattens out and stops dropping. Contrast that with fastballs, for which GB% continues to get worse as pitches are thrown belt-high and above. Locating an otherwise-identical pitch an inch lower raises the probability of a grounder by 1.4 percent on fastballs and 1.2 percent for offspeed and breaking balls.
Let’s now consider the horizontal component of location.
By influence, horizontal location matters quite a bit more for offspeed than hard or breaking stuff, but all chart curves here follow a sinusoidal path. Throwing inside to a batter is a bit better than grooving one (obviously); after passing by the middle of the plate, GB% climbs and climbs the farther outside a pitch is from a batter. I would have guessed that pounding hitters inside with riding sinkers or cutters is a reasonably effective way to get grounders, but that’s hardly the case. Instead, as long as a pitch is thrown at least from the middle of the plate (at dist ≈ 2.25) and outward, each additional inch outside will result in GB% rises of 1.7 percent for fastballs, 1.4 percent for offspeed, and 1.3 percent for breaking balls.
Moving down the line, we’ll next look at movement.
The most interesting takeaway here is that vertical movement is the most crucial factor for fastballs. Heat’s percentage importance nearly doubles the rates owned by secondaries, and its marginal GB% changes faster.
Still, for all pitches, the more downward movement there is (the more negative the number), the better the resulting GB%. “Rising” pitches, those with positive movement, aren’t good for grounders. The penalty eases up slightly at the highest rungs of movement, although the curves are still strongly linear (with correlations in excess of 0.98 in all three instances). All else equal, each additional inch of downward movement will increase GB% by 3.5 percent for fastballs, 2.2 percent for offspeed pitches, and 1.7 percent for breaking balls.
Lateral movement is among the most important characteristics for fastballs, comprising nearly 10 percent of the recipe. The chart shows that fastballs moving in towards hitters (with negative movement) are effective groundball pitches; those are cutters to opposite-handed hitters and sinkers to same-handed hitters. For each additional inch of inward movement beyond 1.5 inches, a pitcher can raise his fastball GB% by 2.5 percent. Any outward movement beyond that hurts the groundball effort.
It’s easy to envision a hitter swinging at a fading changeup and weakly grounding out to the pull side. But the extent that an offspeed (or breaking) pitch moves away from a batter is of no consequence. This is reflected in small influence figures and flat marginal curves. If anything, sliders and changeups moving in towards batters are a teeny tiny bit more effective, but in the end, lateral movement shouldn’t be part of the pitcher’s calculus if he’s looking to turn an offspeed or breaking pitch into a grounder.
Velocity is a big part of Britton’s ability to overpower hitters; how does it help groundball percentage?
In terms of influence, velocity is roughly three times as important for secondary pitches as for fastballs. Yet all the velocity curves are similar, being close to parallel as they proceed on extremely linear paths. With an extra 1.0 mph of velocity, an otherwise identical pitch will yield a 1.5 percent rise in GB% for fastballs, a 1.7 percent GB% bump for offspeed pitches, and a 1.6 percent jump for breaking balls.
The rest of the pitch attributes, shown below, carry much less weight in prediction. Surprisingly, release height is among this group.
When a short pitcher comes along, there’s a question of whether his lack of downward plane will make it hard to get hitters out. But height doesn’t measure heart, and it also isn’t much of a groundball catalyst. It helps some; an extra upward inch in release height, for instance, adds a 0.4 percent GB% boost on fastballs. But again, there are much larger groundball rises to be had if a pitcher can squeak out inch-level improvements in location and movement. Short guys shouldn’t be deterred if their groundball stuff is otherwise solid.
Next we’ll move on to the other variable categories. The influence figures for the situational factors are shown below.
|Sac fly situation||0%||0%||0.1%|
|Two out nobody on||0%||0.1%||0%|
Even if a batter wants to hit a sac fly, stay out of a double play, or launch a home run with two outs and the bases empty, there won’t be any change in whether his batted ball is a grounder or not. Whether or not the batter finds himself in a clutch situation hardly matters either. Failing to pick up crucial sac flies can be frustrating, but maybe we should give batters a pass, as the outcome appears to be out of their control and counterbalanced by the pitcher’s desire to prevent a fly ball.
Plate count matters a bit on fastballs—GB% trickles down by about one percent as the count becomes more favorable to the hitter.
Another finding here is that batter/pitcher handedness, in and of itself, is irrelevant. It’s how pitches move that is important. That’s a nontrivial distinction, particularly when many managers are wedded to making substitutions that optimize the traditional left/right platoon. Pitcher arsenals need to be considered when making relief and pinch-hitting substitutions.
|Batter Proj. GB%||12.3%||11.4%||13.5%|
|Batter Proj. HR%||1.2%||1.2%||1.6%|
Yes, it’s true: groundball-hitting batters hit grounders. These influence figures hover around 12 percent, far less than the pitch attributes discussed above. The upshot here is that pitchers are much more in control of whether or not the ball is hit on the ground.
The batter’s home run talent and pitcher’s ability to mix pitches hold virtually no significance, the same fate that meets the previous pitch characteristics.
|Is hard fastball inside||0%||0%||0%|
|Is slow fastball inside||0%||0%||0%|
The way a pitcher immediately sets up the ball-in-play pitch is pretty unimportant in generating grounders. Taken together, the “previous” variables are a bit mightier (they do compose about one-tenth of the recipe), but improving a pitch’s GB% this way can only be done in tiny increments. All pitches see slight groundball bumps if an inside pitch or a low pitch precedes the BIP pitch. Pitches’ groundball-friendliness also can get little boosts if the prior pitch “rises” high, is slower, or comes at a quick pace.
A pitcher who does all these things well can raise his GB% a few ticks. But the greatest increases come when pitchers improve their movement or location instead of sequencing. The last table shows that even the long-revered brushback pitch is inconsequential.
Left out of the original analysis were a pair of extra factors that are worth testing. In previous research, Baseball Prospectus’ Harry Pavlidis found that to get a grounder from a change-up, it’s good if there’s a small gap between the fastball and offspeed offering, and it’s good if the change-up sinks more relative to the fastball. I went back to re-run the offspeed model with these variables included. The direction of my results was in agreement with his: Offspeed pitches perform better with smaller velocity differentials and more sink than fastballs. But the big difference in my results is that these factors hold little import. Each factor hits just over three percent importance.
The difference, I’d think, is due to my use of more rigorous methods that further take context into account. Between this latest result and the lackluster results from the other sequencing variables, it’s clear that pitches have a natural groundball talent unto themselves, largely distinct from other aspects of the arsenal.
Wrapping up with the Best Groundball Pitchers
This analysis shows that keeping the ball down is just 20 percent of the groundball puzzle, a lower estimation than most sinkerballers surely would guess. It’s important, but the same can be said of several other factors. Velocity, both components of movement, horizontal location, and the batter’s own groundball tendencies matter a great deal, and other factors also claim smaller chunks of predictive power.
So who does the model predict as the best groundball pitchers? The table below shows the top ten player-seasons by predicted GB% on all pitches (min. 100) through 2015. For completeness, separate models were run to make groundball estimates for pitches coming on 0-0 counts, and those predictions are included in these tallies.
|Mike MacDougal||White Sox & Nationals||2009||78.5%|
Despite the upward locations in the initial chart, the models identify Britton as the best groundballer in the PITCHf/x era. His fastball’s velocity, downward movement, and lateral movement are all at the top of the class. Joining Britton at the top are several of the best sinkerballers of the past eight years.
Notice also that more seasons come from the Pirates than from any other club. The model loves Jared Hughes, and John Holdzkom’s dominant nine-inning stint with the 2014 Buccos was enough to earn him a 7th-place ranking. This isn’t a surprise, given the Pirates’ devotion to a strategy of creating grounders to be gobbled up into defensive shifts. Interesting questions follow, such as: what impact does being a Pirate have on a pitcher’s groundball percentage? Tomorrow we’ll examine that question and take a closer look at Pittsburgh’s strategy.
References & Resources
- Gareth James, Daniela Witten, Robert Tibshirani, and Trevor Hastie, An Introduction to Statistical Learning with Applications in R
- Jonah Pemstein, FanGraphs, “Batted Balls: It’s All About Location, Location, Location”
- The source of the arm angle data is Jared Cross’ terrific research on projecting pitcher platoon splits.
- PITCHf/x and Retrosheet
- Andrew Bare, MLB.com, “How to throw a sinker”
- Rob Arthur, Baseball Prospectus, “Baseball ProGUESTus: Entropy and the Eephus”
- David Manel, Bucs Dugout, “Pregame: Indians, Terry Francona well aware of Pirates’ reputation for pitching inside”
- Data School, YouTube, “ROC Curves and Area Under the Curve (AUC) Explained”
- Thomas G. Tape, MD, University of Nebraska Medical Center, “The Area Under an ROC Curve”
- Harry Pavlidis, Baseball Prospectus, “What Makes A Good Changeup? An Investigation, Part Two”
Here are a few technical PITCHf/x details:
- Movement figures are drag-corrected.
- Batter standing positions for the horizontal movement variables are estimated using Mike Fast’s research on hit by pitches.
- The release height data is adjusted for park biases using a familiar process, but on a full-year basis instead of a rolling basis.
MLBAM pitch types are categorized as follows:
|4-seam fastballs (FF)||Changeups (CH)||Curveballs (CB)|
|2-seam fastballs (FT)||Splitters (FS)||Knucklecurves (KC)|
|Sinkers (SI)||Forkballs (FO)||Sliders (SL)|
|General fastballs (FA)||Cutters (FC) with above-average horizontal movement|
|Cutters (FC) with below-average horizontal movement|