Throw It Away

Mark Buehrle has made a career living on the outside edge of the strike zone. (via Keith Allison)

Last year, Bill Petti and Jeff Zimmerman introduced a concept called Edge% that looked to slice the zone into four distinct zones: horizontal edge, top edge, bottom edge and heart. I want to delve a little bit deeper into the mechanics of this, specifically focusing on the horizontal edge and hopefully explaining three things:

  1. Why is the edge better? Contact management, called strikes, ball prevention or swings and misses?
  2. Why is this not always a good indicator for pitcher-specific performance?
  3. Should the inside edge be looked at the same way as the outside edge?

Let us begin by trying to explain the continuous trade-offs that occur dynamically as we move across the horizontal spectrum of the strike zone. For this purpose, I’ve built a graph which shows all pitch outcomes for all pitches thrown since 2008 in regular season games that have one of these outcomes — Hit, Out, Foul, Strike, Ball. A few notes on how to read the graph:

  1. Value (thick black line) is the run prevention value of the pitch, computed from the pitch’s count-specific value as per Joe Sheehan; while this is a little dated, it should be directionally correct throughout the years. It is important to use count-specific run values, as a waste pitch can be very valuable in a 1-2 count but horrible in a 3-1 count. The assumption is that the pitcher is going to attempt to throw to a count-optimized location. Negative run values are better for pitchers. Compare and contrast this with the work by Max Marchi many years ago, which looked at run values for pitches thrown horizontally and vertically in the strike zone. There are some similarities in the results, but also some divergence on pitch-type-specific values.
  2. Delta X is the handedness-neutral location of the pitch, measured in feet. I flipped the values for lefty hitters, so that negative is always inside and positive is always outside. The edges of the plate are marked at -0.7 and 0.7 feet.
  3. The colorful lines represent the probability of a specific outcome happening at any location within the zone. For example, there is a 30 percent chance of a pitch being called a ball on the inside edge (and an 18.5 percent chance of getting a called strike) compared to 25 percent and 28.4 percent (ball%, called strike%) on the outside edge. More on this later, but note how called strikes have a differently shaped curve than the rest of the outcomes.
porat 1
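To make the Delta X convention and the outcome curves concrete, here is a minimal Python sketch. It is not the code behind the graph. The function names (`delta_x`, `outcome_rates`), the 0.1-foot bin width, the five sample pitches, and the assumption that the raw horizontal location is already negative-inside for right-handed hitters are all illustrative.

```python
from collections import Counter, defaultdict


def delta_x(px_ft, batter_side):
    """Handedness-neutral location: negative is inside, positive is outside.

    Assumes the raw location is already negative-inside for right-handed
    hitters, so only left-handed hitters need to be mirrored.
    """
    return -px_ft if batter_side == "L" else px_ft


def outcome_rates(pitches, bin_width=0.1):
    """Share of each outcome within each horizontal bin.

    pitches: iterable of (px_ft, batter_side, outcome) tuples, where outcome
    is one of "Hit", "Out", "Foul", "Strike", "Ball".
    """
    counts = defaultdict(Counter)
    for px_ft, batter_side, outcome in pitches:
        # Snap to the nearest bin center, e.g. 0.68 -> 0.7 with 0.1 ft bins.
        nearest = round(delta_x(px_ft, batter_side) / bin_width) * bin_width
        counts[round(nearest, 2)][outcome] += 1
    return {
        center: {o: n / sum(c.values()) for o, n in c.items()}
        for center, c in sorted(counts.items())
    }


# Invented pitches, only to show the shape of the calculation.
sample = [
    (0.68, "R", "Strike"),
    (0.72, "R", "Ball"),
    (-0.70, "L", "Strike"),  # lefty: mirrored to +0.70, the outside edge
    (-0.69, "R", "Ball"),
    (0.71, "L", "Hit"),      # lefty: mirrored to -0.71, the inside edge
]
rates = outcome_rates(sample)
for center, shares in rates.items():
    print(center, shares)
```

Plotting each outcome's share against the bin center would give one colored line per outcome, in the spirit of the graph above.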

Why is it better to throw a pitch on the outside edge?

porat 2

There is a lot going on here, so let’s try to break it down:

  • Swing % is almost perfectly correlated with absolute distance from the center of the zone (R² of .96), whereas ISOContact and SLGContact are only loosely related. It is also strongly correlated with the number of pitches thrown to a location, which suggests that hitters are looking middle and adjusting accordingly, rather than looking in their sweet spot and then adjusting.
  • Swing % peaks at exactly the middle of the zone, but declines 9 points to the inner edge compared to 14 points on the outer edge. This leads to called strikes spiking right on the edge of the zone. ISO on Contact peaks roughly at the halfway point between the edge of the plate and the middle of the plate. As mentioned in the previous note, you would assume that swing% would peak where quality of contact is greatest, but it actually peaks in the middle of the plate, where ISOContact is lower.
  • A called ball is also more likely on a take when the pitch is on the inside edge versus the outside edge. I’m not exactly sure why this is, but my theory is that a pitch which makes the batter flinch and lean back is often called a ball, even when it clips the black, whereas the hitter won’t react to a pitch on the outside black.
  • Pretty much every factor is favourable for a pitcher on the outer half: Whiff% is seven points higher, ISOContact is 145 points lower and called strikes are 10 points higher.
  • The only metric that increases is BABIP, which jibes with the notion that going the other way can lead to a higher BABIP. Clearly, getting jammed inside will have a deleterious impact on BABIP and is likely a pitcher skill if a swing can be induced (a topic for another day).

I’m going to draw an interesting conclusion from these data and I welcome feedback on the subject. The R² between Called Strike% and Value is 0.94; this is an order of magnitude greater than for the other components. You can also see that the value curve and called strike curve move together, whereas the other curves have classic normal curve distributions.
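For readers who want to check a figure like this themselves, the sketch below computes R² as the squared Pearson correlation between two curves sampled at the same locations. This is one common reading of "R²" and an assumption about how the number above was produced. The function name `r_squared` and the five data points are invented for illustration and are not taken from the graph.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Invented numbers: called strike share and run value at five locations.
called_strike_pct = [0.05, 0.10, 0.18, 0.28, 0.20]
run_value = [0.010, 0.004, -0.006, -0.018, -0.009]

print(round(r_squared(called_strike_pct, run_value), 3))
```

Squaring discards the sign, so a high R² here is compatible with an inverse relationship: more called strikes going with lower (better for the pitcher) run values.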

Horizontal Location and the Effect on Pitch Trajectory

porat 3

The black line represents count-context pitch values across the horizontal spectrum. Note how ground balls show a mostly linear increase from -0.1 feet all the way out to 1.3 feet from the center of the zone, as well as the negative correlation between GB% and FB%. Fly balls peak around -0.35 feet, as does the damage done on them. It appears that the sweet spot for hitters is halfway between the inner edge of the plate and the middle of the plate. Batters who swing at pitches way inside have very little chance of doing anything meaningful, but batters can still hit line drives on pitches all the way out to 1.4 feet from the center of home plate.

Why is this not always a good indicator for pitcher-specific performance?

I computed pitcher-specific run values based on their location profile and came up with basically no correlation at all, when trying to predict a pitcher’s ERA based on where they threw the ball. However, this effect did have a strong correlation when looking at groups of pitchers. The question then becomes, why does this break down at the pitcher level? There are two main reasons why:

  1. Pitch value is extremely count-sensitive
  2. Some pitchers are great (or horrible) everywhere in the zone

Let’s take a look at the outcome distributions for six different counts:

porat-4
  • In 3-1 counts, you want to throw right down the middle and the edges are bad (relatively speaking). You also have very little room for error on a 3-1 count: the value curve is very steep around the outside edge, so a miss of a mere 0.2 feet benefits the batter greatly.
  • In 1-2 counts, there is tremendous value (and very little risk) in throwing the ball off the plate (better outside, but inside off the plate is good too).
  • In two-strike counts, throwing a pitch 0.5 feet off the outside edge will still create above average run prevention. These counts also share a similar called strike profile, indicating a shift in batter approach with two strikes.
  • In 3-1 counts a batter will swing at 62 percent of pitches on the inside edge, compared to 69 percent in 1-2 counts.
  • On the outside edge, the jump is more dramatic, from 46 percent on 3-1 counts to 66 percent on 1-2 counts, an indication that hitters will swing at almost everything on a 1-2 count.

This would suggest that pitchers who are very good at getting to two-strike counts will throw fewer pitches on the edge than pitchers who are always behind in the count (since there is surplus value in throwing a ball, rather than a strike). Thus, pitchers who exhibit a key skill (getting ahead) would show a lower edge%, which will add a lot of noise to the data.
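A toy calculation can show how the count mix alone moves edge%. In the sketch below, two hypothetical pitchers locate identically within each count and differ only in how often they reach 1-2 versus 3-1. The 0.15-foot band around the plate edge, the function names and every location are assumptions made for illustration; they are not Petti and Zimmerman's definition of Edge% and not real data.

```python
EDGE_FT = 0.7   # plate edge, as marked on the graphs
BAND_FT = 0.15  # hypothetical half-width of the "edge" band


def on_horizontal_edge(dx_ft):
    """True if the pitch is within BAND_FT of either edge of the plate."""
    return abs(abs(dx_ft) - EDGE_FT) <= BAND_FT


def edge_pct(pitches):
    """pitches: list of (count, delta_x_ft) tuples."""
    return sum(on_horizontal_edge(dx) for _, dx in pitches) / len(pitches)


# Invented locations: chase pitches off the plate at 1-2, strikes at 3-1.
in_1_2 = [("1-2", dx) for dx in (1.1, 1.0, 0.7, 1.2)]   # 1 of 4 on the edge
in_3_1 = [("3-1", dx) for dx in (0.7, 0.0, 0.6, -0.7)]  # 3 of 4 on the edge

often_ahead = in_1_2 * 3 + in_3_1    # mostly in 1-2 counts
often_behind = in_1_2 + in_3_1 * 3   # mostly in 3-1 counts

print(edge_pct(often_ahead), edge_pct(often_behind))
```

With these made-up inputs the pitcher who is usually ahead posts the lower edge% despite locating exactly the same way in each count, which is the noise described above.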

The next set of pictures contrasts Mark Buehrle with Clayton Kershaw and Max Scherzer.

porat-5
  • Buehrle can be very effective when pitching on the outside, with above average run prevention all the way out to almost 0.5 feet off the outside edge. His zone of effectiveness is essentially ±0.5 feet from the outside edge; everywhere else it’s pretty neutral.
  • For Kershaw, it doesn’t matter where he throws it horizontally; he is well above average almost everywhere. He appears to be slightly better off on the inside, but the difference is negligible. The best pitcher in baseball, surprisingly, doesn’t need to rely on pitching to the edges of the plate to be dominant.
  • Scherzer has a clear trough starting around -0.2 feet all the way out to 1.1 feet. If you’re Scherzer, you’re not going to care about the edge (esp. the inside edge), you’d rather aim for the outer middle half (+0.35 feet) to maximize the probability of having a pitch land in the sweet spot.
  • I want to point out that this of course ignores the game theory concept of batters sitting on a specific location, so clearly a pitcher will have to randomize their locations enough so that a hitter can’t optimize their swing for a specific location. This could help explain why Kershaw is so much more dominant – it’s hard to read the small writing, but Kershaw has more pitches on the inside edge than the outside, as opposed to most pitchers (including Scherzer) who throw a lot more pitches on the outside. This essentially makes Kershaw less predictable. I digress a little, but the point I want to make is that all of these value graphs that I am showing do not take into account how predictable a pitcher is at throwing a pitch in a certain spot; obviously always throwing on the outside is not beneficial, but there is probably an ideal mix across the horizontal spectrum.
  • What the above graphs show is that for certain pitchers, Edge% will be informative (Buehrle would be a good example, at least on the outside edge), but for others, it can be very misleading, as they accrue no benefit from pitching to the edge of the zone.

Some Notes on the Effects of Pitch Types

porat-6

Take a look at the dark purple line, which indicates the probability of a called strike. You’ll notice that four-seam fastballs have a huge spike around the outside edge of the plate, a pattern not shared with the other three pitch types shown. It’s also interesting to point out that fastballs and changeups are similar to each other in their value curves, as are sliders and curveballs. These curves would suggest that four-seam fastballs and changeups should be thrown off the plate, whereas curveballs and sliders are definitely better on the outside edge, but can also generate value on the inner half.

Conclusion

At the macro level, when we look at the entire population of pitchers, there is a distinct and very strong correlation between the horizontal location of a pitch and the outcome of that pitch. This is clearly apparent with the smooth curves throughout the horizontal spectrum. There is a clear benefit to pitching towards the outside edge, but this benefit is largely confined to within ±0.3 feet of the outside edge. Pretty much all pitch outcomes are better for the pitcher on the outside, except for the probability of giving up a single. The inside edge has surplus run-prevention value, but you are better off throwing it from +0.2 feet to 0.9 feet from the center of the zone. Hitters do best when swinging at pitches roughly halfway between the inside edge and the center of the zone.

Inside all these data live a lot of noise, specifically the effects the count has on pitch value, as well as the specific effectiveness of an individual pitcher across the horizontal spectrum of the strike zone. This allows for general conclusions to be drawn, but not necessarily specific conclusions about specific pitchers.

Eli Ben-Porat is a Senior Manager of Reporting & Analytics for Rogers Communications. The views and opinions expressed herein are his own. He builds data visualizations in Tableau, and builds baseball data in Rust. Follow him on Twitter @EliBenPorat, however you may be subjected to (polite) Canadian politics.
2 Comments
Peter Jensen
8 years ago

I find it more helpful to have reference lines drawn at the edge of the rule book strike zone rather than the edge of the plate. Since PITCHf/x gives the location of the center of the ball as it passes the front of the plate, and a strike is any part of the ball passing over the plate, half of the diameter of the ball should be added to the edge: .83 feet instead of .70 feet.

I can enlarge the graphs that are presented individually so they are readable, but the graphs that are presented in groups are still not readable after enlargement. This is probably an artifact of my very old computer, but it is a significant obstacle to understanding if it is also happening to others. In general I find your graphs confusing because of the multitude of lines on each one. Max’s article drives home each of his points by having a separate graph for each with no more than two lines.

Max, in the first graph of his presentation, showed the significant differences in pitches thrown to the same locations by same- and opposite-handed pitchers. This is particularly important when you are looking at specific pitch types. But you have made no note of it in your post.

Not limiting your pitches to those that are located within the strike zone vertically lessens the impact of their differences in horizontal location. Some of the lack of significance that you found can be attributed to this and the aforementioned lack of discrimination between same and opposite handed pitchers.

Eli BP
8 years ago
Reply to  Peter Jensen

Thanks for the feedback Peter. First time article, we’ll work out the kinks moving forward. In future articles with complex visualizations, we’ll either break them down a little, or post a Tableau public to make them more accessible.

I looked at this within the strike zone as well, and while this did change the impact, the shape of the curves was exactly the same (called strikes moved up, called balls moved down, etc.). Handedness was also not too significantly different, though it would have been interesting to show the cross-section of same vs. different for the various curves. What I was going for was more to illuminate the dynamic effects of count, location and pitcher on outcomes and how they all follow very distinct curves.