Are The Umpires At It Again?

Pitchers have been throwing to different parts of the plate than they did last year. (via Brian Mills)

Editor’s Note: This piece was initially given as a presentation at the marvelous 2016 Saberseminar.

Back in 2014, it became clear that a large portion of the decrease in run scoring during the 2000s, as much as 40 percent, could be attributed to umpires expanding their strike zone downward by about three inches. This was confirmed by three independent researchers: Jon Roegele here at THT, Ben Lindbergh at Grantland, and yours truly. The trend toward a larger strike zone had started around 2009 and continued through 2014. Prior to the 2015 season, Jeff Passan reported it was something the league would look into, particularly if the low-scoring games were less interesting to fans. But, as Jon showed, the strike zone expansion continued as the 2015 season began.

Beginning last August, however, home runs started to increase and offense made a small comeback. In 2016 that trend has continued, and data from MLB’s new Statcast system show exit velocity is to blame: The ball is coming off the bat harder than it was last year. This has been well documented by others.

The increase in exit velocity and home runs has led to various theories about the return of steroids or “juiced” baseballs. These are pretty serious accusations. The former implies impropriety among players. The latter has a precedent I’m sure Rob Manfred would prefer to avoid: It cost Ryozo Kato, the commissioner of Nippon Professional Baseball in Japan, his job.

Most recently, Rob Arthur and Ben Lindbergh put forth some interesting evidence regarding the juiced-ball theory over at FiveThirtyEight. But Alan Nathan presented some evidence here at THT that was inconsistent with the juiced-ball theory. The investigation into the juiced ball seems fueled, at least to some extent, by Jon Roegele’s findings after the 2015 season, which concluded that the strike zone wasn’t changing much.

Contrary to Jon’s conclusion, my own data hinted at something different. When I looked at the average height of called strikes–an admittedly surface-level look–it seemed to be ticking upward in the latter part of the 2015 season.

Pairing that with Alan’s more recent presentation, I was rather skeptical that the juiced ball was a good explanation of the home run and scoring increases. But I wanted a larger sample than the last two months of the 2015 season to follow up on Jon’s findings with the strike zone.

So I decided to investigate for my Saberseminar talk. I figured I could show the evidence that, as before, the umpires are a large culprit in all of this. I returned to the data last month, adding games through July 2016. The trend in called-strike height seemed to continue.

LOESS of Daily Average Called Strike Height
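
For readers who want to build a curve like this themselves, here is a minimal Python sketch of the two steps the caption names: average the called-strike height by day, then smooth the daily averages with a LOESS-style local linear fit. It is not the R code behind the figure. The function names, the tricube weighting, the `span` value and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def daily_mean_heights(called_strikes):
    """Average called-strike height per day from (day_index, height_ft) pairs."""
    by_day = defaultdict(list)
    for day, height in called_strikes:
        by_day[day].append(height)
    days = sorted(by_day)
    return days, [sum(by_day[d]) / len(by_day[d]) for d in days]


def loess(xs, ys, span=0.5):
    """Local linear regression with tricube weights, evaluated at each x."""
    n = len(xs)
    k = min(n, max(2, int(round(span * n))))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        bandwidth = dists[k - 1] or 1.0  # distance to the k-th nearest point
        weights = []
        for x in xs:
            u = abs(x - x0) / bandwidth
            weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
        sw = sum(weights)
        mx = sum(w * x for w, x in zip(weights, xs)) / sw
        my = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Toy (made-up) input: (day index, called-strike height in feet)
toy = [(1, 2.40), (1, 2.50), (2, 2.45), (3, 2.50), (3, 2.60), (4, 2.55), (5, 2.60)]
days, means = daily_mean_heights(toy)
smoothed = loess(days, means, span=0.6)
```

A larger `span` gives a smoother curve; with real data each day contributes one point, so roughly a season and a half of games would feed the smoother.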

It would seem reasonable to expect much of this change would be caused by umpires cutting down on strikes at the knees. But it also could be a selection bias issue or simply an expansion of the zone up high. So let’s take a closer look. The obvious question from all of this is: What does the strike zone look like now, and has this impacted run scoring since the 2015 All-Star Game?

I’ve been modeling the strike zone with PITCHf/x data since about 2010, always with the same non-parametric method: a generalized additive model (GAM). GAMs are relatively flexible and allow us to make pretty pictures of the zone as well as control for various factors related to changes in the strike zone (like the ball-strike count). I’ll start with a simple GAM of the strike zone to get you acquainted with the visuals, and I’ll tell the rest of my story mostly in pictures. (Note: You can see enlarged versions of any of the following visuals if you open them in a new tab.)

Strike Zone in the Pre-All Star Game 2015 Era

Notice that the darker the red, the more likely a pitch will be called a strike. The darker the blue, the less likely it will be called a strike in that location. There’s nothing particularly surprising here. Pitches down the middle are almost always called strikes. And pitches five feet off the ground and two feet inside are never called strikes.

But comparing these figures across time periods is a bit difficult, especially with small changes. It’s much easier to plot the changes in the strike zone for data prior to the 2015 All-Star Game (the “PreASG15” era, starting at the beginning of 2015) and data from after the 2015 All-Star Game (the “PostASG15” era continuing through July 23, 2016).

In other words, I subtract the probability that a pitch, given its location, is called a strike in the PreASG15 era from the probability that same pitch is expected to be called a strike in the PostASG15 era. The result is below.
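
The subtraction itself is simple once each era has a strike-probability surface. The Python sketch below uses raw called-strike rates in half-foot grid cells as a crude stand-in for the GAM predictions, so it illustrates only the differencing step, not the model. The cell size, the function names and the toy pitches are assumptions.

```python
import math
from collections import defaultdict


def strike_rate_by_cell(pitches, cell_ft=0.5):
    """Empirical called-strike rate per grid cell.

    pitches: iterable of (px_ft, pz_ft, is_called_strike) for taken pitches.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [strikes, total]
    for px, pz, is_strike in pitches:
        cell = (math.floor(px / cell_ft), math.floor(pz / cell_ft))
        counts[cell][0] += int(is_strike)
        counts[cell][1] += 1
    return {cell: s / n for cell, (s, n) in counts.items()}


def rate_change(pre, post):
    """Post-era minus pre-era rate, for cells observed in both eras."""
    return {cell: post[cell] - pre[cell] for cell in pre.keys() & post.keys()}


# Toy (made-up) taken pitches, all falling in the same grid cell
pre_pitches = [(0.6, 1.6, True), (0.7, 1.7, True), (0.8, 1.8, True), (0.9, 1.9, False)]
post_pitches = [(0.6, 1.6, True), (0.7, 1.7, True), (0.8, 1.8, False), (0.9, 1.9, False)]

change = rate_change(strike_rate_by_cell(pre_pitches), strike_rate_by_cell(post_pitches))
```

Negative values mean a location is called a strike less often in the later era. A smooth model such as a GAM does the same comparison without the blockiness and noise of raw cell rates.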

Change in Called Strike Probability After 2015 All-Star Game

Notice the scale is similar to the previous plot. As the probability of a strike decreases in the PostASG15 era, relative to the PreASG15 era, it is darker blue. On the other hand, if the probability of a strike call increases in the PostASG15 era, it is darker red. White or light grey is a neutral color, meaning there is no change from the PreASG15 to the PostASG15 era in strike probability.

The story is relatively clear. For both right-handed and left-handed batters, the probability of a strike call on low pitches has decreased substantially. (The base rate of called strikes for these pitches ranges from 20 to 60 percent.) In some cases, the probability of a low outside strike has decreased by as much as 12 percentage points. Much of this difference is low and outside, pitches we know to be more difficult to hit with high exit velocities or for home runs.

Interestingly, there are also more strikes being called up in the zone. While this would mitigate some of the decrease in the total size of the strike zone, it could increase the rate at which balls are hit in the air, possibly leading to more home runs.

As an academic interested in economics and incentives, my first thought was that the players must have recognized this change and adjusted their behavior strategically. For example, we might expect these changes to induce pitchers to throw up in the zone and over the plate more often. With more pitches over the plate, it’s possible batters are squaring the ball up more consistently since they don’t have to worry about those low, outside pitches as much. The behavioral change would be easy enough to confirm in the data.

To identify changes in pitch location, I switched to using kernel density estimation, which simply evaluates the proportion of pitches in a given location, rather than the probability of calling those pitches strikes (or some other event), given location. I use the same differencing method as with the strike zone, calculating the change in the rate that pitchers throw to certain areas of the strike zone in the PostASG15 era. The result is visualized below.
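
As a rough illustration of that density differencing, the Python sketch below evaluates a two-dimensional Gaussian kernel density estimate for each era at a few grid points and subtracts one from the other. The bandwidth, the helper names `kde_at` and `density_change`, and the toy locations are assumptions; this is not the estimator or the settings used for the plots.

```python
import math


def kde_at(points, x, z, bw=0.25):
    """Gaussian product-kernel density estimate at (x, z), per square foot."""
    norm = 1.0 / (2.0 * math.pi * bw * bw * len(points))
    total = 0.0
    for px, pz in points:
        total += math.exp(-((x - px) ** 2 + (z - pz) ** 2) / (2.0 * bw * bw))
    return norm * total


def density_change(pre_points, post_points, grid, bw=0.25):
    """Post-era minus pre-era density at each (x, z) grid point."""
    return {
        (x, z): kde_at(post_points, x, z, bw) - kde_at(pre_points, x, z, bw)
        for x, z in grid
    }


# Toy (made-up) pitch locations in feet: horizontal offset, height
pre_locs = [(-0.5, 2.0), (-0.4, 2.2), (0.1, 2.5)]
post_locs = [(-0.1, 2.0), (0.0, 2.2), (0.1, 2.5)]
grid = [(-0.5, 2.0), (0.0, 2.0)]

diff = density_change(pre_locs, post_locs, grid)
```

Because each density integrates to one, the difference shows where the share of pitches has moved, independent of how many pitches each era contains.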

Change in Pitch Location Density After 2015 All-Star Game

The changes here are again rather clear. The reduction in called strikes on the low-outside corner has induced pitchers to throw to that location less often. That has moved pitches inward toward the plate, and the inner half is being targeted more often than it was in the PreASG15 era. These changes are pretty subtle at an extra pitch or two per game.

But given the increase in pitches on the inner half of the plate, we should see more swings and contact there as well. Again using kernel density visuals, this is evident in the data.

Change in Swing Location Density After 2015 All-Star Game

Change in Contact Location Density After 2015 All-Star Game

The net effect here is around one additional contacted pitch on the inside half per game, and an average contact point between one tenth and one quarter of an inch closer to the center of the plate. Should we expect these small changes to result in home run increases?

As I did with the probability of a strike call, I use a GAM to estimate the probability of hitting a home run when the batter swings, conditional on the location of the pitch. And, as I suspected, the most common location for home runs aligns almost perfectly with the locational increases in pitches, swings, and contact in the PostASG15 era.

Probability of Hitting a Home Run on a Swing (GAM Predictions)

Given all the evidence here, my next step was to apply this in the context of Simpson’s Paradox. The changes in the rate of pitches in locations that result in higher exit velocities and home run rates, when averaged in aggregate, could result in what looks like an increase in how hard the ball is coming off the bat globally. In other words, if we reallocate the proportion of contact locations with the same associated exit velocities, can we explain the increased average aggregated exit velocity?

To do this, I broke the zone down into 36 separate six-inch by six-inch boxes, as you see in the grid below. I then averaged the exit velocity of batted balls in each zone in the PreASG15 era and calculated a weighted average exit velocity using the PostASG15 proportions of contact in each zone. If the overall average exit velocity computed from the PostASG15 era proportions paired with the PreASG15 era exit velocities came close to the observed PostASG15 average, then we could conclude the increase is largely due to changes external to a juiced ball.

Generic Grid of Discrete Zone Breakdown

From this grid, we take the average exit velocity for Zone 1 (top left) in the PreASG15 era and multiply it by the proportion of contacted pitches in Zone 1 in the PostASG15 era. We do the same for Zone 2, Zone 3, and so on. These estimates in each zone are just a discrete version of the density plots (contact proportion) and GAMs (home run rate, or in this case, exit velocity).
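
The re-weighting described above can be sketched in a few lines of Python. The three zones, their velocities and their contact shares are made-up numbers chosen only to show the arithmetic (the actual exercise uses 36 zones), and the function names are hypothetical.

```python
def weighted_mean_velocity(velocity_by_zone, share_by_zone):
    """Average exit velocity across zones, weighted by contact share."""
    total_share = sum(share_by_zone.values())
    return sum(velocity_by_zone[z] * share_by_zone[z] for z in share_by_zone) / total_share


def explained_share(location_effect_mph, observed_change_mph):
    """Fraction of the observed change reproduced by re-weighting alone."""
    return location_effect_mph / observed_change_mph


# Toy (made-up) inputs for three zones instead of 36
pre_velocity = {1: 85.0, 2: 90.0, 3: 88.0}   # PreASG15 mean exit velocity, mph
pre_share = {1: 0.40, 2: 0.30, 3: 0.30}      # PreASG15 share of contact
post_share = {1: 0.38, 2: 0.33, 3: 0.29}     # PostASG15 share of contact

baseline = weighted_mean_velocity(pre_velocity, pre_share)
counterfactual = weighted_mean_velocity(pre_velocity, post_share)
location_effect = counterfactual - baseline  # change due to location mix only
```

Holding the zone velocities fixed at their PreASG15 values is what isolates the location effect: any difference between `counterfactual` and `baseline` comes only from where contact is made, not from how hard the ball is hit in a given spot.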

And doing this did result in an increased exit velocity estimate. However, the change was only about 0.055 mph, or 5.5 percent of the actual change of one mph between the PreASG15 and PostASG15 eras. But I wasn’t satisfied with this. There could be other considerations. (I won’t go through the mathematical gymnastics here, but the change in inside rate is about 0.1 to 0.3 percentage points in a given zone area. I’m happy to share the specific numbers with anyone interested.)

It’s also possible the ball-strike count has become more favorable to batters, and in turn, batters are swinging at pitches more often in 3-1 or 3-0 counts. So while the locational differences didn’t result in much, perhaps batters simply are more prepared to sit on pitches in these counts and, in turn, hit the ball harder on average. Going through the same re-weighting exercise, I accounted for another 0.025 mph. That brings the total increase explained to only eight percent.

I remained stumped and a bit more open to the idea that the manufacturing of the ball may have changed slightly. But there are a few additional considerations that could be contributors.

It’s clear batters have been hitting balls at more favorable launch angles for home runs. The figure below is from a GAM estimation that evaluates the change in probability a batter hits the ball in the “sweet spot” angle for home runs. Clearly, some things have changed here, too.

Change in Probability in HR Sweet Spot Launch Angle After 2015 All-Star Game

There are two important takeaways from this plot. First, hitting the ball at these angles should increase the probability of home runs and, in turn, explain some of the increase in scoring. But Alan Nathan shows us that it’s not enough to explain the increase in home runs alone. Second, if balls are being hit at more favorable angles more often, it may also be that batters are squaring them up better. There is some evidence this may be the case, as the variability in exit velocity has been reduced overall in the PostASG15 era (though, by only about two percent) as measured by the coefficient of variation.

The skewness of the exit velocity distribution has also changed (again, very slightly) in a way that implies more balls are being hit harder, but the hardest hit balls are not necessarily being hit harder. Again, this seems to indicate some change toward more consistent squared contact across the league. But it’s very small, and as Rob Arthur recently showed, some of this could be due to systematic missing data in the tracking system in the PreASG15 era.
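
For anyone who wants to check these two distribution summaries on their own data, here is a small Python sketch of the coefficient of variation and of skewness. The exact estimators used above are not specified, so the population (moment) versions here are an assumption, as are the function names.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def skewness(values):
    """Moment skewness: third central moment over the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)
```

Computing both for the exit velocities in each era and comparing them is the kind of check described above: a lower coefficient of variation means less spread relative to the mean, and a shift in skewness means the balance between weakly and strongly hit balls has moved.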

With more detailed, reliable data, one might be able to “reverse engineer” contact quality beyond just exit velocity and launch angle. For example, the batted-ball spin rate and direction may tell us more about the swing plane and how squarely balls are hit. Unfortunately, those data are not publicly available, and I’m told the results are still not particularly reliable.

Ultimately, even in the reverse-engineering scenario, it’s not clear we could account for the entire change. And it’s also not clear why the entire league so suddenly would change the way it approaches hitting. My hope is Alan Nathan can enlighten us all in the coming months regarding the exit-velocity puzzle.

So, after nearly 3,000 lines of R code and sifting through hundreds of thousands of observations, I explained very little. But sometimes that’s the fun of scientific inquiry. If we always made clear discoveries that could explain what was happening, we would be deprived of the challenge that makes the inquiry fun in the first place. It will be interesting to see how both offensive output and research on exit velocity continue to unfold.

What I do find fascinating, however, is that there are apparently some changes consistent with pitchers responding to incentives. The same behavior took place when the strike zone worked its way downward. If umpires continue to change the pitches they call strikes in different ways, it should be fun to keep tabs on how this induces strategic changes among pitchers.

23 Comments
dominik
7 years ago

Good article, however regarding high strikes causing more FBs:

Nathan states that FB rate is not up but the increase is due to harder contact.

League FB rate:
2014: 34.4
2015: 33.8
2016: 34.5

FBs clearly have not increased even though everyone talks about hitting fly balls (on Twitter the motto of the Internet hitting gurus is now “elevate and celebrate” :)).

Brian Mills
7 years ago
Reply to  dominik

Hi dominik,

There has been a pretty clear increase in the average exit angle, though this doesn’t necessarily mean there are more fly balls as categorized within the FB rate. It could mean that there are more line drives and less GB, or all the change could be happening within the already defined designations (going from 29 to 30 degree launch angle, etc.). So perhaps my statement wasn’t as precise as it should have been. As I noted in the article, the change doesn’t seem to explain the HR change. So I’m with you on that.

Jimmy Sweetman
7 years ago

Holy research paper, this article was fantastic!

I also noticed you have the same name as Liam Neeson in Taken. I am certain I’m not the first to comment on that.

Brian Mills
7 years ago
Reply to  Jimmy Sweetman

I used my very special set of R skills for this inquiry.

I will find you.

And I will thank you for the kind words.

phoenix2042
7 years ago

“The skewness of the exit velocity distribution has also changed (again, very slightly) in a way that implies more balls are being hit harder, but the hardest hit balls are necessarily being hit harder. Again, this seems to indicate some change toward more consistent squared contact across the league.”

Did you mean “but the hardest hit balls are [NOT] necessarily being hit harder”? I just wanted to clarify. This was amazing work! I learned a lot (and yet, so very little, but as you say, that’s the fun of it).

Brian Mills
7 years ago
Reply to  phoenix2042

Yes! That’s a typo. There should be a “NOT” there.

MGL
7 years ago

Excellent stuff Brian! So it sounds like the average exit velocity in each section of the strike zone IS up on the average which is important information. IOW, I want to know if average exit velocity is up using the delta method per section of the strike zone. So, up and in, avg. exit speed before is 80 mph, after is 81, # pitches in that zone is 1000. Up and middle, avg. exit speed is 85 before and 87 after, 500 pitches. So far, we have ((81-80) * 1000 + (87-85) * 500) / (1000+500). That’s what I mean by the delta method. That way we control for changes in location. I would also restrict to fastballs to control for changes in pitch frequencies. I would also restrict to certain counts and maybe even certain batters and pitchers or use the delta method holding each of those variables constant.

Brian Mills
7 years ago
Reply to  MGL

Hi MGL,

Thanks. Yes, it is up in most areas of the strike zone. Though the differences vary pretty considerably by zone.

All I did here was assume the same velocity as before, but with the new proportions (naive calculation). I do it precisely as you lay out here, but using the prior velocity rather than the change in velocity. Using the change should put us right at the aggregate velocity increase (I think…right?).

I think there is a lot of room for a further breakdown though (especially in thinking about fastballs, etc.). I didn’t get too deep into it since the estimated effect was so small (same with the count-based inquiry). But it might be interesting.

Guy
7 years ago

Really nice work, Brian. And a great blow against publication bias, since in the end you are (mostly) reporting a non-finding.

Two questions, and a theory:

Question: Does the reported 1 mph increase in exit velocity account for the big drop in sac bunts? If not, that might account for another 10-15% of the increase.

Question: Has anyone looked at hitters who have especially increased their HR% in past 1.5 seasons, to see if they share any particular characteristics, or have made any common change in hitting approach? It might be worth looking to see if they are more heavily shifted as a group, or have increased their strikeout rates or changed their swing trajectory. If they differ in any way, it might shed light on what is happening.

Theory: Like many others, I’m skeptical of any theory that relies on lots of hitters suddenly changing their approach/behavior at the same time. But if Alan is right about the ball, then given your findings we have to consider this possibility. Maybe a substantial number of hitters responded to the strike zone shift, along with the rise in defensive shifts (and maybe a slightly-juiced ball?), by consciously swinging more aggressively for HRs. That could mean a change in swing angle and/or velocity. If some hitters did this and were successful, others would quickly copy them. The fact that we are seeing more strikeouts, and more swinging strikes, in 2016, despite a smaller zone, is consistent with the idea that some hitters are being more aggressive.

Brian Mills
7 years ago
Reply to  Guy

Thanks Guy. Good questions.

On #1, I remove all bunts from my data (foul balls, too). I was worried that might affect results. It doesn’t seem to do much for the overall exit velocity (though maybe I should double check, since my data shows a 1 mph increase, while others have reported a 1.5 mph increase…so maybe some of it is my removal of bunts? Depends on how others treated them. I think I get a smaller velo change than Alan because I use data from months with higher exit velos in general (summer months) in 2015 and compare to only pre-ASB games in 2015, which are mostly spring games).

On #2: I feel like I saw something on that with the other known change (the drop in the bottom of the zone). It’s not specific to the recent change, but I’m actually in the midst of working on an academic paper on heterogeneous effects of strike zone changes. I’m pretty preliminary on the analysis though (as in, still trying to merge some data sets).

Yes, I think approaches are evolving overall. Over on Twitter I posted a graphic where it shows it can be difficult to figure out whether things were a sudden shift or a gradual change. There is a clear change point (post ASB 2015), but it’s not clear if it’s a level shift or a trend shift (evidence depends on the measure you’re looking at).

Maybe Alan can chime in. I think he’s doing some work on this swing plane/contact quality stuff. I’d love to be able to try and get data for this “reverse engineering” idea to see approaches with swings.

Steve
7 years ago

Amazing work, great visuals

Don Smythe
7 years ago

Really nice work, Brian.

But one other possibility (I believe Bill James proposed this theory in his second Abstract). Is one reason for the uptick in home runs because it’s gotten more difficult to put together a sequential offense (i.e. it’s easier to swing for the fences instead of stringing together three singles)?

IIRC, James (writing during one of the great hitters’ eras), wrote a few graphs suggesting that if baseball raised the mound to dampen hitting, the result would be lower batting averages and more hitters swinging for the fences. A bigger strike zone would likely have the same effect: When it’s harder to link hits together, it’s a better investment to swing for the fences.

Brian Mills
7 years ago
Reply to  Don Smythe

I honestly have no idea if that’s the case, but that’s a great behavioral economics idea! Seems worth looking into. I’d love to do so at some point.

dominik
7 years ago

I do believe that there now is more emphasis on getting the ball in the air, even among small middle IF types. they used to only “allow” big guys to try to hit homers but speedy guys were told to hit the ground.

but now even smaller guys are allowed to trade a few hits for more power, for example trea turner said he wants to hit flies even though his coaches said to hit it on the ground and run. also the twins were slammed for telling buxton to hit it on the ground.

there is a changing mentality but that doesn’t explain why it happened overnight. basically pre 2015 ASG the HR were still normal and then they exploded.

a league wide evolution would have taken at least 4-5 years.

also there was the theory about powerful rookies (correa, bryant, schwarber, sano…) and that certainly was true (outstanding rookie class offensively) but also a lot of old hitters had big HR seasons (Pujols, Cruz).

Tim
7 years ago

Very informative article. What R package(s) did you use for the GAMs and kernel density estimates?

Brian Mills
7 years ago
Reply to  Tim

I estimate GAMs with the mgcv package with bam() and the parallel package, and two dimensional kernel density with kde2d. For the visuals, I do some combinations of filled.contour with RColorBrewer.

Some info on my old blog here on the estimation of GAM models for strike zones: http://princeofslides.blogspot.com/2013/07/advanced-sab-r-metrics-parallelization.html

Brian Mills
7 years ago
Reply to  Brian Mills

I should note that my code for visualization is adapted very much from earlier work by Dave Allen.

Eric
7 years ago

Lovely analysis, and great visuals. Thank you!

But are we certain this isn’t due to chance? You could simulate data by bootstrapping from the population of all the pitches, and see how often we see a change like this one, in called balls and strikes, between the simulated pre-ASB2015 pitches and the simulated post-ASB2015 pitches. First, you could eyeball the heatmaps of differences, to see if the random simulated ones often give rise to what look like meaningful patterns; and maybe a crude calculation, what proportion of the time do we see this extreme a change in proportion of called strikes above the middle of the strike zone, minus (or over?) proportion of called strikes below the middle of the strike zone. Just a first thought.

Brian Mills
7 years ago
Reply to  Eric

Thanks Eric.

That would certainly be possible, and these are small changes. That said, the original inquiry was to evaluate whether the locational change (whether random or systematic) would be enough to explain the exit velocity changes. It doesn’t seem to be.

The robustness of the changes for both RHB and LHB give me a bit more confidence (not just a shift in one direction due to a measurement change in the data) in the data moving in a direction consistent with the apparent zone changes. But it’s worth taking a more rigorous probabilistic view on the location changes themselves, for sure. Data are available, and I would encourage anyone and everyone to try different methods to see if my results here hold. Maybe it’s random.

pft
7 years ago

LOL. The hoops people jump through to avoid the juiced ball theory. Sometimes the simplest explanation is the best explanation. Occam’s Razor. The HR/G rate in the AL is the highest ever recorded in the AL in history including the height of the steroid era (also believed by some to be juiced ball era as well). HR’s are up 15% over last year and 30% over 2014. BABIP is up as is exit velocity, so it’s not all HR’s.

Even Nathan’s data showed exit velocity was up for all batted ball types. His one comment that cast doubt in his mind on the juiced ball theory was that balls hit from 0-10 degrees showed a smaller increase than he expected, but even this data showed an increase. The 538 presented a far more convincing case for the juiced ball.

Sure the strike zone may be up a bit, but explain how guys and teams who have always lived up in the zone have been affected the most by a jump in HR rates where they are getting more calls, especially in the AL where the HR rate seems more pronounced. More of their Fly Balls are going for HR than before. Pitchers who have lived down in the zone are still having good years despite not getting as many calls, because these guys tend to get batters to chase more out of the zone (eg Tanaka).

The data in the NL does not seem as extreme for whatever reasons, maybe pitchers dampen that for some reason.

We all know that the MLB spec for balls is so wide that at the upper end of the spec balls will travel 40-50 ft further in the 400 ft test than balls at the lower end of the spec. Even w/o a conspiracy to juice the balls, it’s not unheard of for manufacturing variations due to equipment or material changes (balls are made in Costa Rica), some of which might be out of the manufacturer’s control, to account for some annual variation in quality and some drifting of the COR (up or down) from year to year. Since the balls would still be within MLB spec, there is no breach of rules (unlike in Japan). I suspect big jumps in HR rates in 1987 and the late 70’s were due to changes in the ball manufacturers or locations (Rawlings made exclusive in the 70’s and moved from Haiti to Costa Rica in the late 80’s). So this wouldn’t be the first time.

I am sure MLB does extensive testing since every company that orders large quantities of anything has them tested regularly, so they would know where the balls lie within the spectrum of their tolerances. I am sure the tolerances they specify in their orders are much tighter than the rules. What they are, nobody knows. Journalism is dead in this country. In any event, no transparency.

Maybe that Hampel guy who collects all the foul and HR balls can send them out for testing and report on it. His book on balls is one of the more illuminating things written on baseball equipment which MLB seems to like to keep us in the dark about.

Also, looking at some of the H-A team splits, I do wonder at how well MLB’s rules to precondition balls at standard conditions is being enforced. Again, we have little transparency at how MLB enforces this and how teams are conditioning the balls before games.

Like with all things in life, a complete explanation may be multi-factorial, but I am sticking with the ball as the most likely explanation

Brian Mills
7 years ago
Reply to  pft

“Sure the strike zone may be up a bit, but explain how guys and teams who have always lived up in the zone have been affected the most by a jump in HR rates where they are getting more calls…”

I can’t explain much of the change in HR at all. That’s the entire point of the article.

Here, I’ll CliffsNotes it for you:

“So, after nearly 3,000 lines of R code and sifting through hundreds of thousands of observations, I explained very little. “

evo34
7 years ago

I’m not claiming to be able to explain the apparent suddenness of change, but can anyone remember a time in the AL with so few young stud starting pitchers? The top 3 under-26 AL SPs are probably Aaron Sanchez, Michael Fulmer and Marcus Stroman. Off-hand, that trio seems historically weak — even adjusted for the higher offensive environment (if there is one). And the AL minor league pitching talent is practically non-existent.

I.e., I could see AL run scoring increase even more next year.
