The strike zone advantage for the home team
by James Gentile
September 13, 2013
Last week I saw a tweet citing the 2011 book Scorecasting, which reported an overwhelming strike zone advantage for the home team. This is something fans have long suspected to be true, and as a consequence there have been a number of studies investigating the phenomenon.
In fact, just prior to the release of the book, THT's own John Walsh did some work for the 2011 Hardball Times Annual that dealt with this exact topic. John's article touched on a number of situations where the size and shape of the strike zone are likely to fluctuate (for more on this topic read Jon Roegele's recent piece at BP), and it included evidence of a clear but limited advantage for home team pitchers. John found that "the strike zone is about 2.5 percent smaller for the visiting team," a difference that ultimately accounted for "a little more than a third of the home-field advantage."
Jesse-Douglas Mathewson also looked into Scorecasting's results during his time at Beyond the Box Score, and found that "the home team undoubtedly has an advantage when it comes to the strike zone, but this advantage just isn't very big." Jesse also estimated about a 2.4 percent advantage for home field pitchers.
So we've been shown in several ways that our assumptions are correct: umpires do show a bias toward home team pitchers. But at just two and a half percent, this effect seems much smaller than some of us might have guessed.
Now, I haven't read Scorecasting, so I apologize to its authors if I'm mischaracterizing their findings (I have since ordered a copy, however). But I did read that their conclusions were drawn from the discovery that the advantage was most prominent during the game's most critical moments. In other words, umpires were more biased toward the home team when the game was on the line.
To this point, I found a quote from the book referenced at Phil Birnbaum's Sabermetric Research blog:
"If the umpire is going to show favoritism to the home team, he or she will do it when it is most valuable -- when the outcome of the game is affected the most."
So I decided to try to illustrate this alleged leverage-influenced advantage by breaking down the strike zone across various game states, mostly just to satisfy my own curiosity on the matter.
I started with Retrosheet data from 2002-2012 (only because it was immediately accessible) and looked at called strikes as a share of all pitches taken (non-swings) for both home and away pitchers. Overall, home team pitchers saw a marginally better rate of called strikes per take, 32.3 percent to the visitors' 31.7 percent. That gap is significant, though it hardly unearths evidence of a grand umpire conspiracy.
But sure enough, as I filtered the data by increasingly critical game states, I found that the advantage grew noticeably the more crucial the situation:
| Split | PA | Home Str/Take% | Away Str/Take% | Difference (pct. points) |
|---|---|---|---|---|
| <=1-run game, 9th inn | 52904 | 31.3% | 30.3% | 0.92 |
| Tie game, 9th inn | 21405 | 30.3% | 29.3% | 1.09 |
| Tie game, 9th inn, RISP | 6166 | 27.3% | 25.2% | 2.11 |
| Tie game, 9th inn, Bases loaded | 798 | 29.7% | 27.2% | 2.48 |
What is an advantage of roughly half a percentage point overall balloons slowly but surely as we focus our attention on more critical late and close scenarios. Then, as we require more base runners in a tie game in the ninth inning, we see that the advantage grows even further.
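For anyone who wants to tinker with the same rate, here is a minimal Python sketch of called strikes per take. It assumes Retrosheet-style pitch sequence strings in which `C` marks a called strike and `B` a called ball, and it counts only those two codes as takes. The input layout and the function name are hypothetical, and this illustrates the idea rather than reproducing the exact query behind the table above.

```python
from collections import Counter

# Assumption: Retrosheet-style pitch codes, where "C" is a called strike
# and "B" is a called ball. Every other character (swings, fouls, balls
# in play, intentional balls, pickoff markers) is ignored in this sketch.
CALLED_STRIKE = "C"
CALLED_BALL = "B"


def called_strike_rates(plate_appearances):
    """Return called strikes per take, in percent, for home and away pitchers.

    plate_appearances: iterable of (pitcher_is_home, pitch_sequence) pairs.
    This layout is hypothetical; adapt it to however the events are stored.
    """
    strikes = Counter()
    takes = Counter()
    for pitcher_is_home, sequence in plate_appearances:
        side = "home" if pitcher_is_home else "away"
        for code in sequence:
            if code in (CALLED_STRIKE, CALLED_BALL):
                takes[side] += 1
                if code == CALLED_STRIKE:
                    strikes[side] += 1
    # Skip a side entirely if it has no taken pitches (avoids dividing by zero).
    return {
        side: 100.0 * strikes[side] / takes[side]
        for side in ("home", "away")
        if takes[side]
    }


# Made-up sequences, purely to show the shape of the input.
sample = [
    (True, "CBFX"),   # home pitcher: 1 called strike, 1 ball
    (True, "BCCS"),   # home pitcher: 2 called strikes, 1 ball
    (False, "BBCX"),  # away pitcher: 1 called strike, 2 balls
    (False, "BFBX"),  # away pitcher: 0 called strikes, 2 balls
]
rates = called_strike_rates(sample)
print(rates)
```

Filtering the input to a game state (tie game, ninth inning, and so on) before calling the function would give the kind of split shown in each row of the table.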
Naturally, there are a number of things that can create some noise here. Obviously, this isn't controlling for the count (which certainly affects the size of the zone), umpire, pitcher, or batter-handedness. But mostly this technique is less than ideal because we aren't looking at an actual change in where strikes and balls are called, but rather just an estimate. To find bias in the actual size of the strike zone, we would need to look at PITCHf/x data.
So, using pitch location data going back to 2007, I measured two rates for home and away pitchers: the percentage of pitches outside the zone that were called strikes (OZCS%), and the percentage of pitches inside the zone that were called balls (IZBall%). (The 'zone' defined here is the so-called Mike Fast zone.) I also limited most of this query to 0-0 counts, to eliminate any potential bias with regard to counts, and excluded intentional balls.
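The two rates can be sketched in a few lines of Python. Note the assumptions: the rectangular zone below is a crude placeholder with illustrative dimensions, not the Mike Fast zone, and the tuple layout and function names are hypothetical. The coordinates are named after the PITCHf/x `px` and `pz` fields (horizontal and vertical location at the plate, in feet).

```python
def in_zone(px, pz, half_width=0.83, bottom=1.5, top=3.5):
    """Crude rectangular zone in feet. A placeholder, NOT the Mike Fast zone."""
    return abs(px) <= half_width and bottom <= pz <= top


def zone_rates(takes):
    """Return (OZCS%, IZBall%) for a collection of taken pitches.

    takes: iterable of (px, pz, called_strike) tuples (hypothetical layout).
    """
    out_total = out_strikes = in_total = in_balls = 0
    for px, pz, called_strike in takes:
        if in_zone(px, pz):
            in_total += 1
            if not called_strike:
                in_balls += 1
        else:
            out_total += 1
            if called_strike:
                out_strikes += 1
    ozcs = 100.0 * out_strikes / out_total if out_total else float("nan")
    izball = 100.0 * in_balls / in_total if in_total else float("nan")
    return ozcs, izball


def home_advantage(home_takes, away_takes):
    """Home-minus-away gaps in percentage points, as (IZBall%, OZCS%).

    home_takes / away_takes are pitches thrown by home / visiting pitchers.
    A negative IZBall% gap and a positive OZCS% gap both favor the home side.
    """
    home_ozcs, home_izball = zone_rates(home_takes)
    away_ozcs, away_izball = zone_rates(away_takes)
    return home_izball - away_izball, home_ozcs - away_ozcs


# Made-up pitches, only to show the calculation.
home = [(0.0, 2.5, True), (0.0, 2.5, False), (1.5, 2.5, True), (1.5, 2.5, False)]
away = [(0.0, 2.5, False), (0.0, 2.5, False), (1.5, 2.5, False), (1.5, 2.5, False)]
izball_gap, ozcs_gap = home_advantage(home, away)
print(izball_gap, ozcs_gap)
```

Swapping `in_zone` for a better zone definition leaves the rest of the calculation unchanged, which is the main reason to keep it as a separate function.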
As you might expect, the splits did mirror the Retrosheet findings:
| Split | avg LI | Count | Takes | Home advantage in IZBall% (pct. points) | Home advantage in OZCS% (pct. points) |
|---|---|---|---|---|---|
| 1-run game, 9th inn | 2.7 | 0-0 | 23378 | -2.63 | 1.10 |
| Tie game, 9th inn | 2.7 | 0-0 | 9434 | -4.44 | 1.17 |
| Tie game, 9th inn, RISP | 4.3 | 0-0 | 2133 | -3.31 | 2.13 |
| Tie game, 9th inn, Bases loaded | 6.4 | 0-0 | 304 | -6.53 | 3.62 |
Naturally, with just under five seasons of data, by the time we get to tie-game, ninth-inning, bases-loaded situations the sample has thinned dramatically, with just over 300 taken pitches to observe. Nevertheless, the trends here do suggest support for the theory that umpires substantially ramp up their bias with the game on the line. As we trek upward along the LI spectrum, we generally see home team pitchers get more called strikes outside the zone and fewer balls called inside the zone.
At our peak leverage state, with an average LI of 6.4, home team pitchers see 8.5 percent of their pitches outside the zone called strikes, while visiting pitchers get the benefit of the call just under five percent of the time. Add to that a six-and-a-half-point difference in balls called inside the zone, and we have ourselves evidence of a major game-changing bias.
Ideally, we'd like to use more than just 300 pitches to determine whether umpires need to curb their affection for the home crowd, but that will have to wait until more data from future seasons comes in. Still, this should serve as a fair starting point for such an endeavor.
Curiously, when I expanded this uber-critical tie-game, ninth-inning, bases-loaded state to include extra innings as well, the advantage in OZCS% increased yet again to a 4.5-point gap, while the advantage in IZBall% retreated to a 3.3-point difference (with a sample size of 758 takes). These numbers could certainly be jumping around because of noise within the limited sample, but it is a hunch of mine that extra innings feel less dire than the ninth to both fans and umpires, despite the similar LI scores. (Am I alone in having this impression?)
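To get a feel for why a few hundred pitches is thin, here is a rough Python sketch of a standard two-proportion z-test. The counts fed into it are hypothetical round numbers, not figures from the data above (the home and away breakdown of the 304 takes isn't given here), and the function names are my own.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for the gap between two proportions, using a pooled standard error.

    Assumes both samples are non-empty and the pooled rate is not exactly 0 or 1.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 9 called strikes on 100 out-of-zone takes for home
# pitchers versus 5 on 100 for visiting pitchers. These are NOT real data.
z = two_proportion_z(9, 100, 5, 100)
p = two_sided_p_value(z)
print(round(z, 2), round(p, 2))
```

With counts that small, a gap of several percentage points is hard to tell apart from noise, which is why the larger samples in the earlier rows carry more weight than the bases-loaded row.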
We can only speculate as to why this home field advantage grows so much in high-pressure situations. Presumably, as the crowd gets louder, it becomes more persuasive and old 'Blue' becomes less apt to disappoint it. I'll choose not to attempt to impress you with reckless insight from the hazy remains of just one or two psych 101 courses, but I would like to open up that conversation: What is motivating the umpire to please the home crowd?
Is it the fear of enraging thousands of people? Is it the allure of igniting an entire stadium with roars of delight after one pump of your own fist?
I'll admit this inquiry leads to a lot more questions than answers, and some of those questions I'd like to pursue in the future. For instance, are certain umpires more vulnerable to this impulse, and can we identify the greatest abusers?
I'd also like to investigate whether umpire bias increases when the outcome of the game will have a larger effect on the season. If umpires are more likely to please the home crowd when the in-game situation is dire, will they grant the home team even more of an advantage when there are postseason implications at stake? Or what about the postseason itself? Is there an even larger bias in favor of home teams in these hyper-critical games, including the World Series?
References and Resources
Thanks to Retrosheet and Fangraphs.
James Gentile writes about baseball at Beyond the Box Score and The Hardball Times. You can follow him on Twitter @JDGentile.