It’s been called the War Zone, the place where batters and pitchers
square off, the territory that each side has to conquer if victory is
to be assured: the strike zone. This is how the strike zone is defined
in the rulebook:
The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the knee cap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.
Straightforward, I guess, although certainly not easy for an umpire. How are they supposed to determine that midpoint between belt and top of shoulder, not when the batter takes his stance, but when he is “prepared to swing at a pitched ball”?
But the difficulty of calling balls and strikes is not exactly what I want to write about today. What I would like to address is this:
Are umpires accurately calling the rulebook strike zone or do they
have their own, unwritten, strike zone? We have all heard anecdotal
accounts of umpires calling strikes six inches off the plate and calling belt-high pitches balls.
In 2001, Sandy Alderson, then executive vice president of operations for MLB, initiated a campaign to get the umpires to call the
rulebook strike zone. He was especially keen to see the return of the
high strike, since virtually all pitches from the belt up were being
called balls. The implication of Alderson’s crusade is that umpires
were not calling the rulebook strike zone and needed to be
corrected.
Have they improved? Are they now calling the strike zone according to
the rulebook?
We can try to answer this question by actually measuring the strike
zone as called by major league umpires and comparing it to the
rulebook strike zone. This is of more than purely academic interest: one of
the reasons given for the huge increase in offensive production in the
1990s is the ever-shrinking strike zone. We can’t now go back and
see if that claim is accurate or not, but we can measure the strike
zone today and in the future and see how it changes and how it affects
offense.
So, let’s roll up our sleeves and get to work.
Strike zone: actual vs. rulebook
We can measure the dimensions of the strike zone using MLB’s pitch data,
which gives detailed location information on every pitch thrown in
the ballparks where the hardware is installed (about 80,000 pitches at
the time I’m writing this). We know for each pitch
the (x,y) position (horizontal and vertical) of where it crossed the
plate and we know whether the umpire called it a ball or strike. We
also know, thanks to the operator of the pitchF/x system, the lower
and upper limits of the strike zone for each individual batter.
I decided to split up the job into two pieces: first I’m going to see
how wide the strike zone is, then I’ll tackle the vertical size. The
first step is selecting pitches, either balls or called strikes, that
are safely within the vertical strike zone, i.e. they are clearly not
low or high. The picture on the right shows what I mean. Each pitch is represented by a
black circle, but there are so many of them that they merge into the
black band in the plot.
The red rectangle represents the strike
zone. I’ve mapped the vertical location of each pitch to an average
strike zone that goes from 1.6 feet to 3.56 feet. This removes the
batter-to-batter variation and allows me to show a single strike zone
in the plots. Home plate is 17 inches wide, but the horizontal
dimension of the strike zone is actually just a hair under 20 inches
(1.66 feet), because a strike is called if any part of the ball crosses
over any part of the plate.
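Here is a minimal Python sketch of the two pieces of arithmetic in this paragraph. It assumes a simple linear rescaling for the vertical mapping (the exact mapping isn't spelled out above) and a ball diameter of about 2.9 inches. The function names are made up for illustration.

```python
# Average strike zone limits quoted in the text, in feet.
AVG_BOTTOM_FT = 1.60
AVG_TOP_FT = 3.56


def map_to_average_zone(height_ft, batter_bottom_ft, batter_top_ft):
    """Rescale a pitch height from one batter's zone onto the average zone.

    ASSUMPTION: a linear rescaling, so the batter's own lower and upper
    limits land exactly on 1.60 and 3.56 feet.
    """
    if batter_top_ft <= batter_bottom_ft:
        raise ValueError("zone top must be above zone bottom")
    relative = (height_ft - batter_bottom_ft) / (batter_top_ft - batter_bottom_ft)
    return AVG_BOTTOM_FT + relative * (AVG_TOP_FT - AVG_BOTTOM_FT)


def zone_width_inches(plate_width_in=17.0, ball_diameter_in=2.9):
    """Width of the zone, measured at the center of the ball.

    The ball's center may sit one radius beyond each edge of the plate
    and still clip it, so the plate width grows by one full diameter.
    ASSUMPTION: ball diameter of roughly 2.9 inches.
    """
    return plate_width_in + ball_diameter_in
```

With these assumed numbers the width comes out at about 19.9 inches, which is the “hair under 20 inches” quoted above.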
As you can see, these pitches are all well within the vertical strike
zone, so if they are called a ball, we can be fairly sure that
they are considered either inside or outside by the umpire. Note that
I’ve avoided the upper part of the strike zone, just in case the umps
are calling that pitch high.
Next, I divide the data into bins of horizontal
position and see how many balls and strikes are called at each
position. As is often the case, it’s easier to show the results on a graph than to
describe them in words.
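The binning step can be sketched in a few lines of Python. This is only an illustration: the input format (pairs of horizontal position in feet and a "ball" or "strike" label), the default bin width and the function name are all assumptions, not a description of the actual data files.

```python
import math
from collections import defaultdict


def ball_fraction_by_bin(pitches, bin_width_ft=0.1):
    """Count balls and called strikes in bins of horizontal position.

    pitches: iterable of (x_ft, call) pairs, call being "ball" or "strike"
             (ASSUMED input format, for illustration only).
    Returns rows of (bin_center_ft, n_balls, n_strikes, ball_fraction),
    sorted from left to right.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [balls, called strikes]
    for x_ft, call in pitches:
        index = math.floor(x_ft / bin_width_ft)
        if call == "ball":
            counts[index][0] += 1
        elif call == "strike":
            counts[index][1] += 1
        else:
            raise ValueError(f"unexpected call: {call!r}")

    rows = []
    for index in sorted(counts):
        balls, strikes = counts[index]
        center = (index + 0.5) * bin_width_ft
        rows.append((center, balls, strikes, balls / (balls + strikes)))
    return rows
```

Only bins that contain at least one pitch appear in the output, so the ball fraction is always well defined.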
Lots going on here, so let’s take ‘er slow. The top plot shows the
number of balls (red) and called strikes (blue) at each
horizontal position. This plot, like all the others, is shown from the
catcher’s viewpoint. We see lots of strikes and few balls in the middle of the
plot and just the opposite away from the center, which is what we expect, of
course.
The bottom plot shows the fraction of called balls as a function of
the horizontal position. If the umpires and the pitch location data
were perfect, we’d expect the ball fraction to be equal to zero within the
rulebook strike zone (depicted by the vertical red lines) and equal to
one outside the red
lines. We see this general behavior with a few key differences.
For one, there are some balls called even when the data say the ball
is right down the middle, both vertically and horizontally.
These are either totally blown calls or
faulty data or a mixture of both (more on this point later). Secondly,
we don’t see an abrupt change where the ball fraction changes from
zero to one, but rather a continuous curve. That’s to be expected, of
course, since neither umpires nor pitch-measuring systems are perfect.
We can still define the width of the strike zone, however, even though
there is no sharp transition. We somewhat arbitrarily, but
reasonably, say that the edge of the measured strike zone corresponds
to the position where the ball fraction is one-half. On the graph this
is shown by the vertical green lines, whose positions are defined by
where the horizontal green line crosses the blue curve (or,
equivalently, where the blue and red curves cross in the upper plot).
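One way to locate the one-half crossing from binned ball fractions is linear interpolation between neighboring bins. The sketch below assumes that approach, with a hypothetical function name and made-up numbers in the usage example. It is not necessarily how the green lines in the plot were drawn.

```python
def half_crossings(centers, ball_fractions, level=0.5):
    """Positions where the ball-fraction curve crosses `level`.

    ASSUMPTION: straight-line interpolation between adjacent bin centers.
    centers must be sorted from left to right.
    """
    points = list(zip(centers, ball_fractions))
    crossings = []
    for i, (x, f) in enumerate(points):
        if f == level:
            # This bin sits exactly on the level.
            crossings.append(x)
        elif i + 1 < len(points):
            x_next, f_next = points[i + 1]
            # The curve is on opposite sides of the level in the two bins.
            if (f - level) * (f_next - level) < 0:
                crossings.append(x + (level - f) * (x_next - x) / (f_next - f))
    return crossings


# Made-up ball fractions, for illustration only (positions in feet).
centers = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
fractions = [1.0, 0.75, 0.25, 0.0, 0.25, 0.75, 1.0]
left_edge, right_edge = half_crossings(centers, fractions)
width_inches = (right_edge - left_edge) * 12
```

A noisy curve could cross the level more than twice, in which case you would want to smooth it or use wider bins before reading off the edges.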
The horizontal widths of the measured and rulebook strike zones
are shown by the green and red arrows. It appears from these data
that the umpires’ strike zone is about two inches too wide on each
side, compared to the rulebook strike zone. Oh, and all this is for
right-handed batters.
For left-handed batters, things are a bit different, as you can see in
the next plot.
When a lefty is up, the strike zone extends further out on the left side
(outside to a lefty), while it is just about right on the inside part of the
plate. The extra width on the outside amounts to about 4.5 inches. I
suspect that umpire positioning is responsible for the difference
between right- and left-handed batters, but I suppose it could also be
something in the data.
I’ve used the same procedure to measure the vertical size of the
strike zone. First, balls and called strikes that are horizontally
centered on the plate are chosen, then I look at the fraction of
called balls as a function of the vertical location of the pitch. Here
are the results for right- and left-handed batters:
Now the horizontal axis shows the height of the pitch instead of its
horizontal location. The exact vertical position of the rulebook strike zone
is different for each batter, but, as mentioned above, I’ve mapped
each batter onto a standard vertical strike zone that goes from 1.60
to 3.56 feet.
Now we see that the umpires are calling a smaller vertical strike zone than the
rulebook calls for. For right-handed batters, the upper edge of the
zone is called correctly, while umps are not calling the lowest
strikes. That doesn’t surprise me — whenever I see a replay of
a low pitch from the side of the plate, i.e. so you can judge the
height relative to the batter’s knees, I’m often surprised that what I
thought was a very low pitch is actually about knee-high.
Umps have the same problem calling low strikes for left-handed batters
and they also call fewer high strikes against lefties. One more thing
to note about these plots of the vertical strike zone: there is less
accuracy in calling the vertical strike zone compared to the
horizontal one. This is seen by noting how rounded the blue curves
are for the vertical dimension: the transition from strike to ball is
much less sharp than for the horizontal case. This should not be
surprising since judging the vertical strike zone, where the umpire
has to estimate “the mid-point between the top of the pants and the
top of the shoulder,” is more difficult.
We can summarize the findings on the actual strike zone with the
following plot and table:
Actual vs. Rulebook Strike Zone Dimensions (inches)

            Left    Right   Lower*  Upper   Total Area+
RHB        -12.0    12.1    21.6    42.7    509
LHB        -14.6     9.9    21.2    41.0    485
Rulebook    -9.9     9.9    19.2    42.7    465

* vertical strike zone mapped to average
+ total area in square inches
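The areas in the table are consistent with treating each zone as a plain rectangle, width times height. A short Python check using the table's own numbers (the dictionary layout and function name are just for illustration):

```python
# Edges in inches, copied from the table: (left, right, lower, upper).
ZONES = {
    "RHB": (-12.0, 12.1, 21.6, 42.7),
    "LHB": (-14.6, 9.9, 21.2, 41.0),
    "Rulebook": (-9.9, 9.9, 19.2, 42.7),
}


def zone_area_sq_in(left, right, lower, upper):
    """Area of a rectangular zone in square inches."""
    return (right - left) * (upper - lower)


# Rounded areas, keyed by row label.
areas = {name: round(zone_area_sq_in(*edges)) for name, edges in ZONES.items()}
```

Rounded to the nearest square inch, this reproduces the 509, 485 and 465 in the last column.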
Fly in the ointment?
I’ve already mentioned the fact that the ball fraction for pitches
right down the middle of the plate is not zero; in fact, it’s about
5-6%. Can umpires be missing these easy calls so frequently?
It seems hard to believe. The alternative explanation is that there is
some problem with the data.
I believe we can settle this issue by
studying the data carefully, but a detailed investigation will have to wait for another day.
As a first look, I have watched video of a half-dozen of these pitches that were
measured to be right down the pipe, but were called balls.
What I found was this: a few of these were clearly outside the strike
zone, a couple were borderline, none were right down the middle. And if you doubt
my pitch-calling ability using mlb.tv on my little computer screen,
all I can say is that one pitch whose recorded location was right in the heart of the strike zone
was actually an intentional ball that
was thrown two feet off the plate!
I am currently working on understanding this issue in more depth. Right now it appears that 1) this is an issue with the quality of the data and 2)
it will not materially affect the strike zone measurements I’ve presented here. I hope to have more information on this next time.