# Squaring it up

Patience is bitter, but its fruit is sweet.

– Jean-Jacques Rousseau

Usually, when we think about plate discipline, we focus on the
ability to draw walks. We look at things like walk rate or
strikeout-to-walk ratio, and we remind ourselves that walks are very
valuable, that batting average, which ignores walks, is a flawed
statistic, etc., etc., etc.

But another aspect of plate discipline isn’t talked
about so much—not only do you walk more often when you “lay
off the bad ones,” but balls in play turn out much better if they were
initially headed toward the strike zone.

That makes intuitive sense, but can we quantify
how much better it is to swing
at strikes than at balls? This is not as easy as it sounds,
because when a batter swings at a pitch, the umpire makes no call, so
we generally don’t know if it would have been a strike or a ball.
However, the pitch-f/x data now gives us precise pitch locations, so we
can tell whether any pitch was in the strike zone and begin to see how
much advantage there is in swinging at pitches within it.

Speaking of the strike zone, as I showed in a previous article
([The Eye of the Umpire](http://www.hardballtimes.com/main/article/the-eye-of-the-umpire)),
the actual strike zone
called by major league umpires differs significantly from the strike
zone as defined in the rulebook. For this study, I’m going to use the
real strike zone, as I measured it, and not the fictitious rulebook
zone. This is crucial: The actual strike zone
is the one implicitly understood by umpires, batters and
pitchers. It’s the relevant strike zone for an analysis like this one.
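To make "in the strike zone" concrete, here is a minimal Python sketch of the classification step. The rectangle bounds are placeholder values chosen only for illustration; they are not the measured zone from The Eye of the Umpire. The names `Zone`, `px` and `pz` are hypothetical labels for the zone and for the horizontal and vertical pitch location in feet. A single fixed rectangle is also a simplification, since a real zone varies with the batter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Rectangular strike zone at the front of the plate, in feet."""
    left: float    # horizontal bounds, measured from the center of the plate
    right: float
    bottom: float  # vertical bounds, measured from the ground
    top: float

    def contains(self, px: float, pz: float) -> bool:
        """True if the pitch location (px, pz) falls inside the rectangle."""
        return self.left <= px <= self.right and self.bottom <= pz <= self.top


# Placeholder bounds for illustration only (an assumption, not measured values).
EXAMPLE_ZONE = Zone(left=-1.0, right=1.0, bottom=1.6, top=3.5)

print(EXAMPLE_ZONE.contains(0.2, 2.5))  # pitch over the middle of the plate
print(EXAMPLE_ZONE.contains(1.4, 2.5))  # pitch well off the plate
```

Every swing in the tables that follow is tagged with a yes/no flag of this kind.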

And one quick caveat, before we start digging into the good
stuff: As I wrote in the above article, some fraction of
pitches, perhaps 5-10%, are mistracked by the pitch-f/x system and
the location data for these are not dependable. Since we do not yet
have a way to know which pitches are affected, they are necessarily
included in our data sample. This means there is some small degree of
uncertainty in the numbers I’m going to present below, but the general
conclusions will still hold. OK, let’s get going.

###### Making contact

The first thing I looked at was swing-throughs: how often a batter
swings and misses when the ball is in the strike zone or not. The
following table, which is based on roughly 43,000 pitches that batters
swung at, shows how often a pitch was missed, hit into foul territory
or put in play:

```Swing outcomes
+--------------+--------+--------+-------+--------+
| inStrikeZone | swings | missed | foul  | inPlay |
+--------------+--------+--------+-------+--------+
| No           |  14053 |  0.327 | 0.356 |  0.317 |
| Yes          |  28744 |  0.127 | 0.389 |  0.484 |
+--------------+--------+--------+-------+--------+
```

So, a batter’s chances of swinging through a pitch are about two and a
half times higher when the pitch is a ball (33% compared to 13%). I
guess I expected to see a large difference, but I didn’t know it would
be this large. Batters foul off a few more pitches that are in the
strike zone, but obviously the big benefit is all those extra balls in
play: 48% of swings on pitches in the zone, only 32% on balls outside
the zone.
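For anyone who wants to check the arithmetic, the ratios quoted above can be recomputed from the table. This is just a sketch in Python; the dictionary layout and variable names are my own, and the numbers are copied from the table.

```python
# Swing counts and outcome fractions copied from the swing-outcomes table.
swing_outcomes = {
    "out_of_zone": {"swings": 14053, "missed": 0.327, "foul": 0.356, "in_play": 0.317},
    "in_zone": {"swings": 28744, "missed": 0.127, "foul": 0.389, "in_play": 0.484},
}

# Each row's three outcome fractions should account for every swing.
for row in swing_outcomes.values():
    assert abs(row["missed"] + row["foul"] + row["in_play"] - 1.0) < 1e-9

total_swings = sum(row["swings"] for row in swing_outcomes.values())
miss_ratio = swing_outcomes["out_of_zone"]["missed"] / swing_outcomes["in_zone"]["missed"]
in_play_ratio = swing_outcomes["in_zone"]["in_play"] / swing_outcomes["out_of_zone"]["in_play"]

print(f"total swings: {total_swings}")
print(f"miss rate, out of zone vs. in zone: {miss_ratio:.2f}x")
print(f"in-play rate, in zone vs. out of zone: {in_play_ratio:.2f}x")
```

The miss-rate ratio comes out a little over 2.5, and a swing at a strike is about one and a half times as likely to put the ball in play.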

That’s the general trend, but it’s more fun to look at specific
batters, isn’t it? First let’s
look at the batters who swung most often at pitches out-of-zone:

```Batters who swing at many bad pitches
+------------------+--------+-----------+--------------+------------+
| batter           | swings | ball_frac | missedStrike | missedBall |
+------------------+--------+-----------+--------------+------------+
| Anderson_Garret  |    158 |     0.544 |        0.056 |      0.314 |
| Crawford_Carl    |    133 |     0.489 |        0.191 |      0.308 |
| Diaz_Matt        |    180 |     0.478 |        0.128 |      0.279 |
| Phillips_Brandon |    104 |     0.471 |        0.127 |      0.306 |
| Thorman_Scott    |    214 |     0.463 |        0.157 |      0.364 |
| Erstad_Darin     |    148 |     0.459 |        0.063 |      0.206 |
| Kemp_Matt        |    113 |     0.451 |        0.242 |      0.451 |
| Francoeur_Jeff   |    325 |     0.443 |        0.193 |      0.264 |
| Suzuki_Ichiro    |    381 |     0.428 |        0.046 |      0.153 |
| Soriano_Alfonso  |    166 |     0.428 |        0.189 |      0.282 |
+------------------+--------+-----------+--------------+------------+
Legend:
ball_frac = fraction of swings at pitches outside the strike zone
missedStrike = fraction of swings that missed, for pitches in the strike zone
missedBall = same for pitches outside the strike zone
```

I don’t see any real surprises here; most of these guys are known
free swingers. It’s interesting that many make contact on
balls better than the average major leaguer, which somewhat
compensates for their free-swinging ways. Indeed, it probably partially
explains why they swing so often. This is especially true for
Ichiro, whose contact rate on balls out of the strike zone very nearly
matches the MLB average on balls in the zone. And when the ball is in
the strike zone? Well, Ichiro swings through only one in 20 of
those. The best contact hitters in my sample are Juan Pierre on
balls in the strike zone (3.5% missed) and Brian Roberts on balls
outside the zone (6.6% missed).
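The per-batter columns are simple ratios of counts. Here is a rough Python sketch of how they could be tallied, assuming each swing is available as a `(batter, in_zone, outcome)` record. That record layout, the function name and the sample data are all hypothetical, not the format of my database.

```python
from collections import defaultdict


def batter_swing_profile(swings):
    """Tally ball_frac, missedStrike and missedBall for each batter.

    swings: iterable of (batter, in_zone, outcome) records, where outcome
    is one of 'missed', 'foul' or 'in_play' (a hypothetical layout).
    """
    counts = defaultdict(lambda: {"zone": 0, "zone_missed": 0, "ball": 0, "ball_missed": 0})
    for batter, in_zone, outcome in swings:
        key = "zone" if in_zone else "ball"
        counts[batter][key] += 1
        if outcome == "missed":
            counts[batter][key + "_missed"] += 1

    profile = {}
    for batter, c in counts.items():
        total = c["zone"] + c["ball"]
        profile[batter] = {
            "swings": total,
            "ball_frac": c["ball"] / total,
            # None when the batter has no swings of that kind
            "missedStrike": c["zone_missed"] / c["zone"] if c["zone"] else None,
            "missedBall": c["ball_missed"] / c["ball"] if c["ball"] else None,
        }
    return profile


# Made-up records, just to exercise the function.
sample = [
    ("Batter_A", True, "in_play"),
    ("Batter_A", True, "missed"),
    ("Batter_A", False, "foul"),
    ("Batter_A", False, "missed"),
    ("Batter_B", True, "foul"),
]
profiles = batter_swing_profile(sample)
print(profiles["Batter_A"])
```

Sorting the resulting profiles by `ball_frac`, after applying a minimum number of swings, would give lists like the ones in this section.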

And now let’s look at the guys who tend to get a good pitch to
hit:

```Batters who swing at few bad pitches
+-----------------+--------+-----------+--------------+------------+
| batter          | swings | ball_frac | missedStrike | missedBall |
+-----------------+--------+-----------+--------------+------------+
| DeRosa_Mark     |    109 |     0.174 |        0.133 |      0.316 |
| Vazquez_Ramon   |    125 |     0.200 |        0.090 |      0.560 |
| Glaus_Troy      |    227 |     0.203 |        0.182 |      0.370 |
| Iguchi_Tadahito |    310 |     0.210 |        0.118 |      0.308 |
| Giles_Brian     |    190 |     0.211 |        0.053 |      0.250 |
| Lofton_Kenny    |    282 |     0.216 |        0.059 |      0.213 |
| Cust_Jack       |    202 |     0.223 |        0.318 |      0.600 |
| Cameron_Mike    |    336 |     0.229 |        0.143 |      0.584 |
| Quinlan_Robb    |    113 |     0.230 |        0.138 |      0.308 |
| Johnson_Dan     |    180 |     0.233 |        0.087 |      0.357 |
+-----------------+--------+-----------+--------------+------------+
```

These names were more surprising to me. I was expecting to see some of
the walks leaders here: Swisher, Hafner, Ortiz, those guys (see below
for a word on Barry Bonds). There is a wide range of bat control
represented here, from good (Lofton, Giles) to average (DeRosa,
Iguchi) to terrible (Cust). Cameron and Vazquez are OK on balls in
the strike zone, but whiff mightily on balls outside the zone,
although small sample size is an issue for some of these batters.

This is a good time to remind you that the pitch-f/x data is not
complete: Many players are excluded from this analysis because the
system is not installed in their home parks. For example, I have only
39 Barry Bonds swings in my database, so he doesn’t make any of these
lists. So, these lists should really be considered not the best,
but some of the best.

###### Squaring up the strikes

But what about when the batter manages to put the ball in play? Once
he does that, does it matter whether the ball was inside or outside
the strike zone? How many times have we seen a guy like Vlad Guerrero
drive a ball that is six inches off the plate into an outfield gap, or
indeed over the fence? Should Vlad lay off those pitches?

Well, yes, he should. Here’s what happened on balls put into play for
pitches that were in the strike zone and pitches that were not:

```Outcome of balls put in play
+--------------+--------+------+------+------+------+------+------+
| inStrikeZone | swings |  H   |  2B  |  3B  |   HR | ROE  | Outs |
+--------------+--------+------+------+------+------+------+------+
| No           |   4454 | 1236 |  208 |   24 |   88 |   71 | 3147 |
| Yes          |  13922 | 4530 |  905 |   79 |  512 |  162 | 9230 |
+--------------+--------+------+------+------+------+------+------+
swings - swings that put the ball in play
ROE - reached on error
```

and here are the rate stats:

```Rate stats for balls put in play
+--------------+--------+-------+-------+-------+--------------+
| inStrikeZone | swings |   AVG |   SLG | BABIP | HR_per_swing |
+--------------+--------+-------+-------+-------+--------------+
| No           |   4454 | 0.278 | 0.394 | 0.263 |        0.020 |
| Yes          |  13922 | 0.325 | 0.512 | 0.300 |        0.037 |
+--------------+--------+-------+-------+-------+--------------+
BABIP - (H-HR)/(swings-HR)
```

Almost 50 points of batting average and more than 100 points of
slugging: that’s what swinging at strikes will get you. Home runs, in
particular, are nearly twice as likely per ball in play if the pitch is
in the strike zone (3.7% compared to 2.0%). This is a huge advantage,
even greater than I thought it would be. There’s no doubt that swinging
at strikes gives the batter a better chance to get good wood on the
ball, or “square it up,” as modern parlance has it.
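The rate stats follow directly from the counts in the first table. As a check, here is a short Python sketch (the function and argument names are mine) that recomputes them. Two things to note: the "swings" column in these two tables counts balls put in play, and the published BABIP values are reproduced when home runs are removed from the denominator as well as the numerator.

```python
def rate_stats(balls_in_play, hits, doubles, triples, home_runs):
    """Rate stats on balls in play, from the counts in the outcome table."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return {
        "AVG": hits / balls_in_play,
        "SLG": total_bases / balls_in_play,
        # Home runs are excluded from both the hits and the opportunities.
        "BABIP": (hits - home_runs) / (balls_in_play - home_runs),
        "HR_rate": home_runs / balls_in_play,
    }


out_of_zone = rate_stats(4454, 1236, 208, 24, 88)
in_zone = rate_stats(13922, 4530, 905, 79, 512)

for label, stats in (("No", out_of_zone), ("Yes", in_zone)):
    print(label, " ".join(f"{name}={value:.3f}" for name, value in stats.items()))
```

Rounded to three places, the output matches both rows of the rate-stats table.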

The above results for balls in play are for the average batter in my
sample, but, as they say, the average human being has one testicle and
one breast. Vlad Guerrero is by no means an average player, so should any
of this pertain to him? Well, let’s drill down to the batter level:

```Vlad Guerrero
+--------------+--------+------+------+-------+-------+
| inStrikeZone | swings | H    | HR   | AVG   | SLG   |
+--------------+--------+------+------+-------+-------+
| No           |     54 |   14 |    2 | 0.259 | 0.426 |
| Yes          |    102 |   41 |    4 | 0.402 | 0.637 |
+--------------+--------+------+------+-------+-------+
```

Actually, Guerrero benefits from swinging at strikes more than the
average batter. Go figure. But hang on a second. If you are now shaking
your head and murmuring “small sample size,” then give yourself a pat
on the back. Fifty-four swings on pitches outside the strike zone are
not enough to draw any conclusion. (Actually, the 102 swings within
the strike zone are on the small side, too.) Small sample size is
going to be a problem with any particular batter, simply because we
haven’t yet accumulated enough pitch data.
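One rough way to put a number on "small sample size" is the binomial standard error of a batting average. The Python sketch below is an approximation that I am adding for illustration, not part of the original analysis: it treats every ball in play as an independent trial with the same chance of a hit, which is only roughly true.

```python
import math


def avg_with_standard_error(hits, balls_in_play):
    """Batting average and its binomial standard error."""
    p = hits / balls_in_play
    se = math.sqrt(p * (1 - p) / balls_in_play)
    return p, se


# Guerrero's hits and balls in play, from the table above.
out_avg, out_se = avg_with_standard_error(14, 54)
in_avg, in_se = avg_with_standard_error(41, 102)

print(f"out of zone: {out_avg:.3f} +/- {out_se:.3f}")
print(f"in zone:     {in_avg:.3f} +/- {in_se:.3f}")
```

The standard errors come out at roughly 60 points of batting average on the out-of-zone sample and roughly 50 points on the in-zone sample, which are large compared to the league-wide gap of about 50 points.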

Here’s an example:

```Richie Sexson
+--------------+--------+------+------+-------+-------+
| inStrikeZone | swings | H    | HR   | AVG   | SLG   |
+--------------+--------+------+------+-------+-------+
| No           |     31 |   10 |    4 | 0.323 | 0.806 |
| Yes          |    101 |   23 |    7 | 0.228 | 0.455 |
+--------------+--------+------+------+-------+-------+
```

Do you really believe Richie Sexson gains 100 points of average and
350 points of slugging when he goes outside the strike zone? Me
neither. We just don’t have enough data at the individual batter
level, not yet. There may be batters who can handle balls out of the
zone quite well (and maybe Sexson, with his long arms, is one of
them), thereby justifying a very aggressive approach. But we’ll have
to collect more pitch data before we can say much about that.
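To see how much weaker the individual evidence is than the league-wide evidence, one option is a two-proportion z-score on batting average. This Python sketch uses the pooled normal approximation, a textbook shortcut I am assuming here for illustration. It is shaky for a sample as small as Sexson's 31 out-of-zone balls in play, so read the output as a rough guide only.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z-score for the difference between two hit rates (pooled normal approximation)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Arguments: (hits, balls in play) outside the zone, then inside the zone.
sexson_z = two_proportion_z(10, 31, 23, 101)
league_z = two_proportion_z(1236, 4454, 4530, 13922)

print(f"Sexson: {sexson_z:+.1f}   league: {league_z:+.1f}")
```

Sexson's apparent out-of-zone advantage is only about one standard error from zero, while the league-wide in-zone advantage is roughly six standard errors from zero.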

###### Final Thoughts

Strike zone judgment is not just a matter of laying off the bad ones
to get more bases on balls. That is an important aspect, of
course, but the better outcomes on balls in play on pitches that are
in the strike zone, compared to pitches that are outside the zone, may
be just as important.

The batter who swings at more strikes and fewer balls is helped in two
ways: 1) he puts more of those balls in play and 2) the balls in play
go for hits and extra-base hits more often. (These benefits are in
addition to the better counts the batter gets by not swinging at
balls.)
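The two effects multiply, which a little arithmetic makes clear. The Python sketch below combines the league-wide in-play rates from the swing-outcomes table with the AVG and SLG from the rate-stats table to get hits and total bases per swing. The names are mine, and the calculation ignores fouls and count effects entirely.

```python
# League-wide rates copied from the swing-outcome and rate-stat tables.
RATES = {
    "out_of_zone": {"in_play_per_swing": 0.317, "avg": 0.278, "slg": 0.394},
    "in_zone": {"in_play_per_swing": 0.484, "avg": 0.325, "slg": 0.512},
}


def per_swing_value(rates):
    """Expected hits and total bases per swing (fouls and misses count as zero)."""
    return {
        "hits_per_swing": rates["in_play_per_swing"] * rates["avg"],
        "bases_per_swing": rates["in_play_per_swing"] * rates["slg"],
    }


per_swing = {zone: per_swing_value(rates) for zone, rates in RATES.items()}
hits_ratio = per_swing["in_zone"]["hits_per_swing"] / per_swing["out_of_zone"]["hits_per_swing"]
bases_ratio = per_swing["in_zone"]["bases_per_swing"] / per_swing["out_of_zone"]["bases_per_swing"]

print(f"hits per swing, in zone vs. out of zone: {hits_ratio:.2f}x")
print(f"total bases per swing, in zone vs. out of zone: {bases_ratio:.2f}x")
```

On these numbers, a swing at a strike produces nearly 1.8 times as many hits and about twice as many total bases as a swing at a ball.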

Ted Williams, one of the most disciplined hitters ever, thought the
single most important aspect of hitting was “get[ting] a good pitch to
hit.” I like the positive way he phrased his point; he didn’t say
“don’t swing at the bad ones.” The emphasis was not on being passive,
but rather swinging hard at a pitch you could handle, a pitch in what
he called his “happy zone.” Williams’ “happy zone” actually referred to
smaller areas within the strike zone (see picture). But an obvious
first step toward “getting a good pitch to hit” is to swing at
strikes and not balls.

One last thought: It’s interesting to consider what these results mean
for DIPS theory, which holds that pitchers have little control over
how frequently balls in play go for hits. We saw above that BABIP for
pitches out of the strike zone is .263, compared to .300 for pitches
in the zone. If some pitchers were particularly skillful at getting
batters to swing at balls outside the strike zone, they could show
lower-than-expected BABIP.
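To get a feel for the size of that effect, here is a back-of-the-envelope Python sketch. It assumes a pitcher's BABIP is simply a weighted blend of the two league-wide values, with the weight being the share of his non-home-run balls in play that come on pitches outside the zone. The 35% pitcher is hypothetical, and the blend ignores everything else that differs between pitchers.

```python
BABIP_OUT_OF_ZONE = 0.263
BABIP_IN_ZONE = 0.300


def expected_babip(out_of_zone_share):
    """Blend the two league BABIP values by the out-of-zone share of balls in play."""
    if not 0.0 <= out_of_zone_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return out_of_zone_share * BABIP_OUT_OF_ZONE + (1 - out_of_zone_share) * BABIP_IN_ZONE


# League share of non-home-run balls in play that came on out-of-zone pitches,
# from the counts in the balls-in-play table.
league_share = (4454 - 88) / ((4454 - 88) + (13922 - 512))

print(f"league share: {league_share:.3f}, BABIP: {expected_babip(league_share):.3f}")
print(f"hypothetical 35% pitcher: {expected_babip(0.35):.3f}")
```

Under this simple blend, moving the out-of-zone share from the league's roughly 25% up to 35% lowers expected BABIP by only about four points, so the effect would be real but modest.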
