Americans Defeat Nationals in Pitchers’ Duel
by John Walsh
January 11, 2007
It's almost common knowledge now that the American League is currently playing at a higher level than the National League. Whenever a player changes leagues, commentators remark on the improvement in stats we can expect if the player is moving to the NL (see any commentary on the Barry Zito signing). Of course, players moving to the AL are widely perceived to be headed for a tougher playing environment. In fact, noted sabermetrician Mitchel Lichtman wrote an excellent three-part series on the subject here at the Hardball Times last summer. Mitchel presents a detailed analysis that uses several methods that are sensitive to any difference in the quality of the American and National Leagues.
Mostly we think of the AL as the better hitting league. Part of that is due to the designated hitter rule and the fact that run scoring is typically about a half-run per game higher in the AL. But even beyond the designated hitter, you often hear about how tough AL lineups are from top to bottom: it's not just the designated hitter, the AL simply has more good hitters. Mitchel's study bears that out, but I have done some additional investigation showing that the American League has been better than the National League in pitching and defense in recent years as well.
In the winter of 2005, Scott Hatteberg stood at a crossroads of sorts. About to turn 36 years old, the Oakland A's first baseman/designated hitter had not been offered arbitration by his (former) team and was a free agent. Trouble was, he was coming off his worst year in recent memory: his .256/.334/.343 line was not going to cut it for a first baseman/designated hitter-type. He had lost playing time in his last year in Oakland as well, appearing in 134 games, his lowest yearly total since leaving Boston as a part-time player in 2001.
There was speculation that Hatteberg might just hang up the ol' spikes and seek a position in somebody's front office. But the Reds were in need of a first baseman, and they took a low-risk flier on Hatteberg, offering him a one-year contract for just $750,000. Hatteberg had an unexpectedly fine 2006, producing a line of .289/.389/.436. Hatty was probably thinking "NL, where have you been all my life?"
Antonio Perez also underwent a big change in the 2005 off-season. The young Dodgers infielder had enjoyed a solid season as a utility infielder: playing in 98 games, mostly at second and third base, the (then) 25-year-old put up a very nice line of .297/.360/.398. However, he would not get a chance to break into the Dodgers infield, because he was soon traded to the A's in the Milton Bradley/Andre Ethier deal. The A's infield was much more set, so Perez's playing time in 2006 was cut way down. (Had he been able to play shortstop, he may have gotten more playing time since Bobby Crosby was injured for much of the season.) And when he was in there, Perez did not build on his solid 2005 campaign. Indeed, he hit about as poorly as possible, going .102/.185/.204 in 109 plate appearances. Welcome to the new neighborhood, kid.
Scott Spiezio is our third player who changed teams after the 2005 season. After spending his whole career in the AL, he moved to the Cardinals for 2006. In his two previous seasons, spent with the Mariners, Spiezio came to the plate 466 times and "produced" to the tune of .198/.270/.324. How did he do when playing his age-33 season in the NL? Just .272/.364/.496, that's all.
Measuring the Relative Strength of Pitching
What do these three little stories mean? By themselves, nothing, of course. I'm sure you could name a few players who have hit much better in the AL than in the NL. Actually, here are a few right here: Kevin Mench, Julio Lugo, Mark DeRosa and Jim Thome. There could be any number of reasons for a player to play better in one situation than another; the change of leagues may have nothing to do with it. Maybe Hatteberg's 2005 was just an off year and his bounceback in 2006 was just a return to his established level of play. Maybe Perez never got used to his reduced role and his hitting suffered. Maybe Spiezio was finally healthy, which allowed him to excel. In all these cases, the sample size is far too small to draw any conclusions about the quality of the two leagues.
What if we take all these players, though, all the hitters who have played in both leagues recently and see how they did collectively? Might we be able to coax out of the data some measure of relative league strength? Yes, we can. Read on for the details.
The analysis is basically quite simple—I select players who have played in both leagues in the period from 2004 to 2006 and I compare their collective hitting performance in the two leagues. If the group hit better in the NL than in the AL, we can infer that the pitching in the National League is not as good as it is in the American League. That's the basic outline — it's very simple, right?
There are a couple of additional details you should know. One simple one is that I require the players in my sample to have at least 100 plate appearances in each league in the three-year period. One hundred plate appearances is not many, but I wanted to include players that may have switched leagues at the trading deadline. Another thing to note is a key assumption: this method assumes that the average park in the NL does not favor hitters over the average AL park, and vice versa. Standard park factors compare parks within a league—we have very little information on how AL parks compare to NL parks, simply because of the low number of interleague games. My intuitive feeling is that the assumption is a reasonable one, but it should be kept in mind, nonetheless.
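The selection step might be sketched like this (the players and numbers here are made up for illustration; this is not the author's actual code or data):

```python
# Sketch of the player-selection step, using made-up data.
# Each record: (player, league, PA, OPS) aggregated over 2004-2006.
records = [
    ("Player A", "AL", 450, 0.780), ("Player A", "NL", 320, 0.810),
    ("Player B", "AL", 90,  0.700), ("Player B", "NL", 500, 0.750),
    ("Player C", "AL", 150, 0.690), ("Player C", "NL", 140, 0.720),
]

MIN_PA = 100  # at least 100 PA in each league over the three-year span

by_player = {}
for player, league, pa, ops in records:
    by_player.setdefault(player, {})[league] = (pa, ops)

# Keep only players who cleared the PA threshold in BOTH leagues.
two_league = {
    p: d for p, d in by_player.items()
    if all(lg in d and d[lg][0] >= MIN_PA for lg in ("AL", "NL"))
}
print(sorted(two_league))  # Player B misses the AL cutoff
```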
The first thing that surprised me when starting to look at the data was the sheer number of players who have played in both leagues. For example, between 2004 and 2006 the number of players with at least 100 plate appearances in both leagues is 120. Whoa! That's more than I expected. That's the highest number for any three-year period going back to 1900, but there were plenty of guys switching leagues going back to around 1960.
The graph on the right shows the number of two-league players since 1900, with each point representing a three-year period. Aside from the mixing of players due to the formation of the American League in 1901, there weren't many player exchanges between leagues until the advent of expansion in the 1960s. (I have not looked at mixing with the Federal League, which is why this graph shows no big spike in the 1914-1915 period.) From the '60s on the number of two-league players has grown fairly steadily, especially after the advent of free agency in the mid-70s, and nowadays we generally have around 100 players who have gotten at least 100 plate appearances in both leagues in a three-year period.
AL vs. NL
So, how did our pool of hitters do in each league? I have determined each player's OPS in each league without any corrections at all. No park adjustments, no corrections for aging, no league offensive context normalization—none of that stuff. There are three reasons for not adjusting for these effects: 1) each adjustment introduces some uncertainty of its own, 2) with a large number of players and plate appearances (as we'll see in a minute) things like park factors and aging effects will tend to cancel out, and 3) I wanted to keep the analysis as simple and comprehensible as possible.
The results are most easily visualized by the graphic on the right, where each point represents a single two-league player. The vertical position of the point shows the player's OPS in the NL, while the horizontal position is his AL OPS. The red dotted line corresponds to equal OPS's in both leagues. It's evident that more than half the points are above the line, meaning that the majority of batters did better in the NL. In fact 74 players lie above the line and 45 lie below the line. Mathematically-inclined readers will wonder about the one remaining player: Troy Glaus had exactly the same OPS (.885) in both leagues, so his point is right on the line.
The Summing Up
Since 62% of the players hit better in the NL, it would appear that NL defenses (pitching plus fielding) are inferior to their AL counterparts. To quantify this difference, I calculated the difference (NL OPS) minus (AL OPS) for each player. I then calculated the average difference for the whole sample, combining the individual differences with weighting appropriate to the number of plate appearances for each player (see the Resources section below for details). This method not only yields the average OPS difference, but also the (one standard deviation) uncertainty on that number—so we know how much faith to put in the calculation.
The result for the period 2004-2006 is that two-league players hit better in the National League by .029 points of OPS. The result is significant: the uncertainty (one standard deviation) is .008. In other words, the probability that the true level of pitching/defense in both leagues is actually equal in this period is 0.000015, i.e. pretty darn small. So we can say with a high degree of certainty (keeping in mind, though, the assumption about AL/NL park effects) that the AL had significantly stronger pitching/defense than the NL during the last three seasons.
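As a sanity check (a sketch, not the author's actual calculation), the normal approximation can turn a difference and its uncertainty into a probability. Note that the rounded values .029 and .008 give a somewhat larger p-value than the figure quoted in the text, which was presumably computed from unrounded inputs:

```python
import math

# Normal-approximation check of the quoted result: a .029 OPS gap
# with a one-standard-deviation uncertainty of .008.
diff, sd = 0.029, 0.008
z = diff / sd  # about 3.6 standard deviations from zero

# One-tailed probability of seeing a gap this large
# if the leagues were really equal.
p = 0.5 * math.erfc(z / math.sqrt(2))
print(f"z = {z:.2f}, p = {p:.5f}")
```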
How Long (Has This Been Going On)?
Since the number of players switching leagues is fairly substantial going back almost 50 years, it's possible to use this method to evaluate the relative strength of pitching/defense in the two leagues going back to 1960. Keep in mind, though, that the number of players in each three-year sample is decreasing as we go back in time, so the statistical reliability of the results will decrease as well. The graphic on the right shows the quantity (NL OPS) minus (AL OPS) for two-league players in each three-year period going back to 1960.
The dashed red line is "zero", where the two leagues are of equal strength. When the points lie above the red line, it means hitters performed better in the NL, hence the AL pitching/defense was stronger. The light-blue shaded band represents the one-standard-deviation uncertainty on the data points. As advertised, the thickness of this band increases as we go back towards 1960.
The graph shows that the AL has enjoyed an advantage in pitching/defense for the last 10-15 years. The line is bouncing around a bit, but it appears that the leagues had comparable pitching/defense from the mid-70s until the early-90s. Before that, going back to 1960, the NL seems to have been superior, although the uncertainties are getting pretty large. Still, the NL superiority in the 60s and 70s overlaps well with their stellar record in All-Star Games. The National League actually went 19-1 in All-Star Games in the period from 1963 to 1982 (and back then it really did count, at least more than it does now).
For those of you who want to see the numbers, here's a table of the data that were used to make the plot (remember, "Year" represents a three-year period centered on the listed year):

 Year   NL-AL OPS    1 SD
 ----   ---------    ----
 2005      0.029    0.008
 2002      0.011    0.009
 1999      0.021    0.009
 1996      0.019    0.010
 1993      0.024    0.012
 1990      0.004    0.012
 1987      0.007    0.013
 1984      0.001    0.014
 1981      0.013    0.013
 1978     -0.019    0.016
 1975      0.011    0.017
 1972     -0.005    0.015
 1969     -0.010    0.018
 1966     -0.024    0.020
 1963     -0.045    0.019
 1960     -0.020    0.018
How big is the .029 advantage in OPS that we found for 2004-2006? Well, at the team level, an increase of .029 in OPS would correspond to about 60 more runs scored over the course of the season. That would increase the winning percentage of an average team from .500 to about .540 (or six wins).
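The runs-to-wins step above can be sketched with the common rule of thumb of roughly ten runs per win (an assumption for illustration; the article doesn't state which conversion it used):

```python
# Rough sketch of the runs-to-wins conversion.
extra_runs = 60      # the article's estimate for a .029 OPS gain
runs_per_win = 10    # common rule of thumb (assumed, not from the article)

extra_wins = extra_runs / runs_per_win
new_pct = 0.500 + extra_wins / 162   # over a 162-game season

print(f"{extra_wins:.0f} wins, {new_pct:.3f} winning percentage")
```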
Of course, pitching/defense is only half of the story. This method could also be used to evaluate the relative strength of hitting in the two leagues by looking at pitchers who have pitched in both leagues. There is a complication due to the designated hitter in the American League—we expect pitchers to fare worse in the AL, even if the leagues have similar offensive quality (excluding the designated hitter). One way to take this into account might be to exclude pitchers and designated hitters from the analysis. Note that once we address the issue of offense, the assumption about park factors will become irrelevant, since the sum of offense plus defense will be largely independent of such effects.
I'm hoping that will be the subject of another article in the near future.
References and Resources
- Many thanks to THT colleague David Gassko for discussions on the methodology.
- The data for this analysis were obtained using the 2006 Lahman Database.
- The uncertainty (one standard deviation) on OPS is estimated with the following formula, which I derived using a simulation:

err(OPS) = 1.4/sqrt(PA)

- The uncertainty on the difference between AL and NL OPS was calculated using standard propagation of uncertainty.
- The average OPS difference of all players in each three-year period was calculated using the following formula:

aveOPSDiff = Sum(w_i * OPSDiff_i) / Sum(w_i)

where OPSDiff_i is the OPS difference of a single player and w_i = 1/(err_OPSDiff_i)^2 is the weight for that player. The uncertainty on the average OPS difference is given by err_aveOPSDiff = 1/sqrt(Sum(w_i)).
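The formulas above can be sketched in a few lines of Python on synthetic data (the player numbers below are made up for illustration, not the actual 2004-2006 sample):

```python
import math

# Sketch of the weighted-average calculation on synthetic data.
# For each player: (PA_al, OPS_al, PA_nl, OPS_nl).
players = [
    (400, 0.750, 350, 0.790),
    (120, 0.680, 500, 0.700),
    (300, 0.820, 250, 0.800),
]

def err_ops(pa):
    # The simulation-derived estimate: err(OPS) = 1.4 / sqrt(PA).
    return 1.4 / math.sqrt(pa)

num = den = 0.0
for pa_al, ops_al, pa_nl, ops_nl in players:
    diff = ops_nl - ops_al
    # Standard propagation: errors on the two OPS values add in quadrature.
    err_diff = math.hypot(err_ops(pa_al), err_ops(pa_nl))
    w = 1.0 / err_diff**2        # inverse-variance weight
    num += w * diff
    den += w

ave_diff = num / den             # weighted average NL-AL OPS difference
err_ave = 1.0 / math.sqrt(den)   # its one-standard-deviation uncertainty
print(f"{ave_diff:+.3f} +/- {err_ave:.3f}")
```

Weighting each player by the inverse of his error squared means that players with many plate appearances in both leagues dominate the average, while 100-PA part-timers contribute little.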
John Walsh dabbles in baseball analysis in his spare time. He welcomes questions and comments via e-mail.