Data Erratum Redux

In yesterday’s article, Data Erratum Et Cetera, I noted the difference between the two leagues in BABIP and LD%, and wondered what might have caused it.

A number of readers and commentators mentioned that I overlooked the obvious — pitchers don’t bat in the AL. Doh! That is obvious. So I went back and ran my analysis a little differently.

This time, I only included batters with at least 40 plate appearances in either league (which I probably should have done in the first place). That excludes almost all pitchers at this point in the season, but still covers 93% of all plate appearances in the NL and 96% in the AL.
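For anyone who wants to try the same cut, here is a minimal sketch of the filter and the coverage check. The rows are made up purely for illustration (they are not real player data), and the names `batters`, `MIN_PA` and `pa_coverage` are my own assumptions, not anything from the original analysis.

```python
# Hypothetical rows: (player, league, plate_appearances). NOT real data.
batters = [
    ("Batter A", "NL", 120),
    ("Batter B", "NL", 95),
    ("Pitcher C", "NL", 12),
    ("Batter D", "AL", 110),
    ("Batter E", "AL", 3),
]

MIN_PA = 40  # the cutoff used in the article


def pa_coverage(rows, league, min_pa=MIN_PA):
    """Share of a league's plate appearances that survive the cutoff."""
    total = sum(pa for _, lg, pa in rows if lg == league)
    kept = sum(pa for _, lg, pa in rows if lg == league and pa >= min_pa)
    return kept / total


for league in ("NL", "AL"):
    print(league, f"{pa_coverage(batters, league):.0%}")
```

With real data, the two printed shares would correspond to the 93% and 96% figures quoted above.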

Now, the gap between BABIP and LD% differs by only 10 points between the two leagues.

NL:
LD%    .183
BABIP  .292
Diff   .110

AL:
LD%    .176
BABIP  .297
Diff   .120
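The Diff rows are simply BABIP minus LD%. As a quick check, the sketch below redoes the subtraction from the rounded figures in the two tables, working in points (thousandths) to avoid floating-point noise. The variable and function names are mine. Note that the rounded inputs give .109 and .121 rather than the listed .110 and .120, presumably because the listed values were computed before rounding (that is an assumption on my part).

```python
# Rounded figures from the tables above, expressed in points (thousandths).
leagues = {
    "NL": {"ld": 183, "babip": 292},
    "AL": {"ld": 176, "babip": 297},
}


def babip_minus_ld(stats):
    """Gap between BABIP and line-drive rate, in points."""
    return stats["babip"] - stats["ld"]


diffs = {name: babip_minus_ld(s) for name, s in leagues.items()}
gap = diffs["AL"] - diffs["NL"]

for name, d in diffs.items():
    print(f"{name}: .{d:03d}")
print(f"AL minus NL: {gap} points")
```

Either way, the league-to-league difference lands in the neighborhood of 10 points.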

A couple of points:
– Taking out batters with fewer than 40 PAs has very little impact on LD% (one point down in the AL, one point up in the NL). That’s a bit surprising, and probably important in some way.
– It has no impact on BABIP in the AL, but brings down BABIP ten points in the NL. That’s the pitcher effect.

The remaining 10-point difference could easily result from a slight difference in fielders or ballparks, as well as sample size issues or sheer luck.


Dave Studeman was called a "national treasure" by Rob Neyer. Seriously. Follow his sporadic tweets @dastudes.