Data Erratum Redux

In yesterday’s article, Data Erratum Et Cetera, I noted the difference between the two leagues in BABIP and LD%, and wondered what might have caused it.

A number of readers and commenters mentioned that I overlooked the obvious: pitchers don’t bat in the AL. Doh! That is obvious. So I went back and ran my analysis a little differently.

This time, I only included batters with at least 40 plate appearances in either league (which I probably should have done in the first place). That excludes almost all pitchers at this point in the season, but the batters who remain still account for 93% of all plate appearances in the NL and 96% in the AL.
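For anyone who wants to try the same cut on their own data, here is a minimal Python sketch of the 40 PA filter and the share-of-plate-appearances check. The `BatterLine` record, the function names, and the sample numbers are all made up for illustration; this is not the code or data behind the figures in this article.

```python
from dataclasses import dataclass


@dataclass
class BatterLine:
    """Hypothetical per-batter record; field names are assumptions."""
    name: str
    league: str  # "AL" or "NL"
    pa: int      # plate appearances


def filter_by_min_pa(batters, min_pa=40):
    """Keep only batters with at least `min_pa` plate appearances."""
    return [b for b in batters if b.pa >= min_pa]


def pa_share(batters, league, min_pa=40):
    """Fraction of a league's plate appearances taken by batters who pass the cut."""
    in_league = [b for b in batters if b.league == league]
    total = sum(b.pa for b in in_league)
    kept = sum(b.pa for b in filter_by_min_pa(in_league, min_pa))
    return kept / total if total else 0.0


# Invented sample data, only to show the mechanics.
sample = [
    BatterLine("NL regular", "NL", 150),
    BatterLine("NL pitcher", "NL", 10),
    BatterLine("AL regular", "AL", 160),
    BatterLine("AL bench bat", "AL", 40),
]

nl_share = pa_share(sample, "NL")  # 150 of 160 plate appearances kept
al_share = pa_share(sample, "AL")  # both AL batters reach the 40 PA cut
```

The cut is applied per batter, while the share is measured in plate appearances. That is why dropping a large number of low-PA batters can still leave most of a league's plate appearances in the sample.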

Now, there is only a 10-point difference between the two leagues in the gap between BABIP and LD% (the Diff line below).

LD%    .183
BABIP  .292
Diff   .110

LD%    .176
BABIP  .297
Diff   .120
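The arithmetic behind the Diff lines can be sketched in a few lines of Python. This assumes Diff is BABIP minus LD%, rounded to the nearest ten points, which is consistent with the printed values but is not stated outright. The tables carry no league labels, so the sketch refers to them by position, and it works in whole "points" (thousandths) to avoid floating-point noise.

```python
# Published figures in points (thousandths), copied from the two tables.
# The tables carry no league labels, so they are keyed by position.
tables = {
    "first": {"ld": 183, "babip": 292},
    "second": {"ld": 176, "babip": 297},
}


def gap_points(row, round_to_tens=True):
    """BABIP minus LD% in points, optionally rounded to the nearest ten.

    The rounding step is an assumption made to match the printed Diff lines.
    """
    raw = row["babip"] - row["ld"]
    return round(raw, -1) if round_to_tens else raw


gaps = {name: gap_points(row) for name, row in tables.items()}
league_difference = abs(gaps["first"] - gaps["second"])
```

Run against the three-decimal figures as printed, the unrounded gaps are 109 and 121 points. The underlying unrounded data may differ slightly from that.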

A couple of points:
- Taking out batters with fewer than 40 PA has very little impact on LD% (one point down in the AL, one point up in the NL). That’s a bit surprising, and probably important in some way.
- It has no impact on BABIP in the AL, but brings down BABIP ten points in the NL. That’s the pitcher effect.

The remaining 10-point difference could easily result from a slight difference in fielders or ballparks, as well as sample size issues or sheer luck.
