Tuesday, January 13, 2009
BABIP’s relationship to hitters
Posted by Paul Singman at 1:20am
Part of almost everybody's method of hitter evaluation includes a look at the player's BABIP. "Were they lucky or unlucky?" is the question typically being asked. Is the BABIP above or below the league average, the player's career BABIP, or some expected BABIP?
That's good and all, but today I want to discuss a new idea to consider when looking at BABIP. Here's my hypothesis:
Hypothesis
Some hitters are more reliant on their BABIP than others. Remembering that BABIP stands for Batting Average on Balls In Play reminds us that BABIP only accounts for balls put into play: balls between the foul lines and in front of the fence. Some hitters, like Adam Dunn, rarely put the ball in play and therefore are not as reliant on their BABIPs as hitters who make their living punching the ball in play, such as Juan Pierre.
As you would expect, the Adam Dunns of baseball have lower values in the denominator of their BABIP equation because, well, the bottom half of the BABIP ratio is balls in play, and we established above that these hitters do not hit many of them. Lower values in the denominator mean more fluctuation; think about it.
Let's say two hitters both have 100 hits and get their 101st hit in the same game (none of the hits are home runs). Hitter A is primarily a ball-in-play hitter and has 800 balls in play, while Hitter B is more of a true outcome hitter (because his plate appearances often result in what are known as the three true outcomes: walks, strikeouts, and home runs) and has only 400 balls in play. If you calculate the effect the added hit has on each player's BABIP (one more hit and one more ball in play), you will find that it has a larger effect on Hitter B's BABIP: about .0008 larger, to be more specific.
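The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the simplified BABIP of hits divided by balls in play (no home runs or sacrifice flies, as in the example), with the extra hit adding one to both the numerator and the denominator.

```python
def babip(hits_in_play, balls_in_play):
    """Simplified BABIP: hits on balls in play divided by balls in play."""
    return hits_in_play / balls_in_play


def effect_of_one_more_hit(hits_in_play, balls_in_play):
    """Change in BABIP when one more ball in play falls for a hit."""
    before = babip(hits_in_play, balls_in_play)
    after = babip(hits_in_play + 1, balls_in_play + 1)
    return after - before


hitter_a = effect_of_one_more_hit(100, 800)  # ball-in-play hitter
hitter_b = effect_of_one_more_hit(100, 400)  # true outcome hitter

print(f"Hitter A gains {hitter_a:.5f}")
print(f"Hitter B gains {hitter_b:.5f}")
print(f"Gap between them: {hitter_b - hitter_a:.5f}")
```

Hitter A's BABIP moves from .125 to roughly .1261 and Hitter B's from .250 to roughly .2519, so the gap between the two effects comes to about .00078.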
In this example the difference in BABIP appears negligible, but with real numbers perhaps a meaningful difference will emerge. Let's see if this theory (and remember, it is still just a theory) holds true when tested by the numbers.
Methodology
With thanks to Derek, I was able to calculate the three true outcome percentage (3TO%) of all hitters in the major leagues who reached 250 plate appearances in 2006, 2007, and 2008. Taking the 2006 three true outcome percentages, I compared them to the absolute value of the average difference in BABIP over the three years. My expectation was that hitters with a high 3TO percentage would show larger average differences in their yearly BABIPs than hitters with a low 3TO percentage. The results:
Results
So disappointing! My first statistical venture at The Hardball Times results in an r-squared of exactly zero! Absolutely no correlation between the two. Repeating the procedure with the 2007 3TO percentages results in an r-squared of .00001, and with the 2008 3TO percentages an r-squared of .004, so clearly no meaningful relationship exists no matter how many years of data I include.
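For anyone who wants to repeat the test, here is a rough Python sketch of the procedure. It is not the spreadsheet behind the numbers above. It assumes 3TO% is (walks + strikeouts + home runs) divided by plate appearances, reads the BABIP measure as the average absolute year-to-year change (one possible interpretation), and uses four made-up hitters purely as placeholder data. All function names are hypothetical.

```python
def three_true_outcome_pct(walks, strikeouts, home_runs, plate_appearances):
    """Share of plate appearances ending in a walk, strikeout, or home run (assumed definition)."""
    return (walks + strikeouts + home_runs) / plate_appearances


def avg_abs_babip_change(babips_by_year):
    """Average absolute year-to-year change in BABIP (one reading of the method)."""
    changes = [abs(b - a) for a, b in zip(babips_by_year, babips_by_year[1:])]
    return sum(changes) / len(changes)


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Made-up hitters for illustration only; NOT the data behind the article.
# Each entry: (walks, strikeouts, home runs, plate appearances in 2006),
# followed by BABIP in 2006, 2007, and 2008.
hitters = [
    ((110, 190, 40, 680), [0.280, 0.310, 0.265]),
    ((30, 40, 1, 700), [0.330, 0.325, 0.300]),
    ((70, 120, 25, 650), [0.300, 0.290, 0.320]),
    ((50, 80, 10, 600), [0.310, 0.345, 0.305]),
]

tto = [three_true_outcome_pct(*counts) for counts, _ in hitters]
swing = [avg_abs_babip_change(babips) for _, babips in hitters]
print(f"r-squared: {r_squared(tto, swing):.3f}")
```

With real data, the `hitters` list would be replaced by every player who reached 250 plate appearances in all three seasons. The number printed for the placeholder hitters means nothing on its own.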
Final thoughts
I thought I had a good idea on my hands, that hitters who put the ball in play less often would see greater fluctuation in their BABIPs, but apparently this is not the case. There is a slight, slight relationship, but it is so insignificant that it can virtually be ignored. Lesson learned: an idea, no matter how logical in theory, needs to be tested before it can be accepted as truth.
If you are interested in the three true outcome percentage numbers for hitters with at least 250 plate appearances in the past three years, I'll make the database available here: 3TO_percentage.xls
Paul has been managing fantasy baseball teams for many seasons and writing for THT Fantasy over the past three years. He is currently a student at UPenn and welcomes readers' thoughts at his email here or in the comments below.

Realistically, the (minimal) effect you’re seeing is largely due to a sampling concern - the more BIP you have, the closer to the mean your BABIP is likely to be. (This is why regression to the mean works - the more observations you have, the more likely the result is to land close to the mean.)
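The sampling point can be put into rough numbers. The sketch below rests on an assumption not made anywhere in the post: each ball in play is treated as an independent trial with the same chance of becoming a hit (a binomial model), and .300 is used as a stand-in for a typical BABIP. The function name is hypothetical.

```python
from math import sqrt


def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of an observed BABIP (assumes independent balls in play)."""
    return sqrt(true_babip * (1 - true_babip) / balls_in_play)


# .300 is an assumed, roughly typical BABIP, not a figure from the article.
for bip in (200, 400, 800):
    se = babip_standard_error(0.300, bip)
    print(f"{bip} balls in play: about +/-{se:.3f}")
```

Under this model the noise shrinks with the square root of balls in play, so going from 400 to 800 balls in play trims it by a little under 30 percent rather than cutting it in half.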