What’s the best BABIP estimator?

BABIP is a stat that lots of people like to throw around but many don’t fully understand (even some who profess to be statistically inclined).

Background info

BABIP stands for Batting Average on Balls in Play. It measures the rate at which balls in play fall in for hits. Essentially, any ball that the batter makes contact with, puts into fair territory, and does not become a home run falls into the domain of BABIP. It is calculated as (H-HR)/(AB-K-HR).
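For anyone who would rather see the formula as code, here is a minimal sketch of it. The function name and the stat line in the example are made up for illustration; only the formula itself comes from the definition above.

```python
def babip(hits, home_runs, at_bats, strikeouts):
    """Batting average on balls in play: (H - HR) / (AB - K - HR)."""
    balls_in_play = at_bats - strikeouts - home_runs
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, invented for this example only
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100)
print(round(example, 3))
```

Here 130 non-homer hits on 430 balls in play works out to roughly .302.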

We use BABIP to evaluate both pitchers and hitters, but the way in which we use it differs greatly between the two. Most pitchers regress toward the league-average BABIP of around .300 or .305. Very few pitchers can repeatedly do better or worse than this, so we say that pitchers have very little control over BABIP.

Hitters, on the other hand, can have a substantial amount of control over BABIP. Ichiro Suzuki, for example, has a .356 career BABIP. Hitters do not regress toward league average; rather, each regresses toward his own unique number.

The big question these days seems to be, what is that number? Today, I’d like to look at several ways of determining it and see which is best.

The test

This is something I’ve been curious about for a while, so I took as many BABIP estimators as I could think of and decided to put them up against each other to see which does the job of predicting the following year’s BABIP the best.

The combatants

  • Previous year BABIP (BABIP): This is simply the player’s BABIP from the previous year.
  • Expected BABIP (xBABIP): This is a BABIP model created by Chris Dutton and Peter Bendix, introduced at THT last month in this article. xBABIP is the primary reason for this article as I have been very curious how well our newest model does what it intends. Also, please note that Chris has tweaked the model a little since the original article ran. Please check the bottom of this article for more details.
  • Quick Expected BABIP (qxBABIP): As Dutton and Bendix’s xBABIP includes some stats that aren’t readily available to the casual fan, they’ve created a simplified version using stats that are readily available, of course, at the (expected) expense of accuracy.
  • Line drive BABIP (ldBABIP): This is the one that gets the most play. Everyone seems to be using it these days, but for reasons I’ve explained many times before, I’m not a fan. It’s calculated as line drive rate plus .120.
  • Studes BABIP (studesBABIP): This one was created around the same time Dave Studeman put out Line drive BABIP but doesn’t get nearly the same attention. It isn’t a whole lot more difficult to calculate, but it uses more than one variable. It’s calculated as 0.245 + 0.52 * LD% - 0.16 * FB% + 0.11 * K%.
  • Expected Batting Average BABIP (xBA BABIP): This one is the BABIP portion of Baseball HQ’s Expected Batting Average (xBA) statistic. I should note that this uses HQ’s SX stat, which I couldn’t replicate precisely. I was, however, able to get it very close. Also, because SX and PX are indexes based on league (American/National) average, for players switching teams mid-year, I weighted each based on games spent with each team.
  • Marcels BABIP (mBABIP): This isn’t so much an estimator as a projection, but I thought it would be good to include for context. It’s simply what Marcels projects for the following year. It’s also currently what I’m using in my True Batting Average calculations.
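
The two simplest estimators in the list can be written out directly. This is only a sketch: the function names are hypothetical, and it assumes each rate is passed as a fraction (0.20 for 20 percent), which the formulas above don’t spell out. The sample rates are invented.

```python
def ld_babip(ld_rate):
    """Line drive BABIP: line drive rate plus .120."""
    return ld_rate + 0.120


def studes_babip(ld_rate, fb_rate, k_rate):
    """Studes BABIP: 0.245 + 0.52 * LD% - 0.16 * FB% + 0.11 * K%."""
    return 0.245 + 0.52 * ld_rate - 0.16 * fb_rate + 0.11 * k_rate


# Hypothetical hitter: 20% line drives, 35% fly balls, 18% strikeouts
# (rates as fractions is an assumption of this sketch)
simple = ld_babip(0.20)
studes = studes_babip(0.20, 0.35, 0.18)
print(round(simple, 3), round(studes, 3))
```

For this made-up hitter the two formulas give about .320 and .313, which shows how the extra fly ball and strikeout terms pull the Studes estimate away from the line-drive-only one.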

The process

I used data from 2004 to 2008, matching players from one year to the next. As xBABIP was the reason for doing the study, I had to work around that a little bit. xBABIP wasn’t calculated for anyone with fewer than 300 plate appearances, so I made that the cut-off for both year one and year two. There are some biases with using cut-offs, but there’s no way around it in this instance.

From there, I adjusted each stat for differences in league average and ran a couple of tests. You can see the results below.
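
The year-to-year matching can be sketched roughly as follows. The record layout, field names, and sample numbers are all assumptions made for illustration, and the league-average adjustment is left out because its exact method isn’t described here.

```python
MIN_PA = 300  # plate appearance cut-off applied to both year one and year two


def pair_seasons(seasons, min_pa=MIN_PA):
    """Pair each qualifying season's estimate with the next season's BABIP.

    seasons: list of dicts with keys player, year, pa, estimate, babip
    (a hypothetical layout). Returns (year-one estimate, year-two BABIP) tuples.
    """
    qualified = {
        (s["player"], s["year"]): s for s in seasons if s["pa"] >= min_pa
    }
    pairs = []
    for (player, year), first in sorted(qualified.items()):
        second = qualified.get((player, year + 1))
        if second is not None:
            pairs.append((first["estimate"], second["babip"]))
    return pairs


# Invented records: player B's 2007 falls under the cut-off, so B is dropped
seasons = [
    {"player": "A", "year": 2007, "pa": 600, "estimate": 0.310, "babip": 0.325},
    {"player": "A", "year": 2008, "pa": 580, "estimate": 0.305, "babip": 0.298},
    {"player": "B", "year": 2007, "pa": 250, "estimate": 0.290, "babip": 0.270},
    {"player": "B", "year": 2008, "pa": 500, "estimate": 0.295, "babip": 0.301},
]
pairs = pair_seasons(seasons)
print(pairs)
```

Only player A survives, because a pair needs 300 plate appearances in both seasons. That is the cut-off bias mentioned above: hitters who lose playing time after a bad year never make it into the sample.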

The results

+---------------+-------+--------+---------+---------+-------------+-----------+--------+
| TEST          | BABIP | xBABIP | qxBABIP | ldBABIP | studesBABIP | xBA BABIP | mBABIP |
+---------------+-------+--------+---------+---------+-------------+-----------+--------+
| Correlation   | 0.38  |  0.50  |   0.45  |   0.20  |       0.32  |     0.40  |  0.46  |
| R-Squared     | 0.14  |  0.25  |   0.20  |   0.04  |       0.10  |     0.16  |  0.21  |
| Average Error | 0.028 |  0.021 |   0.022 |   0.029 |       0.024 |     0.022 |  0.022 |
+---------------+-------+--------+---------+---------+-------------+-----------+--------+
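
For reference, the three rows can be computed along these lines. This is a sketch under assumptions: it treats Average Error as the mean absolute difference between the estimate and the following year’s BABIP, and the sample numbers are invented rather than taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_abs_error(predicted, actual):
    """Average absolute miss (assumed meaning of Average Error)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Invented estimates and next-year BABIPs for four hypothetical hitters
predicted = [0.300, 0.320, 0.310, 0.290]
actual = [0.310, 0.335, 0.295, 0.280]

r = pearson_r(predicted, actual)
r_squared = r ** 2  # with a single predictor, R-squared is the squared correlation
error = mean_abs_error(predicted, actual)
print(round(r, 2), round(r_squared, 2), round(error, 4))
```

Each R-Squared entry in the table is the square of the correlation above it (0.50 squared is 0.25 for xBABIP, for example), so the first two rows always rank the estimators the same way. Average Error is the only row that can disagree with them.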

As you can see, there’s a pretty clear pecking order in these results:

+------+-------------+
| RANK |   ESTIMATOR |
+------+-------------+
|    1 |      xBABIP |
+------+-------------+
|    2 |      mBABIP |
|    3 |     qxBABIP |
+------+-------------+
|    4 |   xBA BABIP |
|    5 |       BABIP |
+------+-------------+
|    6 | studesBABIP |
+------+-------------+
|    7 |     ldBABIP |
+------+-------------+

I’ve also broken things down by tiers. Dutton and Bendix’s xBABIP seems to be the best, and I can only imagine what looking at multiple years of it would do. Just one year of data can explain 25 percent of the variance in the following year’s BABIP, a very big number for a stat with such wide variability. That it beats three years’ worth of Marcels data (plus regression to the mean and age adjustments) is excellent as well.

After that come Marcels (which I’ve been using until now) and the quick version of xBABIP. I should note that the quick version tested here doesn’t include a not-hard-to-apply team adjustment; I left it out for some logistical reasons, but it would likely improve the accuracy a bit. It’s very nice to see the quick version grade out so well since it will be easy to calculate in-season (although thanks to Sal Baxamusa, Marcels isn’t very difficult either).

Then come Baseball HQ’s xBA BABIP (which Average Error thinks belongs in tier two) and actual BABIP, followed by Dave’s more complex BABIP estimator (which was derived back at the beginning of 2005 when we were first starting to work with batted ball data).

Finally, line drive BABIP, which is arguably the most popular measure on this list, comes in dead last, well below everything else and significantly worse than simply using actual BABIP. I’ve long said that I dislike this way of estimating BABIP, and it’s very nice to see the tests confirm it.

Going forward

Going forward, I’ll be using xBABIP in place of Marcels BABIP in my True Batting Average calculations and when discussing a player’s BABIP in general. I’m committed to giving you guys the best there is, and Chris and Peter’s model is tops among any BABIP estimator that I know of. If you missed the original article, I’d definitely recommend you go back and read it.

Some notes from Chris Dutton

Chris worked a lot with me on this, and I really appreciate his receptiveness and helpfulness. Here are some things he wanted me to pass along.

First, he has changed the model a bit since the original article. Here are the exact changes and his explanation of them:

Old formula: Hitter eye, Pitches per extra-base hit, LD%, FB/GB, Speed score, Contact rate, Spray, Pitches per AB
New formula: HR/FB, IF/FB, LD%, FB/GB, Speed score, Lefty*(FB/GB%), Contact rate, Spray

The differences are basically that I used hr/fb as a measure of power rather than pitches per extra base hit, added popups/FB to measure poorly hit balls, and included an interaction variable of lefty*(fb/gb%) to adjust for the fact that lefty ground ball hitters tend to often hit balls to the right side of the field (which rarely become hits). I also removed pitches_per_AB, which seemed to be potentially correlated with other variables, and removed hitter_eye since contact rate seemed to be capturing a very similar effect.

Chris also says that he isn’t done improving the model. He is constantly looking for ways to improve it even further, and is specifically hoping to incorporate some PITCHf/x data as soon as possible.

Finally, Chris is developing a tool that would allow readers to easily calculate the quick version of xBABIP. This would prove to be useful in-season when we constantly need to be changing our evaluation of hitters. While constantly calculating things like Spray would be time-consuming and difficult, the quick version utilizes stats that are all readily available and — as the tests show — is still effective. The tool also has some other cool features: interactive graphs, projected stat lines, and some other things you might find useful.

References and resources

Expected BABIP and Quick Expected BABIP data were provided to me by Chris Dutton. A big thanks to him for his help and also for helping to create such an excellent stat.

Marcels BABIP was taken from Tango’s site. The rest of the stats I calculated myself.


17 Comments
MattS
15 years ago

I wrote an article on BABIP a couple weeks ago that discusses breaking down BABIP on line drives, groundballs, and flyballs.  I model each individual one and predict the following year’s BABIP from the previous three.  I’m working on just a simple plug-in regression model to compute expected BABIP, but the variables that I use actually have a higher correlation with the following year’s BABIP than one using the available variables that xBABIP use, at least with my dataset.  Have a look.  I’m currently working on a newer article to come out soon on this topic.  Here’s the previous article discussing what I did.

http://www.thegoodphight.com/2009/1/16/726379/babip-projection-and-new-s

Dave Studeman
15 years ago

Very cool, Derek.  I’m glad folks like you and Chris have followed up on this, because my calculations were only retrospective in nature, and not meant to be predictive.

I was also curious because I know Chris and Peter used stats from Baseball Prospectus for their line drives, and I wondered how the formula would work with the BIS stats.  Looks like it works pretty well.

TangoTiger
15 years ago

Derek, good job.  I would be interested in seeing the correlation results of seasons 2005-2007 onto 2008 for all the non-Marcel seasons, and using the Marcel 2008 forecast onto the 2008 season.  The reason is that Marcel uses 3 years of data, so it has an unfair leg up on all the others.

Finally, how about a correlation of ALL the estimators onto the 2008 BABIP.  There’s no reason for an “either/or” choice.  Well, for things like ldBABIP, it can be discarded, since it’s a subset of a few of the other models you have.  (But the regression would hopefully pick that out, and give it a weight of zero.)

Barca
15 years ago

“xBABIP wasn’t calculated for anyone with fewer than 300 plate appearances, so I made that the cut-off for both year one and year two”

Mike Napoli was the first one that I would like to see calculations for, but it appears that he just missed the cut-off.

nilodnayr
15 years ago

It would be interesting to see this broken down by batter type.  While one model might be best for all players, it might have deficiencies with certain types of players (power vs single, FB vs GB, high BA vs low BA, etc), which either might point us to use different models for different types of players or perhaps a continued tinkering of the xBABIP model.

Barca
15 years ago

“xBABIP wasn’t calculated for anyone with fewer than 300 plate appearances, so I made that the cut-off for both year one and year two”

Mike Napoli was the first person that I wanted to see the calculations for, but it looks like he misses your cut-off.

centris
15 years ago

Do you think that CHONE or PECOTA projected BABIP would beat xBABIP?

Red Sox Talk
15 years ago

Hi, I have also done a study looking to improve on the LD% + .120 method. I looked at 449 hitters from 2008 with at least 300 PA:
http://saberrattling.wordpress.com/2008/12/03/working-the-numbers-on-babip-estimation/

mulkowsky
15 years ago

Great, cutting edge stuff!  (And Marcel continues to amaze.)

In the original article you guys posted a link to download Dutton and Bendix’s xBABIP results.  Could you do that again for their updated model?

philosofool
15 years ago

I find it ironic that the old formula .12+LD% is actually less reliable a predictor of future BABIP than BABIP itself.

BobbyRoberto
15 years ago

I noticed Derek’s BABIP calculation does not include sacrifice flies, while the one linked by Red Sox Talk did include sac flies.  Which is the right, or best, way to do it?  Shouldn’t everyone figure it the same way?

youngid
15 years ago

That’s an interesting approach, Red Sox Talk, if we ever get HitFX data I’m sure a variation of your formula will be a great BABIP predictor.

Brian Cartwright
15 years ago

Good work Derek. Confirms what I wrote a couple weeks ago at FanGraphs that Marcel is better than LD%
http://www.fangraphs.com/blogs/index.php/what-i-hate-about-line-drives/

Derek Carty
15 years ago

Dave,
I forgot to mention that your early work was meant to be retrospective.  Also, the data used in the formulas is the BP data, not BIS.  I simply took the values for xBABIP and qxBABIP that Chris provided me.  I think it would be interesting to see how BIS would work, or concoct a new model for it entirely.

Tango,
Thanks.  If I run a regression with all the estimators, I get an R-squared of 0.28.  Not too much better than just xBABIP, but better.  As suspected, ldBABIP wasn’t significant at any level, nor was actual BABIP (since it overlaps with Marcels, I’m assuming).

Also, you’re right that Marcels has a leg up on all of the systems since it uses three years of data.  Looking back, it seems like I only glanced over that point.  I’ll run a three-year average of the other systems tomorrow and see how they stack up.

Derek Carty
15 years ago

Sorry, Barca.  I didn’t put the stats together, but it looks like Napoli did miss the cut-off.

nilodnayr,
I’ve been thinking that as well, and it’s something I’ll probably test sometime in the future.

centris,
It’s difficult to say, but my guess would be no.  They aren’t incredibly superior to Marcels, and Marcels was a fair bit behind.  Even if they did beat xBABIP, I doubt it would be by very much, and if we were to look at more than one year of xBABIP I don’t think it would be very close.

Interesting, Red Sox Talk.  If I do another study like this, I’ll include your formula.

philosofool,
I feel exactly the same.  I’ve thought for a while that was the case, as did Brian Cartwright :)

BobbyRoberto,
I’ve seen it calculated both ways.  I don’t include Sac Flies because, for fantasy purposes, BABIP is used to help predict a player’s batting average.  It’s basically a player’s rate of hits independent of how often he makes contact with the ball.  Because sac flies aren’t counted as at-bats, they don’t affect batting average, and I don’t count them.

Derek Carty
15 years ago

mulkowsky,
I could ask Chris and find out if he’d like to put out a new file.

MattS,
Interesting stuff.  I’ll have to give it a deeper look when I get a chance tomorrow.

Stewart
13 years ago

If you were to give advice to me or my children, or even children to come in our family, what would it be?