December 7, 2013

All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

## Trust but Verify

Posted by Adam Guttridge
Last week, I was playing around with creating a BABIP predictor that was simple, intuitive, and calculable with the information available in my spreadsheets. I also stumbled across some brilliant work done by Peter Bendix at THT some time back.

As an offshoot of the BABIP estimator, I decided to start with a baseline reading of how predictable BABIP and batted ball types were, if given a decent sample. (I should note, this is BIS data).

I took all players with at least 400 PA in each season from 2006-2009 (98 players, 392 seasons). I wanted to see how well a simple 5-3-2 weighting of their ’06-’08 BABIP and batted ball data would predict their ’09 BABIP/batted ball (not weighted for PA). The results:

A few things jumped out at me:

--I’m surprised by how well BABIP works at predicting itself. I would have expected that figure to sit much closer to LD%’s. That’s somewhat discouraging for me, though… it’s not going to be easy to come up with a BABIP estimator that makes a huge difference in terms of projection accuracy.

--LD%... damn. A lot of this is surely attributable to the well-discussed variability and subjectivity involved in batted ball classification, but even absent that, I’d bet this is just a flaky skill that needs to be heavily regressed.

--I’m also a bit surprised by how strong a figure we see for HR/FB. The variability of that figure is the basis for xFIP, and sure, on a seasonal level, it deserves to be factored out. But for an SP with 180+ IP in each of the past 3 seasons, a projection using his (park-factored) HR/FB as a component in lieu of the league average rate would likely be more accurate.
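For anyone who wants to try the same test on their own data, here is a minimal sketch of the 5-3-2 weighting and the correlation check. It assumes the heaviest weight goes to the most recent season (the post does not spell out the ordering), and the function names and sample numbers are made up for illustration; they are not the BIS data used above.

```python
from math import sqrt

def weighted_532(y08, y07, y06):
    """Blend three prior seasons with 5-3-2 weights.

    Assumption: the most recent season (2008) gets the weight of 5.
    Seasons are not weighted by PA, matching the description in the post.
    """
    return (5 * y08 + 3 * y07 + 2 * y06) / 10

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (R, not R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical BABIP lines: (2006, 2007, 2008, 2009). Not real players.
players = [
    (0.290, 0.300, 0.310, 0.305),
    (0.330, 0.320, 0.340, 0.325),
    (0.280, 0.295, 0.270, 0.290),
    (0.310, 0.305, 0.300, 0.298),
]

predicted = [weighted_532(y08, y07, y06) for (y06, y07, y08, _) in players]
actual = [p[3] for p in players]
r = pearson_r(predicted, actual)
print(f"R between weighted '06-'08 and actual '09: {r:.3f}")
```

The same two functions work for any of the batted ball rates: swap the BABIP columns for LD%, GB%, or HR/FB and rerun the correlation.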

Also, I checked up on Dave Studeman’s quick-and-dirty method of BABIP prediction: LD% + .12. If you add .12 to the LD% predicted by the ’06-’08 LD%, the result correlates at .376 with actual ’09 BABIP. In other words, you’re far better off using plain BABIP.
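As a rough sketch of that check for a single hitter: predict LD% from the prior three seasons, then add .12. This assumes LD% is stored as a fraction (0.19 for 19%) and reuses the same assumed 5-3-2 ordering, with the most recent season weighted heaviest. The names and numbers are hypothetical.

```python
def weighted_532(y08, y07, y06):
    """5-3-2 blend of three prior seasons (assumed: most recent gets the 5)."""
    return (5 * y08 + 3 * y07 + 2 * y06) / 10

def ld_plus_12(ld_rate):
    """Quick-and-dirty BABIP estimate: line drive rate (as a fraction) plus .12."""
    return ld_rate + 0.12

# Hypothetical line drive rates for one hitter, 2006 through 2008.
ld_06, ld_07, ld_08 = 0.18, 0.20, 0.19

predicted_ld = weighted_532(ld_08, ld_07, ld_06)
estimated_babip = ld_plus_12(predicted_ld)
print(f"Predicted LD%: {predicted_ld:.3f}, estimated BABIP: {estimated_babip:.3f}")
```

Running this over every qualifying hitter and correlating the estimates against actual ’09 BABIP is how you would arrive at a figure like the one quoted above.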

Adam Guttridge is a recent graduate seeking to continue his baseball career.

Detroit Michael said...

It sounds like for most of the article you are talking about batting statistics but then the penultimate paragraph switches to talk about pitchers.  I found it confusing.  I think the persistence from one year to the next of these statistics is much different if we’re talking about pitchers.

Posted 10/21  at  04:44 PM
Derek Carty said...

I did a quick bit of research here about BABIP, xBABIP, and other estimators.  Similar findings - i.e. LD%+.120 is no good for forward looking stuff.

Posted 10/21  at  06:53 PM
Nick Steiner said...

You only tested hitters right?  That’s why you see a high correlation for HR/FB, it’s not really a “luck” stat for them.  If you tested pitchers, I bet you would find a much weaker relationship between year 1 and year 2.

Also, are these R or R^2?

Posted 10/21  at  07:49 PM

R, Nick.

And yes, it would be interesting to test pitchers’ HR/FB. I perhaps ran a bit far with the assumption that it would be as predictable for pitchers as it is for hitters (like most stats, e.g. BABIP, are).

And Derek… that’s wonderful work. The problem with xBABIP and even qxBABIP is that logistically, they’re awfully difficult to pull off with my sheets. I’ve had some success thus far with GB rate and a speed index. I’m optimistic I’ll be able to put together something that gets it into .7 territory.

Posted 10/21  at  09:46 PM
Dave Studeman said...

I don’t know how many times I’ve said this, but my formula, basing BABIP off LD%, was never meant to be used to “predict” future BABIP.  It was meant as a way to judge how lucky/unlucky the hitter or pitcher was in retrospect only. I think I mentioned it just twice in articles and never used it to predict future BABIP.  Dutton and Bendix took it out of context in their study.

This kind of information was studied in detail by David Gassko and JC Bradbury in two old THT Annuals.  I highly recommend people read those.

Posted 10/22  at  02:31 PM