Wednesday, October 21, 2009
Trust but Verify
Posted by Adam Guttridge
Last week, I was playing around with building a BABIP predictor that was simple, intuitive, and calculable from the information available in my spreadsheets. Along the way, I stumbled across some brilliant work Peter Bendix did at THT some time back.
As an offshoot of the BABIP estimator, I decided to start with a baseline reading of how predictable BABIP and batted ball types are, given a decent sample. (I should note that this is BIS data.)
I took all players with at least 400 PA in each season from 2006-2009 (98 players, 392 seasons). I wanted to see how well a simple 5-3-2 weighting of their ’06-’08 BABIP and batted ball data would predict their ’09 BABIP/batted ball (not weighted for PA). The results:
A few things jumped out at me:
--I’m surprised by how well BABIP works at predicting itself. I would have expected that figure to sit much closer to the one for LD%. That’s somewhat discouraging for me, though… it’s not going to be easy to come up with a BABIP estimator that makes a huge difference in terms of projection accuracy.
--LD%... damn. A lot of this is surely attributable to the well-discussed variability and subjectivity involved in batted ball classification, but even absent that, I’d bet this is just a flaky skill that needs to be heavily regressed.
--I’m also a bit surprised by how strong a figure we see for HR/FB. The variability of that figure is the basis for xFIP, and sure, on a seasonal level, it deserves to be factored out. But for an SP with 180+ IP in each of the past 3 seasons, a projection using his (park-adjusted) HR/FB as a component in lieu of the league average rate would likely produce superior accuracy.
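For anyone who wants to try the same kind of check, here is a minimal Python sketch of a 5-3-2 weighted projection and the correlation against the following season. It rests on assumptions: I'm taking the "5" to go on the most recent season ('08), which the post doesn't actually spell out, the function names are made up for illustration, and the sample rows are invented numbers, not the BIS data used here.

```python
from math import sqrt

# Assumption: weights run ('06, '07, '08) = (2, 3, 5) out of 10,
# i.e. the most recent season counts most.
WEIGHTS = (0.2, 0.3, 0.5)


def weighted_projection(y06, y07, y08, weights=WEIGHTS):
    """Fixed-weight average of three prior seasons (not weighted for PA)."""
    w06, w07, w08 = weights
    return w06 * y06 + w07 * y07 + w08 * y08


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented example rows: (BABIP '06, '07, '08, '09). Not real players.
players = [
    (0.310, 0.295, 0.320, 0.305),
    (0.280, 0.300, 0.275, 0.290),
    (0.330, 0.340, 0.315, 0.335),
    (0.295, 0.285, 0.300, 0.280),
]

projected = [weighted_projection(a, b, c) for a, b, c, _ in players]
actual = [row[3] for row in players]
r = pearson_r(projected, actual)
print(round(r, 3))
```

The same two functions work for any of the batted ball rates (LD%, GB%, HR/FB and so on); only the input columns change.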
Also, I checked up on Dave Studeman’s quick-and-dirty method of BABIP prediction: LD% + .12. If you add .12 to the LD% predicted by the ’06-’08 LD%, it correlates at .376 with actual ’09 BABIP. In other words, you’re far better off using plain BABIP.
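As a rough illustration of the LD% + .12 shortcut, here is a small Python sketch. The helper names are hypothetical, the weights assume the most recent season gets the "5" in the 5-3-2 scheme, and the line drive rates are invented for the example.

```python
def project_ld(ld06, ld07, ld08, weights=(0.2, 0.3, 0.5)):
    """5-3-2 weighted LD% projection; assumes '08 carries the heaviest weight."""
    w06, w07, w08 = weights
    return w06 * ld06 + w07 * ld07 + w08 * ld08


def quick_babip(ld_rate):
    """Quick-and-dirty BABIP estimate: line drive rate (as a fraction) plus .12."""
    return ld_rate + 0.12


# Invented line drive rates for one hitter, '06 through '08.
projected_ld = project_ld(0.18, 0.20, 0.19)
estimate = quick_babip(projected_ld)
print(round(projected_ld, 3), round(estimate, 3))
```

One thing worth keeping in mind: adding a constant to every player's number doesn't change a correlation, so the correlation of this estimate with actual BABIP is the same as the correlation of the projected LD% itself with actual BABIP.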
Adam Guttridge is a recent graduate seeking to continue his baseball career. Employment offers can be sent to [email address not shown].

It sounds like for most of the article you are talking about batting statistics but then the penultimate paragraph switches to talk about pitchers. I found it confusing. I think the persistence from one year to the next of these statistics is much different if we’re talking about pitchers.