Tuesday, June 16, 2009
Explaining LIPS
Posted by David Gassko at 3:30am
It’s been almost four years since I first tried to devise a defense-independent pitching metric that incorporated batted ball data. I was inspired, then, by Voros McCracken’s articles on DIPS, both the original where he showed that pitchers appear to have little control over the results of balls put into play against them, and his follow-up, where Voros examined various improvements that could be made to DIPS, one of which was to incorporate batted ball data.
For years, I’ve been tinkering with various ways to do just that. The first incarnation of this statistic I called DIPS 3.0 (since Voros had already released two versions), but I have since switched to LIPS, which stands for “Luck Independent Pitching Statistics.” See, in my research I have found that not only do pitchers have little control over the results of their balls in play, but they also have little control over the number of home runs they allow, outside of their flyball or groundball tendencies. I repeat: Outside of inducing ground balls (an ability which, by the way, is very persistent), there is little a pitcher can do to prevent home runs.
In light of this, we must re-assess Voros’ spectrum of what a pitcher can and cannot control. Rather than giving a pitcher credit for his strikeouts, walks, hit-by-pitch, and home runs and ignoring everything else as Voros did, we want to give him credit for his strikeouts, walks, hit-by-pitch, infield flies, outfield flies, and ground balls, while ignoring or adjusting everything else. At this point, we are not just removing defense from the equation, but luck itself, which is why I eventually changed the name of my statistic from “DIPS 3.0” to “LIPS.”
So how do we calculate LIPS? It’s a complicated process, one which has undergone many revisions, so in the interest of making it clear to all, I thought I’d walk you through an example, using Rich Harden as my guinea pig (note that I haven’t updated my database in about a week, so these stats are a bit dated):
- Harden has allowed 10 infield flies, 35 outfield flies, 23 line drives, 41 ground balls, and 6 bunts. Because we know that a pitcher has little or no control over his line drive rate, we replace Harden’s line drive rate with the league average and adjust the rest of his batted ball numbers accordingly. Rounding, that leaves us with 10 infield flies, 36 outfield flies, 22 line drives, 42 ground balls, and 6 bunts. Basically, there’s no change for Harden specifically, though other pitchers may see big variations. (Also note that BIS actually breaks out another category for us, called a “fliner,” which is something between a fly ball and a line drive. In the actual LIPS calculations, these are treated as a separate category, but for simplicity, I’ve lumped them in with fly balls and line drives here.)
- We now multiply all of Harden’s transformed batted ball statistics by the league average outcome rates for each. So if the average National League pitcher allows 0.21 singles per ground ball, we calculate that Harden will allow 0.21*42 = 8.8 ground ball singles. Do that for every outcome and every batted ball type and you get a predicted pitching line. In this case we predict that, independent of luck, Harden would have allowed about 23 singles, 7 doubles, 1 triple, 4 home runs, and 1 reached on error. His actual line is 23 singles, 8 doubles, 0 triples, 8 home runs, and 1 reached on error. You can see that the two lines match in just about everything except for home runs. (We also predict outs on balls in play and double plays using the same method, by the way.)
- Next, we adjust Harden’s strikeout, walk, and hit-by-pitch numbers for park. This is done because all of the numbers we derive in step (2) are park neutral, and we don’t want to mix apples and oranges, or in this case, park-adjusted stats with non-park-adjusted stats. Harden doesn’t see much change here, going from 53 strikeouts, 21 walks, and 2 hit-by-pitch to 52 strikeouts, 21 walks, and 2 hit-by-pitch.
- Now we throw the results of steps (2) and (3) into the BaseRuns formula. You can read more about BaseRuns here, but the basic idea is that it is the most accurate, least biased run estimator around. At this point, we know how many runs we expected Harden to allow, which in this case is 16.
- We now adjust step (4) for park factor, but in reverse. Our estimated runs allowed are park neutral, but we want them to be directly comparable to ERA. Therefore, we multiply by the park factor instead of dividing as you do when trying to make a statistic park neutral. Harden’s estimated runs allowed are increased four percent, as Wrigley Field is a hitter’s park, yielding a new estimate of 17 runs.
- We estimate luck independent innings pitched by adding together the number of expected outs and double plays we got in step (2) with the adjusted strikeouts we got in step (3), and dividing by three. We get 41 luck-neutral innings pitched.
- We divide step (5) by step (6), and multiply by 9. Without rounding, the result is 3.63 runs per nine innings.
- Finally, we estimate how many unearned runs Harden should have allowed based on his ground ball rate, which has a very high correlation with unearned runs. We find that around 93 percent of Harden’s runs are expected to be earned, yielding a LIPS of 3.39.
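For readers who think better in code, the arithmetic in the steps above can be sketched roughly as follows. This is a minimal illustration, not the actual LIPS code: the function names are made up for the example, the league line drive rate of 0.19 is a placeholder (chosen only to sit near the 22 line drives in the Harden example), and the sketch assumes the non-line-drive categories are rescaled proportionally. The BaseRuns step and the park adjustments to strikeouts, walks, and hit-by-pitch are left out because their coefficients are not given here.

```python
# Illustrative sketch only. Function names and LEAGUE_LD_RATE are assumptions
# made for this example; they are not taken from the real LIPS calculations.

def normalize_line_drives(counts, league_ld_rate):
    """Step 1: replace the pitcher's line drive rate with the league rate.

    Assumption: every other batted ball type is scaled by the same factor,
    so the total number of batted balls is unchanged.
    """
    total = sum(counts.values())
    others = total - counts["line_drive"]
    if others <= 0:
        raise ValueError("need at least one non-line-drive batted ball")
    target_ld = league_ld_rate * total
    scale = (total - target_ld) / others
    return {
        kind: target_ld if kind == "line_drive" else n * scale
        for kind, n in counts.items()
    }


def expected_outcomes(counts, league_rates):
    """Step 2: multiply each batted ball count by league outcome rates."""
    totals = {}
    for kind, n in counts.items():
        for outcome, rate in league_rates.get(kind, {}).items():
            totals[outcome] = totals.get(outcome, 0.0) + n * rate
    return totals


def luck_neutral_innings(expected_outs, expected_double_plays, adj_strikeouts):
    """Step 6: expected outs plus double plays plus strikeouts, over three."""
    return (expected_outs + expected_double_plays + adj_strikeouts) / 3


def lips_from_runs(neutral_runs, park_factor, innings, earned_share):
    """Steps 5, 7, and 8: park-adjust runs, scale to nine innings, keep the earned share."""
    park_runs = neutral_runs * park_factor   # multiply, since we are un-neutralizing
    runs_per_nine = park_runs / innings * 9  # runs divided by innings, times nine
    return runs_per_nine * earned_share


# Harden's raw batted ball counts from the example above.
harden = {"infield_fly": 10, "outfield_fly": 35, "line_drive": 23,
          "ground_ball": 41, "bunt": 6}
LEAGUE_LD_RATE = 0.19  # hypothetical placeholder, not a real league figure
adjusted = normalize_line_drives(harden, LEAGUE_LD_RATE)
print({kind: round(n, 1) for kind, n in adjusted.items()})

# The one outcome rate quoted in the text: 0.21 singles per ground ball.
gb = expected_outcomes({"ground_ball": 42}, {"ground_ball": {"single": 0.21}})
print(round(gb["single"], 1))  # 8.8, the ground ball singles figure from step 2

# Rounded inputs from steps 4 through 8: 16 runs, +4% park, 41 innings, 93% earned.
print(round(lips_from_runs(16, 1.04, 41, 0.93), 2))
```

Because this feeds in the rounded figures from the walkthrough (16 runs, 41 innings), the last number comes out close to, but not exactly, the 3.39 quoted above, which was computed without rounding.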
That’s the basic process. OK, I understand that it’s anything but basic, but I hope my explanation was simple enough for all to follow. Every step is based on thorough research, a lot of which you can read in The Hardball Times Annual 2007 if you so desire, but otherwise you’ll have to take my word for it. LIPS takes the luck out of pitching statistics better than any other such stat I’ve ever read about, and that’s why we use it so often here at THT Fantasy.
If you have any questions, fire away in the comments section and I’ll try to answer them as best I can.
David Gassko is a former consultant to a major league team. He welcomes comments via e-mail.