“It has been proposed for some time that fly ball pitchers tend to have an advantage here over ground ball pitchers. In the end, I’m pretty sure this is right, but the problems currently here are tough to overcome … There are really three types of batted balls as they are counted: ground balls, fly balls and line drives … When I (or someone else for that matter) can work this out, we’ll have DIPS Version 3.0.” – Voros McCracken
Almost six months ago, I wrote an article on THT discussing the impact of batted balls on Defense Independent Pitching Statistics (DIPS). When I wrote that article, I didn’t realize it would lead to me writing regularly on THT, and to many cool things. But I’m happy it did, and today, I’d like to revisit the subject.
JC Bradbury and I wrote an article in the Hardball Times Annual 2006 (seriously, if you have yet to buy it, do so now, procrastinator) discussing the impact batters and pitchers have on what type of ball is put in play, and its outcome. While the whole article is, in my extremely biased view, worth reading, let me summarize the pitching part of it: (1) Pitchers have great control over whether a ball becomes an outfield fly ball, infield fly ball, or ground ball, but show no consistency in the number of line drives they allow from one year to another; (2) Pitchers have pretty much no impact on whether a batted ball becomes a hit, and if it does, what kind of a hit it becomes; and (3) This includes home runs, which seem to be solely a function of the number of outfield flies and line drives a pitcher allows.
Why is that important? It tells us in what areas pitchers do and do not exhibit control. These findings allow us to strip away the “noise” in a pitcher’s performance, and tell us how he performed, independent of his defense. For example, we know that Batting Average on Balls in Play (BABIP), which Voros McCracken set equal to league average for every pitcher in his two versions of DIPS, is determined mostly by the percentage of batted balls a pitcher allows that are line drives. So if a pitcher allows many line drives, he’ll have a high BABIP, and if he allows few line drives, his BABIP will be low.
Since pitchers display no year-to-year consistency in the percentage of batted balls they allow that are line drives, we would then expect them to show little year-to-year consistency in BABIP, which is what happens. Voros’ great discovery in DIPS was that a pitcher’s BABIP in one year was a poor predictor of his BABIP in another year. Our article explains why.
Here’s why that’s important in terms of DIPS 3.0: What Voros did with DIPS is ask the question, “Over what components do pitchers display great control, if we define control as a strong relationship between their performance in that category one year and in the next?” So if a component exhibited a strong year-to-year correlation, Voros kept it in. That applies to home runs, walks, strike outs, and hit batters. But hits on balls in play show little correlation from year to year, which is why Voros simply replaced a pitcher’s true performance in BABIP with the league average, and adjusted accordingly. With DIPS 3.0, I’ve decided to take the same approach.
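To make that definition of control a little more concrete, here is a minimal Python sketch of a year-to-year correlation. The function name `pearson_r` and the two lists of strikeout rates are assumptions invented for this illustration; the numbers are made up, not real pitcher data, and this is not necessarily the exact test Voros ran.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two seasons around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, invented purely for illustration (not real pitchers):
# strikeouts per batter faced in year 1 and year 2 for the same five pitchers.
k_rate_year1 = [0.15, 0.22, 0.18, 0.27, 0.12]
k_rate_year2 = [0.16, 0.21, 0.17, 0.25, 0.14]

print(round(pearson_r(k_rate_year1, k_rate_year2), 2))
```

A value near 1 would suggest the component carries over from one season to the next, which is the sense of control used here; a value near 0 would suggest mostly noise.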
It’s probably a valid criticism that what I’m doing isn’t quite separating pitching from defense. If a pitcher allows more line drives than the league average, he does that independently from his defense. There’s no reason to dock his fielders for that. So in that sense, I should not be substituting a league average LD% for a pitcher’s actual line drives allowed. But in my opinion, that violates the spirit of DIPS. Here’s a simple way of looking at it: If I used actual line drive percentage, and then attempted to predict a pitcher’s BABIP from his batted ball data, I’d come pretty close to the BABIP he actually posted. But in DIPS, we want to use a BABIP at or near the league average. So in that case I do want to replace a pitcher’s LD% with the league average.
In my opinion, the spirit of DIPS is not separating pitching from defense (it’s far too coarse a metric for that), but rather getting rid of the “noise” in our data. It’s about looking at what a pitcher actually has control over, rather than getting caught up in the part of a pitcher’s line that the hurler has little or nothing to do with. This might be a good time to bring up that I have Voros’ blessing to do this, and that he has allowed me to call my metric DIPS 3.0, for which I’m thankful, as well as for his general contribution to the baseball statistical community.
That being said, I have re-tooled the way I calculate DIPS 3.0 quite a bit. I’d like to lay out the whole process here with a step-by-step example:
1. First, I take a player’s balls in play (BIP), and multiply that by the league-average line drive percentage. That’s his “new” line drives allowed; let’s call it nLD. For example, Jarrod Washburn allowed 586 BIP last season, and the average LD% in the AL last season was 19.9%, so he is credited with .199*586 = 117 nLD.
2. I then subtract a player’s nLD from his BIP, and call that his nBIP. So Washburn would have 586 – 117 = 469 nBIP.
3. I then find what percentage of a player’s non-LD BIP were each other batted ball type, and multiply that by his nBIP to find his nOFflies, nIFflies, nGroundballs, and nBunts. For example, Washburn originally allowed 466 non-LD BIP last season, but his nBIP is 469. Since he allowed 206 outfield fly balls last season, his nOFflies would be equal to 206/466*469 = 208. Rinse and repeat for the other batted ball types.
4. I convert the pitcher’s new batted ball line into hit-types. For example, the average line drive became a single 50.8% of the time in the American League, so Washburn’s predicted number of singles off line drives would be .508*117 = 59. If we do the same for all the other batted ball types, we find that Washburn “should have” allowed 120 singles last year. I go through this process for singles, doubles, triples, home runs, double plays, and reached on error. The ROE are important because they allow us to largely avoid the one big problem with normal ERA: It overrates groundball pitchers, because groundballs result in errors more often than balls in the air.
5. Using basically the same process as in step four, except for outs, I find a player’s expected innings pitched. This was a foolish oversight in my original system, because if we’re going to find the expected number of runs a pitcher will allow based on his defense-independent numbers, it only makes sense to find the number of outs he is expected to get as well. For example, the average line drive resulted in .293 outs last season in the American League, so Washburn is credited with .293*117 = 34 outs on line drives last season. After converting the batted ball information into expected outs, I add in strike outs, as they are pretty much a guaranteed out. By this formula, Washburn was expected to throw 170.4 innings last year, whereas he actually piled up 177.1.
6. I plug the pitcher’s expected opposition batting line as calculated in steps four and five into the BaseRuns formula. This makes my weights for various events much more accurate than in the original version of DIPS 3.0, and also avoids the problems a linear formula has at extremes. Our man Washburn was expected to allow 98 runs.
7. Finally, I divide the pitcher’s BaseRuns by his expected innings pitched and multiply by 9. That’s pretty straightforward, and that will give you his DIPS 3.0 RA. Washburn’s DIPS 3.0 RA last year was 5.19, compared to an actual RA of 3.35. He’s a good bet to do much worse next season. Remember, though, that DIPS 3.0 is expressed in terms of runs allowed per nine innings, and not earned runs.
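Steps one through three can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rescale_batted_balls` and the dictionary keys are hypothetical, nLD is rounded to a whole number as in the Washburn example, and because only his outfield flies are quoted above, the remaining 260 non-line-drive balls in play (466 minus 206) are lumped into a single placeholder bucket.

```python
def rescale_batted_balls(bip, league_ld_rate, non_ld_counts):
    """Steps 1-3: swap in a league-average line drive total, rescale the rest.

    bip            -- balls in play the pitcher allowed
    league_ld_rate -- league-average line drives per ball in play
    non_ld_counts  -- dict of the pitcher's actual non-line-drive batted balls
    """
    n_ld = round(bip * league_ld_rate)            # step 1: "new" line drives
    n_bip = bip - n_ld                            # step 2: what is left over
    actual_non_ld = sum(non_ld_counts.values())
    scale = n_bip / actual_non_ld                 # step 3: keep the same mix
    rescaled = {kind: count * scale for kind, count in non_ld_counts.items()}
    return n_ld, n_bip, rescaled


# Washburn's figures as quoted in the steps above; "other_non_ld" is a
# placeholder for everything that is not an outfield fly or a line drive.
washburn_non_ld = {"outfield_fly": 206, "other_non_ld": 260}
n_ld, n_bip, rescaled = rescale_batted_balls(586, 0.199, washburn_non_ld)
print(n_ld, n_bip, round(rescaled["outfield_fly"], 1))
```

With these rounded inputs the outfield fly figure lands a shade under the 208 quoted in step three, presumably because the published number was carried through with unrounded intermediate values.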
That might have sounded more complicated than it is. Really, the process is very straightforward.
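As a rough sketch of steps four, five, and seven, the Python below turns adjusted batted balls into expected events and then into a per-nine-innings rate. The function names are hypothetical, only the two line drive rates quoted above are used (the rates for the other batted ball types are not given here), the BaseRuns step is left out entirely, and the 170.4 expected innings are assumed to be a plain decimal rather than innings-and-outs notation.

```python
def expected_events(batted_balls, rates):
    """Steps 4-5: expected count of one event across batted ball types.

    batted_balls -- dict of batted ball type -> adjusted count
    rates        -- dict of batted ball type -> league rate of that event
    """
    return sum(count * rates[kind] for kind, count in batted_balls.items())


def runs_per_nine(expected_runs, expected_innings):
    """Step 7: expected runs scaled to nine innings."""
    return 9 * expected_runs / expected_innings


# Only line drives are shown, because only their league rates are quoted above.
line_drives = {"line_drive": 117}
ld_singles = expected_events(line_drives, {"line_drive": 0.508})
ld_outs = expected_events(line_drives, {"line_drive": 0.293})
print(round(ld_singles), round(ld_outs))

# Washburn's 98 expected runs over 170.4 expected innings.
print(round(runs_per_nine(98, 170.4), 2))
```

The first line reproduces the 59 singles and 34 outs on line drives from the example. The rate from the rounded inputs comes out near 5.18, a hair off the 5.19 quoted in step seven, most likely because the published figure used unrounded runs and innings.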
So here’s the question that every reader is probably asking right now: Why? Why is this important? Why do we want to know DIPS 3.0? Why is this any better than regular old DIPS or FIP?
My answer: Because it includes more data. Generally speaking, the more data you have the better. That especially applies here, when there is so much variation in official pitching statistics. For example, we know that a pitcher has some control over his BABIP. We even know how, for the most part, he can control it: through the mix of ground balls, outfield flies, and infield flies he allows. But there was no statistic that reflected that knowledge prior to DIPS 3.0.
This system allows us to understand and evaluate a pitcher’s performance in more granular ways than ever before. It tells us when something fluky has happened, and gives us a better idea of what to expect the rest of the way or in the future. DIPS 3.0 gives us a better picture and understanding of pitcher performance, and that is why it’s important. DIPS 3.0 is the next logical step in defense-independent pitching analysis because it incorporates batted ball data and largely corrects for the noise contained within that data. Voros himself acknowledged the importance of developing such a system three years ago.
I believe I have realized it.
References & Resources
I am making available two spreadsheets with DIPS 3.0 data for 2004 and 2005. The numbers in these spreadsheets, except for plate appearances, walks, strike outs, and hit by pitches, are expected numbers based on the pitcher’s batted ball distributions, rather than actual numbers. I encourage people to play around with the data, and see what they can find.