A Quick Look at Four Hitting Rates
by Dave Studeman
March 08, 2007
When Voros McCracken invented DIPS (which stands for Defense-Independent Pitching Statistics) several years ago, he broke down pitching and batting performance into four simple ratios. DIPS has become fairly well known and quite controversial since then, but those ratios have been largely forgotten by the general baseball community; I'm not aware of any website that lists them on a regular basis.
So, as long as we're in the middle of that pregnant pause known as spring training, I thought I'd remind you what they are, and how powerful they can be. Here is each one, with a definition:
- Walk rate (walks divided by plate appearances; you can include HBP, too)
- Strikeout rate (strikeouts divided by at-bats; in other words, walks and HBP are left out of the denominator. You can add sacrifice flies back in if you want)
- Home run rate (Home runs divided by batted balls)
- BABIP (our old standard—the proportion of batted balls that fall in for a hit, not including home runs)
For pitchers, you often see strikeout and walk rates, but as a proportion of total plate appearances. Here, they are treated sequentially. The strikeout ratio implies that a pitcher can't strike out a batter if he walks him first, so the formula takes walks out of consideration when calculating strikeout rates.
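Here is a minimal Python sketch of the sequential treatment described above, in which each rate's denominator is whatever remains after the earlier outcomes are removed. The `BattingLine` class, its field names, and the sample numbers are hypothetical, made up for illustration. The sketch assumes intentional walks and sacrifice bunts have already been taken out of plate appearances, and that sacrifice flies are added back into the denominators.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for one hitter's season totals."""
    pa: int   # plate appearances, assumed to exclude IBB and sacrifice bunts
    bb: int   # unintentional walks
    hbp: int  # hit by pitch
    so: int   # strikeouts
    hr: int   # home runs
    h: int    # hits (including home runs)


def four_rates(line: BattingLine) -> dict:
    """Compute the four rates sequentially: walk, strikeout, home run, BABIP."""
    walks = line.bb + line.hbp
    not_walked = line.pa - walks      # at-bats plus sacrifice flies
    batted = not_walked - line.so     # batted balls
    in_play = batted - line.hr        # batted balls that stayed in the park
    return {
        "bb_rate": walks / line.pa,
        "k_rate": line.so / not_walked,
        "hr_rate": line.hr / batted,
        "babip": (line.h - line.hr) / in_play,
    }


# Made-up season, not a real player.
rates = four_rates(BattingLine(pa=600, bb=55, hbp=5, so=100, hr=22, h=150))
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that the strikeout rate here comes out higher than strikeouts per plate appearance would, because the walks have already been removed from the denominator.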
Jim Albert has researched these deceptively simple ratios. In SABR's By the Numbers (February 2006), he found that the four rates did a better job of predicting a team's runs scored than OPS or Runs Created. He also wrote another article for the Journal of Quantitative Analysis in Sports (try finding that at your local bookstore) that focused on pitcher strikeout rates and concluded that Dazzy Vance's 1924 was the greatest strikeout performance of all time. Our own Steve Treder could have told him that.
Anyway, there's something important mathematically here. Each one of these rates is a binomial probability, like flipping a coin. Either the event (walk, strikeout, home run or hit) happens or it doesn't. That allows you to do some interesting things with the ratios, such as estimating the approximate randomness of each one. As you might expect, Albert found that three of the four ratios were relatively predictable. The fourth, BABIP, was much more random.
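To get a feel for how much noise a binomial rate carries, here is a small sketch using the textbook binomial standard error, sqrt(p(1-p)/n). This is the standard formula, not necessarily the procedure Albert used, and the .300 BABIP over 400 balls in play is a made-up example.

```python
import math


def binomial_se(rate: float, trials: int) -> float:
    """Standard error of an observed rate under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / trials)


# Hypothetical hitter: a .300 BABIP over 400 balls in play.
se = binomial_se(0.300, 400)
low, high = 0.300 - 2 * se, 0.300 + 2 * se
print(f"SE = {se:.3f}; rough 95% range = {low:.3f} to {high:.3f}")
```

Under these assumptions, one standard error is about 23 points of BABIP, so a full season still leaves a wide band of plausible outcomes around a hitter's underlying rate.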
What's more, BABIP doesn't correlate highly with the other rates. There is a natural positive relationship between walks, strikeouts and home runs, but not batting average on balls in play. BABIP is the orphan ratio.
To invite a little more consideration of the four rates, I calculated the 2006 leaders and laggards for each one. To compute them, I first took intentional walks and sacrifice bunts out of total plate appearances. I included HBP in the walk rate calculation. My "sample" consisted of all batters with at least 400 plate appearances.
First, the walk rates:
| Best | BB% | Worst | BB% |
|---|---|---|---|
| Giambi J. | .201 | Olivo M. | .027 |
| Ensberg M. | .201 | Cedeno R. | .029 |
| Bonds B. | .191 | Estrada J. | .030 |
| Abreu B. | .178 | Uribe J. | .031 |
| Johnson N. | .177 | Betancourt | .031 |
| Burrell P. | .171 | Berroa A. | .032 |
| Thome J. | .169 | Cano R. | .034 |
| Hafner T. | .166 | Francoeur J | .038 |
| Dunn A. | .158 | Rodriguez I | .040 |
| Ramirez M. | .157 | Payton J. | .043 |

I admit that I had no idea Morgan Ensberg would rank so highly. I should have expected it; after all, he batted .235 last year but had an OBP of .396. The "worst" list consists of a bunch of middle infielders and catchers. And Jeff Francoeur.
Walk rates are fairly predictable from year to year. Last year's leader was also Jason Giambi.
Next, the strikeout rates (or, putting the positive spin on it: contact rates):
| Best | K% | Worst | K% |
|---|---|---|---|
| Pierre J. | .054 | Dunn A. | .344 |
| Polanco P. | .058 | Howard R. | .308 |
| Garciaparra | .063 | Hall B. | .299 |
| Lo Duca P. | .074 | Thome J. | .296 |
| Eckstein D. | .082 | Gomes J. | .294 |
| Catalanotto | .084 | Granderson | .289 |
| Walker T. | .085 | Shelton C. | .286 |
| Vizquel O. | .087 | Edmonds J. | .285 |
| Sanchez F. | .088 | Burrell P. | .281 |
| Hatteberg S | .089 | Bautista J. | .272 |

The usual suspects, but I have to sheepishly admit (again) that I didn't realize Bill Hall struck out quite that often last year. Turns out that he struck out 162 times! Of course, he also had a career year in most every other batting category. Most of the worst strikeout rates belong to some of the majors' best sluggers. In fact, Albert found that strikeouts have the smallest impact on run scoring of any of the four ratios.
The rate with the largest run impact is the home run rate. Here are the leaders and laggards:
| Best | HR% | Worst | HR% |
|---|---|---|---|
| Howard R. | .143 | Kendall J. | .002 |
| Hafner T. | .121 | Taveras W. | .002 |
| Ortiz D. | .121 | Punto N. | .003 |
| Thome J. | .120 | Gathright J | .003 |
| Dunn A. | .108 | Eckstein D. | .004 |
| Giambi J. | .107 | Roberts D. | .005 |
| Dye J. | .103 | Pierre J. | .005 |
| Berkman L. | .103 | Miles A. | .005 |
| Pujols A. | .100 | Clayton R. | .005 |
| Thomas F. | .100 | Ausmus B. | .005 |

Of course, there is a correlation between strikeout and home run rates, as Phil Birnbaum pointed out in an article. The more you strike out, the more likely you are to hit a home run. Isn't it sort of odd to see Jason Kendall's name right there next to Willy Taveras and Joey Gathright?
The last ratio, BABIP, has the second-largest impact on run scoring. But, as I noted earlier, it is the orphan of the group: random and unrelated to the other rates.
| Best | H% | Worst | H% |
|---|---|---|---|
| Jeter D. | .391 | Molina Y. | .226 |
| Cabrera M. | .379 | Uribe J. | .240 |
| Abreu B. | .366 | Barmes C. | .241 |
| Johnson R. | .366 | Gomes J. | .244 |
| Paulino R. | .365 | Giambi J. | .245 |
| Mauer J. | .364 | Thomas F. | .247 |
| Sanchez F. | .364 | Griffey Jr. | .248 |
| Cano R. | .359 | Bonds B. | .251 |
| Howard R. | .356 | Ensberg M. | .251 |
| Ethier A. | .354 | Biggio C. | .254 |

Morgan Ensberg again. A good bet for 2007 is that Ensberg's batting average will jump back up. On the other hand, check out Ryan Howard on the "Best" list. A home run hitter with a high BABIP is an awesome recipe for success. Unfortunately, I would guess that he won't maintain that BABIP this year.
Derek Jeter's .391 is tremendous, of course. He's always among the league leaders in BABIP, but don't bet on him hitting .391 again.
Finally, Albert outlined a formula to turn these rates into runs per game approximations. I thought it would be fun to apply his formula to our rates to see which players contributed the most and least runs per game. As you can see, the two lists are pretty predictable, closely following the leaders and laggards of Runs Created Per Game.
| Best | R/G | Worst | R/G |
|---|---|---|---|
| Howard R. | 9.2 | Barmes C. | 2.2 |
| Hafner T. | 8.9 | Berroa A. | 2.5 |
| Ramirez M. | 8.5 | Cedeno R. | 2.5 |
| Pujols A. | 8.4 | Molina Y. | 2.7 |
| Thome J. | 8.2 | Anderson B. | 2.8 |
| Berkman L. | 8.0 | Ausmus B. | 2.9 |
| Ortiz D. | 7.8 | Everett A. | 3.0 |
| Dye J. | 7.7 | Gathright J | 3.1 |
| Jones C. | 7.7 | Uribe J. | 3.2 |
| Cabrera M. | 7.4 | Clayton R. | 3.2 |

To the extent that BABIP drove each player's relative standing, you can expect his performance to change next year. For instance, expect improvement from Clint Barmes, Juan Uribe and Jonny Gomes, and some decline from Howard, Jeter and Cabrera.
References and Resources
Albert's formula for runs per game is -3.2 + 13.2*BB% - 12.3*K% + 40.9*HR% + 24.5*BABIP. Please note that these BABIP figures differ slightly from the BABIP figures posted in our Statistics section, because I included sacrifice flies in the denominator.
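For anyone who wants to play with the formula, here is a short Python sketch of it. The function name and the sample rates are hypothetical; the coefficients are the ones quoted above, and the .391 and .226 BABIP figures are the best and worst marks from the BABIP table.

```python
def runs_per_game(bb_rate: float, k_rate: float, hr_rate: float, babip: float) -> float:
    """Albert's runs-per-game approximation, using the coefficients quoted above."""
    return -3.2 + 13.2 * bb_rate - 12.3 * k_rate + 40.9 * hr_rate + 24.5 * babip


# Made-up hitter: 10% walks, 20% strikeouts, 4% home runs, .300 BABIP.
example = runs_per_game(0.100, 0.200, 0.040, 0.300)
print(f"Example hitter: {example:.2f} runs per game")

# Holding the other three rates fixed, the gap between a .391 and a .226 BABIP
# is worth 24.5 * (0.391 - 0.226) runs per game.
babip_gap = 24.5 * (0.391 - 0.226)
print(f"BABIP gap alone: {babip_gap:.2f} runs per game")
```

The second calculation shows why a swing in BABIP can move a player a long way up or down the runs per game list even when his other three rates stay put.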
Dave was called a "national treasure" by Rob Neyer. Seriously. Comments about this article can be sent to him through the miracle of e-mail.