The Hardball Times Fantasy

The great strikeout debate (Part II)

by Paul Singman
June 17, 2009

About a month ago I introduced the idea that the common measure of strikeout ability for pitchers, strikeouts per nine innings (K/9), is flawed and suggested a better measure which I named True K percentage. True K percentage is different from K/9 in that its baseline is at-bats instead of outs. It is also different from strikeout percentage (K%) in that walks are filtered out of the equation because I believe control and strikeout ability are two unrelated skills for the most part.

To get a better understanding of why some of these decisions were made the way they were, I encourage you to read over the first "Strikeout debate" article and the accompanying comments.

As another refresher, here are the exact formulas I am using:

K/9 = (K * 9) / IP

K% = (K / TBF) * 100

True K% = (K / (K + BIP)) * 100
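
For readers who want to run the numbers themselves, here is a minimal Python sketch of the three formulas. It reads the True K% denominator as strikeouts plus balls in play (K + BIP), takes TBF to mean total batters faced, and expects innings as a true decimal rather than box-score notation. The function names and the sample stat line are made up for illustration and do not belong to any real pitcher.

```python
def k_per_9(k, ip):
    """Strikeouts per nine innings; ip is true innings (54.333..., not '54.1')."""
    return k * 9 / ip


def k_pct(k, tbf):
    """Strikeouts as a percentage of total batters faced."""
    return k / tbf * 100


def true_k_pct(k, bip):
    """Strikeouts as a percentage of strikeouts plus balls in play."""
    return k / (k + bip) * 100


# Hypothetical stat line, invented purely for illustration.
k, ip, tbf, bip = 60, 54.0, 230, 150

print(f"K/9     = {k_per_9(k, ip):.2f}")
print(f"K%      = {k_pct(k, tbf):.2f}%")
print(f"True K% = {true_k_pct(k, bip):.2f}%")
```

In this made-up line, the 20 batters faced who neither struck out nor put the ball in play stand in for walks and similar outcomes, which is why K% comes out lower than True K%.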

Now that we have discussed the pros and cons of all three strikeout measures (K/9, K%, and True K%) in a theoretical sense, let's roll out the numbers for each pitcher and see who they disagree on. The following chart shows the top 25 starters for each measure in 2009, with a minimum of five games started (165 starting pitchers qualify).

           K/9                            K%                          True K%
   1 Rich Harden         11.23    Javier Vazquez     31.16%    Rich Harden        34.64%
   2 Javier Vazquez      11.21    Justin Verlander   30.37%    Justin Verlander   33.76%
   3 Justin Verlander    11.05    Rich Harden        28.97%    Javier Vazquez     33.65%
   4 Jon Lester          10.62    Tim Lincecum       28.14%    Jon Lester         31.58%
   5 Tim Lincecum        10.53    Jon Lester         27.83%    Johan Santana      30.95%
   6 Johan Santana       10.37    Johan Santana      27.58%    Tim Lincecum       30.93%
   7 Jake Peavy          10.14    Jake Peavy         27.46%    Jake Peavy         30.77%
   8 Jorge de la Rosa     9.62    Zack Greinke       26.43%    Chad Billingsley   29.61%
   9 Jordan Zimmermann    9.47    Dan Haren          25.50%    Jorge de la Rosa   28.57%
  10 Chad Billingsley     9.46    Chad Billingsley   25.50%    Jordan Zimmermann  28.04%
  11 Daisuke Matsuzaka    9.29    Jordan Zimmermann  24.90%    Zack Greinke       27.87%
  12 Max Scherzer         9.27    Jorge de la Rosa   24.49%    Max Scherzer       27.82%
  13 Zack Greinke         9.25    Erik Bedard        23.99%    Yovani Gallardo    27.80%
  14 David Purcey         9.12    Max Scherzer       23.96%    Clayton Kershaw    27.56%
  15 Josh Beckett         8.96    Yovani Gallardo    23.91%    Dan Haren          27.44%
  16 Erik Bedard          8.91    Josh Beckett       23.24%    David Purcey       27.08%
  17 Jonathan Sanchez     8.89    Felix Hernandez    23.20%    Erik Bedard        27.08%
  18 Yovani Gallardo      8.88    Clayton Kershaw    22.79%    Edinson Volquez    26.86%
  19 Felix Hernandez      8.86    Wandy Rodriguez    22.73%    Josh Beckett       26.48%
  20 Clayton Kershaw      8.72    Roy Halladay       21.78%    Jonathan Sanchez   26.42%
  21 Dan Haren            8.62    David Purcey       21.67%    Joba Chamberlain   25.66%
  22 Edinson Volquez      8.52    Edinson Volquez    21.56%    Felix Hernandez    25.61%
  23 Wandy Rodriguez      8.47    Randy Johnson      21.48%    Wandy Rodriguez    25.51%
  24 Oliver Perez         8.31    Josh Johnson       21.33%    Randy Johnson      24.71%
  25 Joba Chamberlain     8.24    Jered Weaver       21.23%    A.J. Burnett       24.54%

As you can tell by looking across the rows and finding different pitchers, the three measures disagree significantly on a lot of them. Even the identity of the best strikeout pitcher is in dispute: K/9 and True K% say it is Rich Harden, while K% likes Javier Vazquez.

Since we understand the formulas behind the three, we know why some pitchers are ranked higher in some than in others. A pitcher like Dan Haren will be ranked more highly by K% since he walks very few batters. And Oliver Perez has the greatest difference between his K% and True K% because of his 8.72 BB/9 rate.

But what type of pitcher is ranked most differently by K/9 and True K%? The type is harder to define, so let's look at the pitchers with the biggest gaps.

The five pitchers with the greatest difference between their K/9 and True K% ranks who are ranked higher by True K% are:

Mark Buehrle
David Bush
Chris Carpenter
Johnny Cueto
Kyle Davies

The five pitchers with the greatest difference between their K/9 and True K% ranks who are ranked lower by True K% are:

Chien-Ming Wang
Ricky Nolasco
Oliver Perez
Dana Eveland
Kevin Slowey

I was not exactly sure of the relationship between these pitchers until I had finished the list of pitchers who are ranked lower and realized that all of them, with the exception of Slowey, had gotten off to terrible starts. Then I began thinking about what caused their poor performance and realized BABIP had a lot to do with it. Checking their BABIPs, I found that even Slowey has an unlucky BABIP of .351 and that the group as a whole has an average mark of .406.

Then it was easy to realize the first group, those ranked higher, must have relatively low BABIPs. I was right; their collective average BABIP is .249, led by Carpenter's .210 mark.

If you think about it, it makes sense that the difference is BABIP-dependent: for True K% you are dividing by all balls in play, while for K/9 you are dividing by innings, which count only those balls in play that go for outs. The difference between all balls in play and balls in play that become outs is balls in play that do not become outs, or what you would otherwise call hits. And hits are the driving force behind a high or low BABIP.
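
To make the BABIP dependence concrete, here is a rough Python sketch that holds strikeouts and balls in play fixed and changes only BABIP. It assumes a simplified model in which every out is either a strikeout or a ball in play that is not a hit, so double plays, baserunning outs and errors are ignored. The function names and the 50-strikeout, 150-balls-in-play line are hypothetical.

```python
def k_per_9_from_babip(k, bip, babip):
    """K/9 under a simplified model: outs = strikeouts + balls in play that are not hits."""
    hits_in_play = bip * babip
    outs = k + (bip - hits_in_play)
    innings = outs / 3
    return k * 9 / innings


def true_k_pct(k, bip):
    """True K% depends only on strikeouts and balls in play, not on BABIP."""
    return k / (k + bip) * 100


# Hypothetical pitcher: 50 strikeouts and 150 balls in play in every scenario.
k, bip = 50, 150
for babip in (0.240, 0.300, 0.400):
    k9 = k_per_9_from_babip(k, bip, babip)
    print(f"BABIP {babip:.3f}: K/9 = {k9:.2f}, True K% = {true_k_pct(k, bip):.2f}%")
```

Under these assumptions, K/9 climbs from roughly 8.2 at a .240 BABIP to roughly 9.6 at a .400 BABIP, while True K% stays at 25 percent throughout. That matches the pattern in the two lists: a high BABIP inflates a pitcher's K/9 rank relative to his True K% rank.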

My next thought was that the pitchers whose rank is about the same by both True K% and K/9 would have BABIPs around the league-average .300. That also turned out to be true, as the 11 pitchers with no change in rank averaged a BABIP of .307.
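
The rank comparison behind these groupings can be sketched in a few lines of Python. This is only one plausible way to do it, not necessarily how the numbers in this article were produced. The four pitchers "A" through "D" and their values are placeholders, and the helper name `rank_by` is invented for the example.

```python
def rank_by(values):
    """Map each name to its 1-based rank, highest value first."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Placeholder pitchers and values, invented purely for illustration.
k9 = {"A": 9.6, "B": 8.7, "C": 8.2, "D": 7.0}
true_k = {"A": 25.0, "B": 26.0, "C": 27.0, "D": 20.0}

k9_rank = rank_by(k9)
true_k_rank = rank_by(true_k)

# Positive gap: ranked higher by True K% than by K/9. Negative gap: ranked lower.
gap = {name: k9_rank[name] - true_k_rank[name] for name in k9}

for name in sorted(gap, key=gap.get, reverse=True):
    print(f"{name}: K/9 rank {k9_rank[name]}, True K% rank {true_k_rank[name]}, gap {gap[name]:+d}")
```

Sorting by the gap puts the pitchers helped most by True K% at the top and those hurt most at the bottom, with the zero-gap pitchers in between.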

This best shows how True K% is superior to K/9: True K% is not wrongly affected by BABIP, which, as far as I am aware, should not have any effect on a pitcher's strikeout rate.

You should not think of True K% as an attempt to predict K/9; you should use True K% to replace K/9 completely. After seeing the numbers for the first time, I began wondering how many times I must have cited a pitcher's decreased K/9 rate as the reason for his problems when really it was poor BABIP luck showing up again in his strikeout rate.

The main practical use of True K% is that you can identify pitchers whose skills are misperceived, making them good trade targets. In the next article, I will post a spreadsheet of the True K% numbers for all pitchers over the last few seasons and point out some specific pitchers whose strikeout ability may be misjudged because of a difference between their K/9 and True K% numbers. Now I will turn the floor over to any of your thoughts...

Thank you to Fangraphs for data and Derek for discussing the True K formula with me.

Paul has been managing fantasy baseball teams for many seasons and writing for THT Fantasy over the past three years. He is currently a student at UPenn and welcomes readers' thoughts at his email here or in the comments below.

