The great strikeout debate

by Paul Singman
May 12, 2009

Toward the end of last season, at the beginning of September, I wrote an article (at my former site) advocating the use of strikeout percentage (K%) over the more commonly used strikeouts per nine innings (K/9).

Derek and I continued to have a good discussion on the topic following the article, which kept me thinking about the issue. Is K% an improvement over K/9? Is there another stat, not yet created, that would better show a pitcher’s ability to get strikeouts?

I did not reach a definite answer to these questions, but after playing devil’s advocate in my mind a few times (as I will in this article, by the way), I feel I have at least made progress. The answer lies in understanding K% and K/9: their similarities, differences, and flaws.

The biggest difference between K% and K/9 is their baseline. K% is strikeouts per batter faced, while K/9 is strikeouts per inning, which is essentially per out.

A baseline of per out is good because every inning, a pitcher must get three outs. How many hits or walks he allows in that time serves only to inflate the number of batters he faces. He must, however, face three batters that get out. Must. K/9 isolates this, ignoring hits and walks, and shows us how many batters he gets out via strikeouts, holding everything else constant (more or less).

A baseline of per batter faced can also be argued as good because it shows, quite clearly, how often a pitcher can strike a batter out and how often he cannot. It does not matter what the non-strikeout outcome was (walk, hit, or ball-in-play out): if the pitcher could not strike the batter out, he is not as good as someone who could.

Proponents of K/9 could argue that including walks in the K/9 equation is detrimental because control is a different skill that should not be taken into account when trying to determine a pitcher’s strikeout ability. Proponents of K% could counter that walks should be included because they represent a batter that the pitcher could not strike out.

Both stats do have a major flaw: their dependency on batting average on balls in play (BABIP). Consider the following two innings of work:

Pitcher A
Ground out
Ground out
Strikeout

Pitcher B
Ground out
Ground ball (hit)
Walk
Strikeout
Fly out

Here Pitcher A would have a K/9 of 9.00, as would Pitcher B. Pitcher A’s K%, however, is 33.33 percent while Pitcher B’s is 20 percent.
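The arithmetic behind those figures can be checked with a short sketch. The event labels and function names here are illustrative assumptions, not any stat provider's conventions.

```python
# Hypothetical shorthand for the two innings listed above.
PITCHER_A = ["ground_out", "ground_out", "strikeout"]
PITCHER_B = ["ground_out", "ground_ball_hit", "walk", "strikeout", "fly_out"]

# Events that record an out in this example.
OUT_EVENTS = {"ground_out", "fly_out", "strikeout"}


def k_per_9(events):
    """Strikeouts per nine innings; nine innings contain 27 outs."""
    strikeouts = events.count("strikeout")
    outs = sum(1 for e in events if e in OUT_EVENTS)
    return 27 * strikeouts / outs


def k_pct(events):
    """Strikeouts per batter faced, as a fraction."""
    return events.count("strikeout") / len(events)


for name, events in [("A", PITCHER_A), ("B", PITCHER_B)]:
    print(f"Pitcher {name}: K/9 = {k_per_9(events):.2f}, K% = {k_pct(events):.2%}")
```

Both pitchers record three outs with one strikeout, so K/9 cannot tell them apart, while K% divides by three batters for Pitcher A and five for Pitcher B.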

This certainly leads one to believe that K% wrongly takes ball-in-play outcomes into account and that K/9 is better because it does not. The same argument can be turned around to show K/9’s dependency on BABIP too, though. Notice how in Pitcher B’s inning of work, one ground ball went for an out and another went for a hit.

Oddly, even though both are ground balls, the outcome of the ground ball (hit or out) determines whether that batter affects the pitcher’s K/9 rate. When the ground ball goes for a hit, the K/9 remains unchanged. But when the ground ball is converted into an out, the K/9 rate will go down because an out was made that was not a strikeout. That does not seem right.
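A minimal sketch of that asymmetry, using made-up totals (one strikeout among three outs) and a hypothetical helper name:

```python
def k_per_9(strikeouts, outs):
    """Strikeouts per nine innings, given strikeouts and total outs recorded."""
    return 27 * strikeouts / outs


# Made-up starting point: one strikeout among three outs recorded.
strikeouts, outs = 1, 3

before = k_per_9(strikeouts, outs)
after_hit = k_per_9(strikeouts, outs)      # ground ball goes for a hit: no out added
after_out = k_per_9(strikeouts, outs + 1)  # ground ball converted into an out

print(f"before: {before:.2f}, after hit: {after_hit:.2f}, after out: {after_out:.2f}")
# before: 9.00, after hit: 9.00, after out: 6.75
```

The same batted ball either leaves K/9 alone or lowers it, depending only on whether the defense converts it.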

Taking a step back, it seems we have done a good job of pointing out the strengths and weaknesses of both stats. With the flaws both have, I think it is possible to create a new, better stat. To do this, I will take what I consider the best of both K/9 and K%.

The per-out baseline of K/9 is illogical because it only counts balls in play when they go for outs, so I prefer the batters-faced baseline of K%. I do like the way K/9 ignores walks, which should be kept separate from the ability to strike batters out.

From these two preferences arises a new stat, K/(K + BIP), which I will call True K, or TK, for now.
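As a rough sketch, the formula can be applied to the two example innings from earlier. The function name is hypothetical, and the article does not spell out edge cases such as whether home runs or hit batters count toward BIP, so the sketch simply takes BIP as an input.

```python
def true_k(strikeouts, balls_in_play):
    """True K as defined in the text: K / (K + BIP). Walks are left out entirely."""
    return strikeouts / (strikeouts + balls_in_play)


# Pitcher A: one strikeout, two balls in play (two ground outs).
tk_a = true_k(strikeouts=1, balls_in_play=2)

# Pitcher B: one strikeout, three balls in play (ground out, ground-ball hit,
# fly out). The walk does not appear in the formula.
tk_b = true_k(strikeouts=1, balls_in_play=3)

print(f"Pitcher A TK = {tk_a:.2%}, Pitcher B TK = {tk_b:.2%}")
```

Under this definition every ball in play counts once, whether it becomes a hit or an out, and the walk Pitcher B issued has no effect.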

Do I think True K is perfect? No. But I do believe it is better at showing who the best strikeout pitchers are. Agree? Disagree? Let me know your thoughts in the comments below.

10 Comments

Ian

14 years ago

I think K/9 is more effective for fantasy use because it relates directly to the innings limit of the league.

For predictive use, there’s an argument to be made both ways, but I still prefer keeping everything in terms of outs/IP.

Justin

14 years ago

We’re trying to measure skills, yes? If so, then K% (and BB%) are clearly better measures than K/9 and BB/9. This is especially important early in the season when BABIPs are more likely to be skewed by good or bad luck. So pitchers with high BABIPs are going to face more batters and as a result have inflated K/9 rates. This can sometimes cause analysts who use K/9 to misinterpret their skill sets and therefore their expected performance going forward. I do like the idea of isolating Ks from BBs – measuring how often a batter was able to put the ball in play when he was forced to. But between K% and K/9 I see no comparison. (Though THT’s measure of K/G is really nice since it normalizes to figures we’re used to seeing in K/9.)

For walks, I’d like to see it become common to remove intentional passes from the measure: (BB-IBB)/BF. This is important for a guy like Nolasco, whose BB% is dramatically inflated this year as 30% of his walks have been intentional thus far.
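A small sketch of the (BB-IBB)/BF adjustment described in this comment. The function name and the totals are made up for illustration; they are not any pitcher's actual numbers.

```python
def walk_rates(bb, ibb, bf):
    """Return (raw BB%, unintentional BB%) as fractions of batters faced."""
    raw = bb / bf
    unintentional = (bb - ibb) / bf
    return raw, unintentional


# Made-up totals: 10 walks, 3 of them intentional, 200 batters faced.
raw, unintentional = walk_rates(bb=10, ibb=3, bf=200)
print(f"BB% = {raw:.1%}, unintentional BB% = {unintentional:.1%}")
```

With 30% of the walks intentional, the adjusted rate comes out 30% lower than the raw one.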

Matt Swartz

14 years ago

I really think K% is much better than K/9. Excluding walks does not make sense, if you think about it. Last year 48% of unintentional walks came with two strikes on a hitter, and I would bet that a large portion of those came with foul balls. The pitcher clearly failed to generate a swing and miss, which is what good strikeout pitchers do. Excluding IBB (as someone mentioned above) is a very good point, but a walk represents a pitcher’s inability to strike a guy out. Throwing pitches on the black and throwing pitches in locations such that they look hittable are both skills.

John L.

14 years ago

What about swinging strike percentage? It is a much larger and more stable sample, but has obvious bias against pitchers who get lots of looking strikes (Mike Mussina). But I think it’s particularly useful in the case of young pitchers with limited major league samples. Dallas Braden for example. He’s posted a K/9 of over 10 in the minors but a pedestrian 6.0 in the majors. Even though it’s a limited sample I already know by his swinging strike % that he’s not going to strike out anything resembling his minor league numbers because he’s getting 5% less swinging strikes (12.7 and 12.8% in 07-08 in the minors, and 6.7% and 8.2% in the majors). I can also be more confident in a guy like Scott Richmond even though I only have a month’s worth of sample size. At AAA last year he got 11.6% swinging strikes compared to 9.9% in the majors and 11.5% so far this year. So it’s fair to assume his minor league K rate will transfer if not improve.

It’s also a great way to get a leg up on guys you suspect are injured or are coming off injury. If John Smoltz comes back and isn’t getting swinging strikes in his first few starts I still might be able to sell high. Conversely if he struggles but is getting swinging strikes at his previous rate he’s a great buy low option.

Matt Swartz

14 years ago

That’s interesting. Rather than what percent of strikes are swinging, maybe it’s better to say what percent of swings are misses? So it’s (swinging strikes)/(swinging strikes + swings with contact), rather than (swinging strikes)/(swinging strikes + called strikes).
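The two ratios in this comment can be sketched side by side. The function names and the counts are hypothetical, chosen only to show that the same swinging-strike total gives different rates under the two denominators.

```python
def whiffs_per_swing(swinging_strikes, swings_with_contact):
    """Share of swings that miss: SwStr / (SwStr + swings with contact)."""
    return swinging_strikes / (swinging_strikes + swings_with_contact)


def swinging_share_of_strikes(swinging_strikes, called_strikes):
    """Share of swinging-plus-called strikes that are swinging."""
    return swinging_strikes / (swinging_strikes + called_strikes)


# Made-up counts for one outing.
swinging, contact, called = 10, 40, 15

per_swing = whiffs_per_swing(swinging, contact)
per_strike = swinging_share_of_strikes(swinging, called)
print(f"misses per swing = {per_swing:.1%}, swinging share of strikes = {per_strike:.1%}")
```

The first ratio ignores called strikes altogether, so a pitcher who freezes hitters is not penalized by it the way he is by the second.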

The only thing is that there are probably pitchers who are good at anticipating what the batter is expecting and getting swinging strikes at the right time. If you know a hitter is likely to foul off a changeup on an 0-1 count, you throw it. If you know a hitter is likely to foul off a changeup on an 0-2 count, what’s the point? Quickly running a couple of correlations on a dataset I have…from 2005 to 2008, K% has a .72 correlation with previous year’s K% and only a -.65 correlation with previous year’s contacted/swing rate. I think there is a skill to knowing how to strike a batter out, given what your stuff is.

John L.

14 years ago

Right. I’m referring to statcorner.com’s SwStr% which is purely the percentage of pitches swung at and missed (not counting foul balls). I’d guess that it favors pitchers who throw changeups and sliders as strikeout pitches, and is considerably biased against pitchers with great curveballs who freeze hitters. So using it by itself as a strict measure for strikeouts isn’t really plausible.

But like I said, when you only have small sample sizes available that are prone to luck (Which is the case this early in the season), SwStr% might be a good place to look to see if a K rate is sustainable. If a pitcher is veering considerably one way or another it could be the sign of a new skill set, or the decline of one (or possibly injury).

Paul Singman

14 years ago

Great discussion, guys. I like the idea of using SwStr% to predict K rate, although that is jumping past the question of how we should measure K rate in the first place.

I do think there is progress to be made in predicting what a pitcher’s true strikeout ability is, so I’ll continue to think about the issue…SwStr% is a good place to start.

david h

14 years ago

I’m not sure I’d prefer to exclude walks. K’s often result from deceptive pitches getting batters to chase stuff out of the zone. Keeping walks in the equation allows sensitivity for the skill of getting batters to swing at the crap you throw out of the zone rather than recognize it and not swing. Also, K’s are related to pitch speed, and keeping walks in the equation incorporates the ability of pitchers to get something extra on their pitches without sacrificing control.

Derek Carty

14 years ago

Hey guys,
I haven’t been able to give this as much thought as I’d really like to yet, but here are my thoughts so far.

I believe that strikeouts and walks should be treated differently. However we want to measure either one, I don’t think the other should be included in the formula. The correlation between the two is rather low (to put a quick, rough number to this, qualified pitchers in 2008 had just an 0.02 R-squared between K/9 and BB/9—this means that K/9 explains just 2% of the variance in BB/9). As such, it’s my belief that the two are – in large part – separate skills. Sure, there are some situations that arise like Matt Swartz brought up, but by-and-large, I think the two need to be treated separately.

Also, for those who remember Pizza Cutter’s article on plate discipline (http://www.philbirnbaum.com/btn2007-02.pdf) that I referenced when I introduced my own plate discipline stats, Pizza said that “the small value of the correlation, however, suggests that a player’s likelihood of walking and striking out are largely unrelated. The evidence suggests that walk and strike out are two very different concepts.” In other words, the two are not opposites – they are two distinct skills and cannot be lumped together in the same basket. He goes on to say “If anything, it appears that walk and strikeout are more similar than different and that their opposite is actually putting the ball into play.” This was said about hitters, but since the correlation is also low for pitchers, I think the logic applies here as well.

Again, I need to think some more about this, but to me, I think something like K/(K+BIP) is better than K/(K+BB+BIP).

Just from a purely logical standpoint, a pitcher can improve his control without sacrificing strikeout ability. It’s not incredibly easy, but there are plenty of examples of it. Check out Tim Lincecum 2007-2008. If a pitcher starts walking fewer batters while striking out the same, his K% will increase despite his strikeout ability remaining exactly the same. I suppose you could argue that it has changed because he’s faced fewer batters, but I don’t buy it. I believe the two are separate skills and should be treated as such.
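The point about walks moving K% can be illustrated with a rough sketch. The season totals are invented for the example and are not Lincecum's actual numbers; the function names are likewise hypothetical.

```python
def k_pct(k, bb, bip):
    """K% written as K / (K + BB + BIP)."""
    return k / (k + bb + bip)


def k_per_k_plus_bip(k, bip):
    """The walk-free alternative: K / (K + BIP)."""
    return k / (k + bip)


# Invented totals: strikeouts and balls in play held fixed, walks cut from 100 to 60.
k, bip = 200, 500
for bb in (100, 60):
    print(f"BB = {bb:3d}: K% = {k_pct(k, bb, bip):.1%}, "
          f"K/(K+BIP) = {k_per_k_plus_bip(k, bip):.1%}")
```

K% rises when the walks come down even though the strikeout and ball-in-play totals are identical, while K/(K+BIP) does not move.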

Oh, and I absolutely agree with Justin on removing IBB from the BB equation. This is a very simple thing to do and something I’ve been meaning to do for a long time.

david

14 years ago

Show us a chart of “True K” for some pitchers this year.
