Hit Rate Observer

So you want a starting pitcher who’ll get you a win. Which do you choose?

“A good one.” Yes—but pitchers who are good at getting wins might be less good at other things…

Let’s start here: Suppose you have a 4.50 ERA pitcher. How many wins can he be expected to get? I am sure we could run a calculation based on overall runs allowed and runs scored and get an approximate total for the season. However, since wins are accumulated by game, it may be fruitful to stick to that level. (For the discussion that follows, I will use ERA as our guide, though certainly RA would be a touch better.)

Now, a “W” is a matter of bookkeeping; it goes to the pitcher who was in the game when a team took its final lead. As such, the critical component for a starter to get the “W” is how long he lasts. Obviously, skill plays a role in that. And events don’t always correlate perfectly—in a blow-out win, a thriving pitcher might be pulled early to rest his arm, while in a losing but low-scoring affair, a manager may ride his ace. In general, though, the deeper into a game that a starter pitches, the better his chances for the Win.

So our scheme is two-step:

1. How deep into a game is our 4.50 ERA pitcher expected to go?
2. What is the win likelihood for a pitcher who goes that distance?

We can get an idea of the answer to the first question by looking at the aggregate ERAs of starters who pitched games of different lengths. We collected data from 2006-2009. The answer turns out to be neat:

Outs = 27 – (2*ERA)

That is, the expected number of outs is equal to 27 minus twice the starter’s ERA. That’s a close re-statement of the trend line from this graph:


(Note: In this graph, the dependent variable is actually Outs Lasted; I’ve rotated the graph for our purposes.) By this rule of thumb, a 6.00 ERA starter would give us 15 outs (5 innings), a 0.00 ERA starter would go nine innings, and our 4.50-ERA guy can be expected to last six innings.
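
The rule of thumb is simple enough to put in a few lines of code. This is a minimal sketch that assumes nothing beyond the approximation Outs = 27 – (2*ERA). The function name `expected_outs` and the clamping to a nine-inning range are our own additions for illustration.

```python
def expected_outs(era: float) -> float:
    """Rule-of-thumb outs for a starter: Outs = 27 - 2 * ERA."""
    outs = 27 - 2 * era
    # Assumption: the rule is only meant for regulation starts,
    # so keep the answer inside 0..27 outs.
    return max(0.0, min(27.0, outs))


for era in (0.00, 4.50, 6.00):
    outs = expected_outs(era)
    print(f"ERA {era:.2f}: {outs:.0f} outs ({outs / 3:.1f} innings)")
```

The three ERAs in the loop are the ones discussed above, so the printed innings should match the text: nine, six, and five.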

That answers the first part of our question. Now: All things equal, what’s the win likelihood for a six-inning start? Here are the numbers from 2006-09:


Win likelihood goes from roughly 1-in-4 in a five-inning start to greater than 9-in-10 (but well less than 100%) for a nine-inning start. (We removed the data points for 25 and 26 outs because there are so few examples.)

The graph is actually slightly better fit by an exponential curve than by a straight line:


A non-linear curve is reasonable: Holding a lead deeper into a game not only deprives the opponent of more chances to recover but also lets the starter hand off the game to his better relievers.

If we read a six-inning start from the trend line, we find that our 4.50 ERA starter has a 35.5% chance of getting the win. (This is not considering the efficiency of the two offenses or of the bullpen.)

That’s informative, but the exponential curve of our second graph has a deeper implication, which is that you really want a starter who can go deep into games, even if he also sometimes flames out. Consider two starts of a combined 14 innings: A starter who goes seven innings in each start has an expected total in the two games of 0.98 Wins, whereas a starter who goes five innings in one start but nine innings in the other has an expected total of 1.18 Wins—a fifth of a Win more. In fantasy, erratic genius pays.
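
Here is a rough sketch of why the curve's shape produces that gap. It does not use the fitted trend line from the graph. Instead it assumes a pure exponential drawn through two figures quoted above: 35.5% for a six-inning start, and 49% for a seven-inning start (half of the 0.98 total). The name `win_prob` and the two-point curve are stand-ins, so the totals should land near, not exactly on, the numbers in the text.

```python
import math

# Two anchor points taken from the text: (outs, win likelihood).
# Assumption: a pure exponential through these two points stands in
# for the actual fitted trend line, which is not reproduced here.
OUTS_A, P_A = 18, 0.355   # six-inning start
OUTS_B, P_B = 21, 0.49    # seven-inning start (0.98 / 2)

# Growth rate per out implied by the two anchors.
RATE = math.log(P_B / P_A) / (OUTS_B - OUTS_A)


def win_prob(outs: int) -> float:
    """Approximate win likelihood for a start lasting `outs` outs."""
    return min(1.0, P_A * math.exp(RATE * (outs - OUTS_A)))


steady = win_prob(21) + win_prob(21)    # 7 IP + 7 IP
erratic = win_prob(15) + win_prob(27)   # 5 IP + 9 IP

print(f"steady:  {steady:.2f} expected wins")
print(f"erratic: {erratic:.2f} expected wins")
```

Under this stand-in curve the erratic pair still comes out ahead by about a fifth of a win. Any curve that bends upward will show the same effect: the gain from the long start outweighs the loss from the short one.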

Does this genius have an identifiable quality? What skill or skills are good for the long haul? To find out, we calculated the aggregate K/9 and BB/9 of all starters who reached X or more outs in a game:


The above is not a typical line graph; instead, the line here tracks the shifting skills exhibited in starts of increasing length. The path consists of 28 points—from 0 outs (shown by the orange dot) to 27 outs (shown by the green dot). Each point gives the strikeout and walk rates posted by starters who recorded that many outs or more. (The blue dot marks 21 outs—the end of the seventh inning.)
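
For readers who want to build a path like this from their own game logs, here is a minimal sketch of the "X or more outs" aggregation. It assumes each start contributes its whole-game strikeout, walk, and out totals to every threshold it reaches. The `Start` record and the three sample starts are invented for illustration; they are not our 2006-09 data.

```python
from dataclasses import dataclass


@dataclass
class Start:
    outs: int        # outs recorded by the starter
    strikeouts: int
    walks: int


def rates_at_or_beyond(starts, min_outs):
    """Aggregate (K/9, BB/9) over starts that reached `min_outs` or more."""
    pool = [s for s in starts if s.outs >= min_outs]
    total_outs = sum(s.outs for s in pool)
    if total_outs == 0:
        return None  # no outs recorded, so the rates are undefined
    # 27 outs make nine innings, so rate per nine = 27 * events / outs.
    k9 = 27 * sum(s.strikeouts for s in pool) / total_outs
    bb9 = 27 * sum(s.walks for s in pool) / total_outs
    return k9, bb9


# Hypothetical starts, made up for this example.
sample = [Start(15, 4, 3), Start(21, 5, 2), Start(27, 6, 1)]

for threshold in (0, 21, 27):
    print(threshold, rates_at_or_beyond(sample, threshold))
```

Running the function for every threshold from 0 to 27 and plotting the resulting pairs in order would trace a path of the kind described above.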

The graph has two sets of arcs. The main arc sweeps from the upper right to the lower left, from the first out to the last. There are a number of fascinating aspects. For one thing, there’s little distinction in strikeout rate from basically the start of the game to the seventh inning; at each step, we’re looking at 6.3-6.4 K/9.

In starts that last more than seven innings, strikeout rate does take a turn, but lower, not higher. Starts that go into the eighth inning are characterized by an overall strikeout rate of just 6.2 K/9. And the rate in nine-inning starts is well below 6.0 K/9.

A higher strikeout rate is no help to going deeper into games. In fact, to reach a start longer than seven innings, it’s a downright hindrance. One reason is certain: Strikeouts cost pitches, generally more than the number required to post an equivalent number of non-K outs. We also wonder if, even apart from the ballooning pitch count, power pitchers lose control earlier in the game than do finesse pitchers.

What is important—far more important—for pitching deep into a game is a low walk rate. We can show this more clearly in a second graph, this one a straightforward plot of K/9 and K/BB versus Outs Reached:


In nine-inning starts, K/BB nearly eclipses K/9!

Now, there is some backwardness here: The data show that long starts are characterized by high K/BB, not that pitchers with high K/BB are fated to pitch long. Still, I think the point holds.

Fantasy leaguers love strikeouts. And it’s a fair point that a high strikeout rate can contribute to a high K/BB. However, these graphs say that a low walk rate deserves allegiance on its own. A walk rate under 2.0 BB/9 is a sign of not only control but also good health and, within a game, an absence of fatigue. And it’s the pitcher’s walk rate, not his strikeout rate, that determines whether he will stay in a game in which he is pitching well.

If you’re scouting for wins, you should leap at the chance to roster a starter with a 2.5 K/BB, even if he has a sub-6.0 K/9. What he costs in Ks, he could recoup in Ws.

(What about the second set of arcs? Those are the epicycles, the curious switchbacks in which strikeout rate creeps up and then suddenly drops. The drops mark the start of new innings. The decision to send a starter who has completed N innings back into the (N+1)th inning relies on factors other than his arm—whether the game is close, whether the bullpen is fresh. However, once a starter has been put in for another inning, then whether he survives to the end of that inning does depend, marginally, on his K rate.)



  1. Derek Carty said...

    Very nice work, John, as usual.  I found the finding about long starts being characterized by lower-than-normal strikeout rates particularly interesting.  David Gassko wrote an article here (http://www.hardballtimes.com/main/article/the-kazmir-conundrum/) that basically found that strikeout rate has no bearing on the pitch count.

    Perhaps it’s that pitchers who go deep into games exhibit their usual K skills in the early and middle innings but simply tire by the eighth and ninth.  Do you happen to have data on the pitch counts for complete games?

    Or maybe they change their approach by pitching to contact more in the later innings, lowering their Ks, lowering their BBs, increasing their BIP, and relying on some luck to let them finish the game.  If that is the case, we might be seeing a little selection bias in that pitchers who do not receive that luck on BIP won’t finish the game.  After a few hits and a run or two in the seventh they might just get yanked.

    Very interesting stuff.

  2. Derek Carty said...

    To clarify a little, perhaps the pitchers change their approach because they’ve come this far and want to finish the game and are under the impression that pitching to contact will lower their pitch count.  Just my speculation, though.

  3. KY said...

    I use pitch efficiency in my fantasy analysis.  Basically I don’t care how the pitcher gets deep into the game, I just care that he does.  So I look at who throws the fewest pitches per inning and who has the most innings per start and things like that.  Since we know you are more likely to get a win the deeper you go, we really just care about who the guys are who go deep into the game.  Of course, deep with a low ERA is even better.  I recall Brandon Webb is usually at the top of the list.  I find Aaron Cook is often undervalued in this ability also and is a great guy to own in NL-only leagues.

  4. Derek Carty said...

    We must be careful, though, KY, because those things can be misleading.  A pitcher with a .240 BABIP is going to go deeper into a game (and thus have a better-looking PITCH/IP and IP/GS) than a similarly skilled pitcher with a normal .300 BABIP.  While his PITCH/IP might look better, this is coming through no skill of his own.

    I guess my point is that there is a lot of noise in those stats, and while all we care about are the results, we must focus on the process to make sure that we are going to get the results that we are expecting.  The ‘how’ is actually the most important part, which is what John’s article was trying to determine.  A flawed process will lead to incorrect results.

  5. John Burnson said...

    I echo Derek’s comments—you need to attend to the ‘how.’ Starters who last 6 innings because they’ve given up only 4 hits, or 7 innings with 5 hits, are VERY seductive. String together two or three of these starts, and the pitcher should order a second bandwagon. But you need to examine the peripherals.

    In truth, if a starter gives up 2 fewer hits than innings, that should be a red flag—not because his success is undeserved necessarily but because it is going to be so dang hard to resist his low ERA.

    Cook is a good case. Last year, Cook had 9 starts with 5 hits or fewer. Here were his hit totals in the subsequent starts:

    8, 4, 10, 10, 10, 11, 5, 9, 10

    That’s life with a 3.6 K/9. From a command perspective, 2008 was Cook’s peak, but even then he finished with only 2.0 K/BB.

    I know that Cook’s high GB% adds a wrinkle, but wrinkles aren’t magic. We might cut some slack for Hall-of-Famers, but everyone else needs to play by the rules: Strike out many more men than you walk, and walk very few.

  6. KY said...

    I agree too.  Mostly I just meant it’s not important to win prediction whether they do it Cook style or K style, as the article tries to examine.  If they have a true-talent skill that lets them get deep into games, that’s all we care about.

    Also, my examinations for fantasy are usually to start the year.  So we’ve got a year’s worth of data or more.  For example, here are last year’s NL-only leaders in outs per 100 pitches, among pitchers with 100+ IP.

    last   first
    Maddux   Greg
    Wainwright   Adam
    Hudson   Tim
    Sampson   Chris
    Bush   Dave
    Lowe   Derek
    Haren   Dan
    Maholm   Paul
    Sabathia   C.C.
    Oswalt   Roy
    Kuroda   Hiroki
    Cook   Aaron
    Sheets   Ben
    Nolasco   Ricky
    Moehler   Brian
    Hamels   Cole
    Webb   Brandon
    Santana   Johan
    Pineiro   Joel
    Lilly   Ted
    Harden   Rich
    Olsen   Scott
    Duke   Zach
    Lincecum   Tim
    Johnson   Randy

    These pitchers are either very good or walk few.  Maddux throws more strikes than almost anyone, after all, and it has gotten him wins year after year.  It’s just one component of an analysis, and you must examine it for flukes, but I find it useful.  You see a lot of underrated pitchers above along with the big names.  Lesser pitchers on this list are usually cheaper in NL-only leagues, yet they often provide surprising win totals and rarely truly kill ERA because they throw strikes.  These are also great pitchers to target if they suddenly get a good defense behind them.  Harden is kind of funny on this list now given his exploits thus far in pitches per inning!
