Predicting reliever wins

There may be no more frustrating task in fantasy baseball than trying to predict how many wins a pitcher will accumulate. Fantasy writers constantly talk about how fickle wins are and how chasing wins is a fool’s errand. This may be true, but I found myself wanting to know more about predicting wins when I joined the Yahoo! Friends & Family expert league this year. Not just any wins, mind you, but the trickiest of all: reliever wins. I wanted to know which relievers had the greatest chance of vulturing wins. Could we have predicted Tyler Clippard’s 11 reliever wins last year, or were they a complete fluke?

Predicting reliever wins

In an effort to see which relievers were more likely to win games than others, I put together a data set with 14 stats that I thought might be relevant. I then ran a correlation test between each variable and the number of wins a reliever accumulated in a given year. I also ran each variable against wins per game played (W/G), because in daily leagues, where you can simply pick up a new reliever every day, a reliever’s seasonal total doesn’t matter; what matters is whether he might get a win that day.

Here are the variables I chose:

  • Appearances minus games started (G)
  • Innings pitched (IP)
  • Innings pitched per appearance (IP_G)
  • Pitcher handedness (HAND_T)
  • Earned run average (ERA)
  • Run average (RA)
  • xFIP
  • Innings pitched per game by team’s starters (SP_IP)
  • Runs allowed per game by team’s starters (SP_RA)
  • Runs scored by team’s offense (TEAM_RUNS)
  • Team run differential (RUN_DIFF)
  • Saves (SV)
  • Average leverage index when pitcher enters the game (gmLI)
  • Number of relief pitchers with a gmLI higher than the pitcher’s gmLI minus 0.1 (HIGHER_gmLI)
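
The last variable is the least obvious one to compute, so here is a minimal sketch of how it might be tallied. It assumes the count is taken among relievers on the same team and excludes the pitcher himself; neither detail is spelled out above. The function name, the bullpen labels, and the gmLI values are all hypothetical.

```python
def higher_gmli(bullpen, pitcher, margin=0.1):
    """Count teammates whose gmLI exceeds this pitcher's gmLI minus `margin`.

    Assumption: only same-team relievers are compared, and the pitcher
    himself is not counted.
    """
    threshold = bullpen[pitcher] - margin
    return sum(
        1
        for name, gmli in bullpen.items()
        if name != pitcher and gmli > threshold
    )


# Hypothetical gmLI values for one team's bullpen (illustrative, not real data).
bullpen = {"Setup A": 1.60, "Setup B": 1.55, "Closer": 1.90, "Mop-up": 0.60}

for name in bullpen:
    print(f"{name}: HIGHER_gmLI = {higher_gmli(bullpen, name)}")
# Setup A: HIGHER_gmLI = 2
# Setup B: HIGHER_gmLI = 2
# Closer: HIGHER_gmLI = 0
# Mop-up: HIGHER_gmLI = 3
```

In this made-up bullpen the two setup men count each other, because subtracting 0.1 from the pitcher's own gmLI pulls in teammates used in nearly the same leverage.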

Using all relievers since 2004, we get more than 2,700 pitcher seasons to work with. Here are the results:

+-----+------+------+------+-------------+
|     | IP   | G    | gmLI | HIGHER_gmLI |
+-----+------+------+------+-------------+
| W   | 0.75 | 0.73 | 0.47 |       -0.43 |
| W/G | 0.31 | 0.27 | 0.31 |       -0.23 |
+-----+------+------+------+-------------+

+-----+-------+-------+-------+--------+------+
|     | xFIP  | ERA   | RA    | HAND_T | SV   |
+-----+-------+-------+-------+--------+------+
| W   | -0.29 | -0.22 | -0.13 |   0.08 | 0.26 |
| W/G | -0.18 | -0.16 | -0.10 |   0.09 | 0.07 |
+-----+-------+-------+-------+--------+------+

+-----+----------+-------+-------+-----------+-------+
|     | RUN_DIFF | IP_G  | SP_RA | TEAM_RUNS | SP_IP |
+-----+----------+-------+-------+-----------+-------+
| W   |     0.05 | -0.07 | -0.06 |      0.04 |  0.03 |
| W/G |     0.04 |  0.03 | -0.03 |      0.02 |  0.01 |
+-----+----------+-------+-------+-----------+-------+

What we see is that the most important factor is the number of innings a guy throws. Even when we’re looking at W/G, if a guy is trusted to throw a lot of innings in general, he’s going to be trusted to win games.

Just as important in terms of W/G is gmLI. This makes complete sense as a high gmLI means that the game is close, late, or in question. If a pitcher is trusted in these situations, he’s going to be in position to pick up the win frequently. Similarly, if a team has more than one reliever who is trusted in these types of situations (think San Diego with Luke Gregerson and Mike Adams), each one’s win total will be cut into by the other. After that, we see that how the reliever actually performs is most important, with (surprisingly) xFIP beating out RA and ERA.

At this point, our results start bordering on irrelevant with r-squared of 0.01 or below and high p-values. Then a pitcher’s handedness comes into play (right-handed is better), followed by whether he’s a closer, and then the rest of our stats that don’t mean very much. Of note is that the reliever’s starting pitchers and his team’s offense have almost no bearing on whether he picks up a win. So when deciding between Rafael Soriano and Tyler Clippard, it’s not going to matter much that one plays for the Bronx Bombers and the other plays for the lowly Nats.
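
For readers who want to run this kind of test on their own data, the correlation step behind the tables above can be sketched roughly as follows. This is a plain Pearson correlation written out by hand; the article does not say which tool was used, and the pitcher seasons below are invented purely to make the sketch runnable.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical pitcher seasons (illustrative, not real data):
# relief appearances (G), innings pitched (IP), wins (W).
seasons = [
    {"G": 72, "IP": 78.0, "W": 6},
    {"G": 65, "IP": 60.0, "W": 3},
    {"G": 40, "IP": 45.0, "W": 2},
    {"G": 58, "IP": 70.0, "W": 5},
    {"G": 30, "IP": 28.0, "W": 1},
    {"G": 68, "IP": 62.0, "W": 2},
]

ip = [s["IP"] for s in seasons]
wins = [s["W"] for s in seasons]
wins_per_game = [s["W"] / s["G"] for s in seasons]

r_w = pearson_r(ip, wins)
r_wg = pearson_r(ip, wins_per_game)
print(f"IP vs W:   r = {r_w:.2f}, r-squared = {r_w ** 2:.2f}")
print(f"IP vs W/G: r = {r_wg:.2f}, r-squared = {r_wg ** 2:.2f}")
```

Squaring r gives r-squared, so a correlation of 0.10 in either direction corresponds to an r-squared of 0.01.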

Strategic implications for Yahoo! Friends & Family

The whole point of this exercise in the first place was to aid my team in the Yahoo! Friends & Family expert league, so I might as well explain why that was the case. In Yahoo! F&F, there is a 1,250 innings cap. This means that, essentially, if your team reaches that maximum, every inning counts the same against the cap, so what matters is each pitcher’s wins per inning (W/IP) rather than his raw win total.

That is, what you essentially must do is maximize your W/IP. If everyone reaches 1,250 IP and can’t accumulate any more, everyone’s win total is going to be equal to 1,250 x team W/IP. So if Cliff Lee posts a 0.075 W/IP (15 wins in 200 IP) and you’re able to put together a collection of four relievers who post a 0.08 W/IP (4 W in 50 IP each), they are effectively worth the same: the four relievers give you 16 wins over the same 200 innings.

Given this, you can see how important it is to target the relievers who are in the best position to pick up wins.
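
The arithmetic is easy to check. Here is a minimal sketch of the innings-cap math, using the 1,250-inning cap and the Cliff Lee example from above; the function name is just for illustration.

```python
INNINGS_CAP = 1250  # Yahoo! F&F innings cap, as described above


def wins_at_cap(wins, innings, cap=INNINGS_CAP):
    """Wins a staff would total if it pitched to the cap at this W/IP rate."""
    return cap * wins / innings


# One starter: 15 wins in 200 IP.
starter_pace = wins_at_cap(15, 200)
# Four relievers: 4 wins in 50 IP each, so 16 wins in 200 IP combined.
reliever_pace = wins_at_cap(4 * 4, 4 * 50)

print(f"Starter W/IP:   {15 / 200:.3f} -> {starter_pace:.2f} wins at the cap")
print(f"Relievers W/IP: {16 / 200:.3f} -> {reliever_pace:.2f} wins at the cap")
# Starter W/IP:   0.075 -> 93.75 wins at the cap
# Relievers W/IP: 0.080 -> 100.00 wins at the cap
```

Of course, no real staff pitches at a single rate for 1,250 innings; the point is only that under a cap, W/IP is the number to compare.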

2010 xW/G% Leaders

If I create a regression equation using the most important variables, I can come up with a formula that will give us a reliever’s expected wins per game played. Based on that formula, here are 2010’s top 20 relievers in terms of expected wins per game.

+------------+---------+-------+--------+
| LAST       | FIRST   | W/G   | xW/G   |
+------------+---------+-------+--------+
| Bard       | Daniel  |  1.4% |  10.7% |
| Belisle    | Matt    |  9.2% |   9.8% |
| Gregerson  | Luke    |  5.0% |   9.8% |
| Clippard   | Tyler   | 14.1% |   9.7% |
| Adams      | Mike    |  5.7% |   9.6% |
| Berken     | Jason   |  7.3% |   9.0% |
| Betancourt | Rafael  |  6.9% |   9.0% |
| League     | Brandon | 12.9% |   9.0% |
| Masset     | Nick    |  4.9% |   9.0% |
| Guerrier   | Matt    |  6.8% |   8.9% |
| Romo       | Sergio  |  7.3% |   8.9% |
| Camp       | Shawn   |  5.7% |   8.9% |
| Moylan     | Peter   |  7.0% |   8.8% |
| Loe        | Kameron |  5.7% |   8.7% |
| Jepsen     | Kevin   |  2.9% |   8.7% |
| O'Day      | Darren  |  8.3% |   8.7% |
| Crain      | Jesse   |  1.4% |   8.6% |
| Benoit     | Joaquin |  1.6% |   8.6% |
| Hensley    | Clay    |  4.4% |   8.6% |
| Perry      | Ryan    |  5.0% |   8.6% |
+------------+---------+-------+--------+

Naturally, this will change for 2011, but it will give you a decent idea of which relievers are worth targeting. Some potential changes to the list include Jason Berken, who will surely drop with Koji Uehara and Mike Gonzalez back in the mix in Baltimore. Rafael Betancourt could drop a little with the addition of Matt Lindstrom (but should still stay very strong). Kameron Loe will drop with the addition of Takashi Saito. Joaquin Benoit could improve going to Detroit (and Ryan Perry could drop). Nick Masset could drop with Aroldis Chapman now in the majors.

Also of note is that it is possible for two relievers from the same team to rank highly as long as both are highly skilled, pitch a lot of innings, and pitch very high leverage innings, as evidenced by Gregerson and Adams both making the top five.
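
For anyone who wants to build a similar xW/G formula, here is a rough sketch of fitting a multiple regression by ordinary least squares. The actual equation and its inputs are not published here, so everything specific in the sketch is an assumption: the choice of predictors (gmLI, HIGHER_gmLI, xFIP), the placeholder coefficients, and the made-up pitcher rows, whose W/G values are generated from those placeholder coefficients so the fit can be checked.

```python
def fit_ols(rows, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Expected W/G for one pitcher from fitted coefficients."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Placeholder coefficients (NOT the article's formula):
# intercept, gmLI, HIGHER_gmLI, xFIP.
PLACEHOLDER = [0.03, 0.03, -0.005, -0.005]

# Made-up pitcher seasons: (gmLI, HIGHER_gmLI, xFIP).
rows = [
    (1.8, 0, 3.2),
    (1.5, 1, 3.8),
    (1.1, 2, 4.1),
    (0.7, 3, 4.6),
    (1.6, 2, 3.5),
    (0.9, 1, 4.9),
]
# Synthetic W/G targets generated from the placeholder coefficients.
targets = [predict(PLACEHOLDER, r) for r in rows]

coefs = fit_ols(rows, targets)
print("Fitted coefficients:", [round(c, 4) for c in coefs])
print(f"xW/G for gmLI=1.7, HIGHER_gmLI=1, xFIP=3.4: {predict(coefs, (1.7, 1, 3.4)):.1%}")
```

With real data the targets would be each reliever's actual W/G, and the fit would be far noisier than this synthetic check suggests.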

Concluding thoughts

If anyone has questions or would like thoughts on a specific reliever you’re considering, feel free to comment or e-mail me. I’ve also joined Facebook, so now you can add me as a friend and catch up with me there too!


Comments

  1. Mike L said...

    On the 2010 xW/G% Leaders, only two pitchers exceeded their xW/G%, and most fell far short of it. The actual W/G% average rate was 6.18% for pitchers on the list, while the xW/G% average rate was 9.08%.  Is there an easy way to reconcile that difference?

    Also, I am curious to see what the expected wins per inning is for starting pitchers and relief pitchers. For innings cap leagues, W/IP is what really matters.

  2. Tom B said...

    Tryin to predict wins is insane. 

    I did a roto league last year where I led the league in ERA, WHIP, K’s and K/BB. 

    I was dead last in wins.

  3. Derek Carty said...

    Mike L,
    I think that’s mostly coincidence.  The spread among all pitchers is much narrower.  Also, the r-squared for the equation was just 0.16, which tells us something valuable but which isn’t incredibly strong (after all, we’re still dealing with wins), so perhaps that has something to do with it.

    The list was more just to give you an idea of the types of guys who do well, not to be anything incredibly scientific.  If you want, subtract 1.5% from xW/G to bring it in line with actual W/G.

  4. Derek Carty said...

    Tom B,
    Yeah, wins are incredibly hard to predict, but not impossible.  It’s better to have some idea about how to predict them than to go in completely blind and make wild guesses.  If we can predict wins with 20% accuracy, or 10%, or 5%, it’s better than 0%.  I know it can be frustrating when you have a situation like yours, but it doesn’t mean that will always be the case.

  5. Derek Carty said...

    Oh, and Mike L, I didn’t try to do SP wins because I think that would be a lot more complex and would need an actual forecasting engine to do.  For RPs, gmLI is one of the primary drivers, it appears, but for SPs, gmLI is completely irrelevant.

  6. Nate said...

    What about Rafael Soriano this year now that he is the alpha setup guy for the Yanks? Would your findings support the idea that he could be in for a jump in relief Ws even though he will obviously pull in A LOT fewer SVs?

  7. Derek Carty said...

    Definitely.  There shouldn’t be much competition for gmLI.  He’ll be the primary 8th inning guy, should accumulate a lot of innings (on a per-game basis; on a yearly basis he’s riskier since there are some lingering health issues), and has very good skills.  He would be a top tier option in my opinion.

  8. Tom said...

    Great story!  Can you use your model plus projections for 2011 to give us a w/g ranking projection for 2011?

  9. David said...

    Shouldn’t you include holds and blown saves?  Between these and saves, that’s an indication of how often you’re brought into the game where your team is leading, meaning it’s tougher to get a win.
