Talking Situational Wins

Are there situations in which stolen bases are more valuable? (via Keith Allison)

I am totally fascinated by WPA/LI, even though I can’t really tell you what it is. The title says what it is: Win Probability Added divided by Leverage Index, but that doesn’t really help. We are probably better off calling it Situational Wins, which is a less geeky name for a very geeky concept. I tried my best to describe Situational Wins in this article (see the end), in which I called it “a number that indicates who ‘won’ the at-bat, and by how much.” Hence, the name. Situational Wins.

A vague explanation isn’t the only issue with Situational Wins. Here are some others.

  • Some people just don’t like WPA, and using WPA/LI (which corrects some of the things people don’t like about WPA) feels like jumping further into the rabbit hole.
  • Total WPA/LI for a team doesn’t equal the team’s won/loss record. WPA does (which is one of WPA’s main attractions).
  • Because of rounding and the infrequency of some situations, you sometimes get results that are a bit off.  I’ll show you a couple of examples below, but these small differences shouldn’t make much difference for players over the course of a season. Still, they exist.
  • WPA/LI is calculated by comparing two tables that Tangotiger has derived: WPA and LI by situation. We all trust Tango,  but are we certain his tables are completely correct?
  • WPA/LI has a bias in favor of home runs, because not all baseball events are randomly distributed across situations (personally, I don’t think this is a bias; it’s just a fact.  But I thought I’d mention it.)
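The second bullet is easy to check on a toy game. This is only a rough sketch: the win expectancies and leverage indexes below are invented for illustration (they are not taken from Tango's tables), and everything is expressed from the home team's perspective.

```python
# Hypothetical play log for one game, from the home team's perspective.
# Each tuple: (win expectancy before, win expectancy after, leverage index).
# All numbers are made up for illustration; none come from Tango's tables.
plays = [
    (0.50, 0.55, 0.9),
    (0.55, 0.48, 1.1),
    (0.48, 0.70, 1.8),
    (0.70, 0.66, 0.6),
    (0.66, 1.00, 2.5),  # home team wins, so win expectancy ends at 1.0
]

# WPA for a play is simply the change in win expectancy.
total_wpa = sum(after - before for before, after, _ in plays)

# WPA/LI divides each play's WPA by the leverage index at the time of the play.
total_wpa_li = sum((after - before) / li for before, after, li in plays)

print(f"Total WPA:    {total_wpa:+.3f}")
print(f"Total WPA/LI: {total_wpa_li:+.3f}")
```

Total WPA telescopes to +0.5 for the winner no matter what happens in between, because the game starts at 0.5 and ends at 1.0. The WPA/LI total has no such anchor; it depends on the mix of leverage in which the plays occurred.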

Having said all that, my intuition is that WPA/LI works. To me the proof is in the postseason scenario (as described in my previous article) in which the added “championship value” of winning a game has a constant relationship to its leverage index. As a result, when you divide each game’s championship outcome by its leverage index, you get the same number.  Each game is equal in importance. Really, read the article.
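For those who won't read the article, here is a minimal sketch of the postseason idea. It assumes a best-of-seven series between evenly matched teams (each game a coin flip), and it defines a game's leverage as its championship swing relative to the swing of Game One. Both the 50/50 assumption and that choice of baseline are mine, made for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def series_win_prob(need, opp_need, p=0.5):
    """Chance of winning the series when we need `need` more wins
    and the opponent needs `opp_need` more wins."""
    if need == 0:
        return 1.0
    if opp_need == 0:
        return 0.0
    return (p * series_win_prob(need - 1, opp_need, p)
            + (1 - p) * series_win_prob(need, opp_need - 1, p))


def championship_ratios(wins_to_clinch=4):
    """Championship value added by a win, divided by leverage, per series state."""
    # Swing of the first game; used as the leverage baseline (an assumption).
    base_swing = (series_win_prob(wins_to_clinch - 1, wins_to_clinch)
                  - series_win_prob(wins_to_clinch, wins_to_clinch - 1))
    ratios = {}
    for need in range(1, wins_to_clinch + 1):
        for opp_need in range(1, wins_to_clinch + 1):
            before = series_win_prob(need, opp_need)
            if_win = series_win_prob(need - 1, opp_need)
            if_loss = series_win_prob(need, opp_need - 1)
            added = if_win - before                     # value added by winning
            leverage = (if_win - if_loss) / base_swing  # swing relative to Game One
            ratios[(need, opp_need)] = added / leverage
    return ratios


ratios = championship_ratios()
for state, ratio in sorted(ratios.items()):
    print(state, round(ratio, 5))
```

Every series state, from Game One to Game Seven, prints the same ratio. With coin-flip games, the value added by a win is always half the swing, and leverage is proportional to the swing, so the division cancels the importance of the game.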

Situational Wins do the same thing to plate appearances.  They ensure that each plate appearance is treated the same as every other plate appearance regardless of how important it is to the game.  The result is a measure of how successfully the batter or pitcher approached the situation.

This is not just an exercise in baseball math, by the way.  Situational Wins are useful; they do something few other stats do. To quote Tango:

The key point of Situational Wins is best described in this extreme situation: with the bases loaded, tie game, bottom of the 9th, Situational Wins (and WPA) are the ONLY metrics around that will give equal contributions to the walk as it does to the homerun.

So what’s stopping us from using this useful stat? To me, there is a bottom line issue with Situational Wins. When you neutralize the criticality of a situation (divide by Leverage Index), what’s left? When you take out the critical impact of the score, inning and base/out situation, what is left for WPA/LI to consider? What are the key elements of the “situation” in Situational Wins?  And how are they calculated?

Today, I’d like to experiment with Situational Wins to see if we can get a better handle on this issue. I plan to calculate the WPA/LI of several different situations and outcomes just to see what we see. I will discuss the results and encourage you to add your own comments. By working together, perhaps we can come to a better understanding of what this elusive stat is.

I’m going to use the WPA Inquirer for my results, which you can find below (you can also find it at our WPA Inquirer page). You’re welcome to play with the WPA Inquirer, add your own scenarios to the conversation and leave them in the comments.

In all cases, I will assume a run environment of 4.5 runs per game. All values will be expressed from the batting team’s perspective but we will also try to incorporate the pitching team’s perspective.


First up: What happens when the score changes?

Value of Leadoff Event in Bottom of Seventh

Home Team Score Diff      Out    Single    Double    Home Run
Down by Two            -0.025     0.044     0.065       0.094
Down by One            -0.026     0.040     0.069       0.119
Tie Score              -0.025     0.037     0.072       0.135
Up by One              -0.025     0.038     0.071       0.131
Up by Two              -0.026     0.038     0.072       0.134

An out has the same negative impact regardless of the score. How can this be?  Doesn’t an out hurt the team more when you’re behind than when you’re ahead? Yes it does, but that is what WPA captures.  Situational Wins count a bases-empty out as the same value, regardless of the score, inning or number of outs. Bases-empty outs are the constant in the Situational Win universe.

Try it yourself.  Plug a bases-empty out into the Inquirer in lots of different situations to see what you get.  Remember that differences of one or two thousandths (0.001 or 0.002) are insignificant and that rare situations are affected by rounding in the tables. As an example, try a seven-run lead in the bottom of the eighth.  The Leverage Index is so low in that situation (0.01) that you know it’s rounded.  The WPA/LI value won’t line up as well.
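To see why rounding bites hardest at low leverage, here is a small sketch. It assumes, purely for illustration, that the WPA table is stored to three decimals and the LI table to two; I don't know the actual precision of Tango's tables, and the "true" leverage values below are hypothetical.

```python
def ratio_from_rounded_tables(true_wpa, true_li, wpa_digits=3, li_digits=2):
    """WPA/LI computed from table entries stored to a fixed number of decimals.

    The default precisions are assumptions, not the real table format.
    """
    wpa = round(true_wpa, wpa_digits)
    li = round(true_li, li_digits)
    if li == 0:
        raise ValueError("leverage index rounds to zero; ratio is undefined")
    return wpa / li


TRUE_RATIO = -0.025  # roughly the bases-empty out value in the table above

# Hypothetical "true" leverage values: one ordinary situation, one blowout.
for label, true_li in [("ordinary", 1.514), ("blowout", 0.014)]:
    true_wpa = TRUE_RATIO * true_li
    approx = ratio_from_rounded_tables(true_wpa, true_li)
    print(f"{label:9s} true LI={true_li:<6} ratio from rounded tables={approx:.4f}")
```

In the ordinary situation the rounded tables reproduce the ratio closely. In the blowout, both the numerator and denominator are so small that rounding swamps them, and the quotient drifts far from -0.025.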

Anyway, does this make sense?  Should bases-empty outs have the same value regardless of the situation? Well, when a pitcher gets a bases-empty out from a batter, he has won the contest and moved the game clock forward by the same amount each time. Situational Wins should be consistent here, and the fact that WPA/LI works this way is a confirmation of the system. Things are different when there are runners on base; I’ll explore that in a minute.

There’s another wrinkle here: a single is valued more highly when a team is down, but a double and home run are valued more highly when a team is tied or ahead.  Playing with the WPA Inquirer, we find that this effect is much less pronounced in earlier innings and more pronounced in later innings. So game time is a factor here, but the trend is the same across all innings.  What’s up with that?
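One way to put numbers on this wrinkle is to look at the home run's value relative to the single's in each row. The sketch below just re-enters the leadoff table from above; the dictionary keys are my own labels.

```python
# Leadoff values in the bottom of the seventh, copied from the table above.
# Keys are the home (batting) team's score differential.
leadoff_values = {
    "Down by Two": {"out": -0.025, "single": 0.044, "double": 0.065, "home_run": 0.094},
    "Down by One": {"out": -0.026, "single": 0.040, "double": 0.069, "home_run": 0.119},
    "Tie Score":   {"out": -0.025, "single": 0.037, "double": 0.072, "home_run": 0.135},
    "Up by One":   {"out": -0.025, "single": 0.038, "double": 0.071, "home_run": 0.131},
    "Up by Two":   {"out": -0.026, "single": 0.038, "double": 0.072, "home_run": 0.134},
}

# Ratio of home run value to single value in each situation.
hr_per_single = {
    situation: values["home_run"] / values["single"]
    for situation, values in leadoff_values.items()
}
for situation, ratio in hr_per_single.items():
    print(f"{situation:12s} HR/1B = {ratio:.2f}")

# Spread of the bases-empty out across all five score situations.
out_values = [values["out"] for values in leadoff_values.values()]
out_spread = max(out_values) - min(out_values)
print(f"Spread in out value: {out_spread:.3f}")
```

The home run is worth a bit more than two singles when the team is down by two, and roughly three and a half singles when the game is tied or the team is ahead. The out column, meanwhile, never moves by more than a thousandth.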

Remember that we’re talking about leadoff situations in this example. Hitting a single at the beginning of an inning when you’re down by two is better than when you’re ahead because you need to score runs. By hitting a single, you keep the inning going and set up the potential for more runs.

On the other hand (from the pitcher’s perspective), the output is saying that giving up a solo, leadoff home run is more harmful when you’re behind and trying to catch up than when you’re ahead and trying to maintain a lead. The same is true for a double, but to a much lesser extent.

I’ve thought about this a lot, but I haven’t been able to find the ideas or words that tease out what is going on here. Why does the single follow a different pattern than the extra-base hit?

Next, let’s see how Situational Wins change when the number of outs changes:

Value of Event with Runner on Second in the Top of the Fourth of a Tie Game

# of Outs    Out Without Moving Runner    Bunt Runner to Third    Run-scoring Single    Home Run
Zero                            -0.034                  -0.011                 0.058       0.107
One                             -0.028                  -0.024                 0.069       0.127
Two                             -0.028                  -0.028                 0.082       0.154

First of all, the negative impact of not moving the runner over to third is larger with none out than with one or two out. This makes sense, of course, because a runner on third can score on a sacrifice fly with one out but not with two out.  Conversely, bunting the runner to third with none out (which is a negative play nonetheless) is a much better play than doing the same thing with one or two outs. Once you think about the sacrifice fly, this will make perfect sense to you.
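A quick way to isolate the value of advancing the runner is to subtract the plain out from the bunt in each row of the table. This is only a sketch using the table's values; the key names are my own.

```python
# Runner on second, top of the fourth, tie game; values copied from the table above.
by_outs = {
    0: {"out_no_advance": -0.034, "bunt_to_third": -0.011},
    1: {"out_no_advance": -0.028, "bunt_to_third": -0.024},
    2: {"out_no_advance": -0.028, "bunt_to_third": -0.028},
}

# How much better is an out that moves the runner than an out that doesn't?
advance_gain = {
    outs: round(values["bunt_to_third"] - values["out_no_advance"], 3)
    for outs, values in by_outs.items()
}

for outs, gain in advance_gain.items():
    print(f"{outs} out: moving the runner is worth {gain:+.3f}")
```

Moving the runner is worth the most with none out, very little with one out, and nothing with two outs, when the out that "advances" the runner also ends the inning.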

On the other hand, the value of a positive batting event, such as singling in the runner or hitting a home run, increases as the number of outs increases.  This makes sense, too, because teams have more time to score runners when the inning is still young. When the inning is running out, however, run-scoring events are more meaningful.

Think of it from the pitcher’s point of view. With a runner on second and no outs, you sort of expect that some runs are going to score.  But with a runner on second and two outs, you’re hoping no runs will score. So giving up the run-scoring hit hurts more–is a worse result for the situation–than earlier in the inning.

Finally, how do things change as the game progresses?

Value of Event with Runner on First, One Out in Tie Game in Top of…

Inning     Stolen Base    Caught Stealing    Double Play    Single, Runner to Third
First            0.014             -0.035         -0.044                      0.055
Third            0.014             -0.034         -0.043                      0.054
Fifth            0.014             -0.034         -0.043                      0.055
Seventh          0.016             -0.034         -0.044                      0.056
Ninth            0.019             -0.034         -0.045                      0.061

Once again, outs have the same value regardless of the inning. The new insight is that these aren’t outs that occur with no one on base, but outs that finish with no one on base.  So we can update our previous finding to say that all outs that finish with no runners on base have the same value. Of course, caught stealing has a bigger negative value than a bases-empty out because it eliminates a baserunner.

Now positive offensive events, such as stolen bases and singles, go up a bit in the ninth inning (and a bit less in the seventh). Similar events in Win Probability Added increase dramatically in the late innings of a game. It appears that the WPA/LI value of positive batting events also increases in the late innings of a game, but to a lesser extent.
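To put a size on "a bit," here is a sketch that computes the percentage change in each event's value from the first inning to the ninth, using the table above. The key names and the helper function are my own.

```python
# Runner on first, one out, tie game, top of the inning; values from the table above.
by_inning = {
    "First":   {"stolen_base": 0.014, "caught_stealing": -0.035,
                "double_play": -0.044, "single_runner_to_third": 0.055},
    "Third":   {"stolen_base": 0.014, "caught_stealing": -0.034,
                "double_play": -0.043, "single_runner_to_third": 0.054},
    "Fifth":   {"stolen_base": 0.014, "caught_stealing": -0.034,
                "double_play": -0.043, "single_runner_to_third": 0.055},
    "Seventh": {"stolen_base": 0.016, "caught_stealing": -0.034,
                "double_play": -0.044, "single_runner_to_third": 0.056},
    "Ninth":   {"stolen_base": 0.019, "caught_stealing": -0.034,
                "double_play": -0.045, "single_runner_to_third": 0.061},
}


def pct_change(event, start="First", end="Ninth"):
    """Percentage change in an event's value between two innings."""
    first = by_inning[start][event]
    last = by_inning[end][event]
    return (last - first) / abs(first) * 100


for event in by_inning["First"]:
    print(f"{event:24s} {pct_change(event):+.1f}%")
```

The stolen base gains roughly a third of its value by the ninth and the single about a tenth, while the two out-making events barely move. Keep in mind that with values this small, a rounding difference of one thousandth shows up as several percentage points.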

This is another tough one for me to tease out. It’s obvious that batting events have more positive value as the game clock runs down. Is that all that is going on here?  If so, how is this quantified in a way that differentiates it from WPA?  If not, what else is being considered?

I’m afraid I’ve posted more questions than answers today.  Hopefully, this will be one of those articles in which the best insights are left in the comments.  Got any?

References and Resources

Kincaid’s article about WPA/LI is worth reading because it demonstrates that the average WPA/LI values of events line up well with their linear weight values.


Comments

  1. tz said...

    THIS IS THE ARTICLE I’VE BEEN WAITING FOR!!!!!!!

    Seriously, I’ll put my first comments below before I have to tend to my real job duties.

    1. You got me to totally reconsider WPA/LI in your first article by showing that WPA/LI is a normalized measure of how optimal one outcome is relative to the full set of possible outcomes for that situation.

    2. For the differences in impact between singles and extra-base hits based on situation, let me try this explanation:

    – Extra-base hits increase the probability of scoring AT LEAST one additional run (call this Pr[X>=1]) much more than singles do. In particular, HRs are certain to add a run (Pr[X>=1] = 1).

    – In a tie game or with a lead, the increase in WPA for adding just one run is relatively large. So the higher value of the Pr[X>=1] from an extra-base hit increases its situational value relatively more than a single.

    – When a team is trailing, the increase in WPA for adding just one run is relatively smaller. This is driven by the fact that you have to outscore your opponent by your current deficit plus one run. So WPA is more aligned with the expected number of additional runs scored (call this E[X]). As a result, the relative values of any kind of non-out will be closer together than normal when leading off an inning while trailing. (This is consistent with the call of a Little League manager to his trailing team: “We need baserunners!!”)

  2. tz said...

    Oops on the bolding for the comment above.

    I’ll add a couple more before I return at lunchtime:

    3. The HR “bias” for WPA/LI might explain the “anti-bias” present in the “Clutch” stat (∑WPA/∑LI minus ∑(WPA/LI)).

    4. We should add a couple significant figures to Tango’s tables to improve normalization of low leverage situations.

    5. The biggest problem with “Clutch” is not the WPA/LI part of the stat, but the use of “average” LI in the denominator of the other component. You can get some very messy inconsistencies over small sample sizes, and the non-linear nature makes “Clutch” non-summable from splits. See Hector Noesi’s Clutch scores as an example:

    http://www.fangraphs.com/statss.aspx?playerid=3292&position=P#winprobability

    6. I LOVE the “move the clock forward” analogy. Because WPA/LI and linear-weights stats share the fundamental currency of outs, we could calculate a “Situational Hitting” adjustment to WAR equal to (WPA/LI minus Batting Wins Above Average).

    7. And on that note, I do believe that “Situational Hitting” should be part of a player’s measured value. The difference between driving in that runner from third with one out and not is important enough that hitters will adjust to increase the chance of an outfield fly (and pitchers will likewise aim for a strikeout, or at least a groundball). WPA/LI appears to be the best way to include this skill in a “sequencing-neutral” context.

    • tz said...

      Thanks Dave for posting the link, and Tango for your comments.

      Question – I know that WAR is based upon batting runs above/below average using linear weights, but then converting from runs to wins based upon the overall run environment. Would the “batting wins” component be any different if we used linear weights built from average WIN expectancy for each event (and then skipped the runs-per-win factor)?

      I’d imagine that “walk-off” type scenarios (where any non-out wins the game) would tend to drive the value of all non-outs closer to one another, but I’m not sure that those and similar scenarios occur often enough to cause the runs/win for a single to differ materially from the runs/win for a HR.

  3. Detroit Michael said...

    Can we put some human faces and context on this weird WPA/LI statistic? Compared to oWAR for example, are there some players (and who are they) who over their careers do significantly better in WPA/LI compared to oWAR or consistently worse? In other words, do some players seem to perform better or worse in the situation than we might suspect by just looking at their career batting lines? Is the distribution of such players wider than we would have expected to occur through randomness?

    It’s a hard statistic to deal with, so maybe showing us that it makes a difference, that without it we are under or over valuing certain players, may make us want to work harder to use WPA/LI.

    Just a suggestion — obviously it’s up to you what to write. I typically enjoy any article written by Studes.

  4. Steven said...

    I went back and read the previous article and in the comments there was some confusion with the Trout WPA/LI example. To refresh, Trout’s best hit was a homer with 2 outs and no one on. Simon did not like the fact that a HR with 2 outs and 2 men on was worth more than a HR with 2 outs and no one on when the player did not have anything to do with those two runners getting on. Then you said “It’s not at all clear to me that WPA/LI would rate the (3-run HR) as a better hit.” So I went ahead and attempted this example. If I did it correctly, I have a solo home run in the first with two outs in a tie game being worth .248, but the 3-run home run only being worth .182 (the reverse of Simon’s concerns). So in trying to understand how to use WPA/LI, this example is making it difficult. Can anyone help me understand why “punishing” (for lack of a better term) the hitter for homering with runners on is okay?

    • said...

      That’s easy, Steven. Because, with no one on, a home run is the perfect thing to do. Other things are good, but a home run is awesome. With runners on base, there are other good alternatives available to the hitter. So the home run isn’t as powerful a response to the situation.

      thanks for following up on the question!

      • Steven said...

        Okay, I understand that. But then how does one use WPA/LI with this in mind? It seems to me that if I just use total WPA/LI to compare players, I am going to give more credit to players who come up with no one on more often. Is this not the case, or should I not use WPA/LI like this?

      • Tangotiger said...

        WPA/LI guarantees that each PA is EQUALLY weighted. That means bases loaded and bases empty count the same.

        A .400 OBP with bases empty and a .400 OBP with bases loaded is still… a .400 OBP. So, WPA/LI gives you that.

        What you may want is something that counts bases loaded more than bases empty. And that’s what WPA and that’s what RE24 is for.

      • said...

        I wouldn’t say that someone who comes up with no one on more often will get more credit with WPA/LI. But someone who homers with no one on and two out more often than someone else may. OTOH, someone who singles more often with two out and no one on gets discredited. That is what WPA/LI does.

        Contrast the single vs. the home run in both situations. A HR is worth more with no one on and two out (than with a runner on second and two out) but a single is worth less. Each hit is more “appropriate” in opposite situations.

        Think of it from the pitcher’s perspective. If he gives up a home run vs. a single with a runner on second in a tie game, what will be the relative perception of how he did compared to giving up a HR vs. a single with no one on and two out in a tie game?

        I think the answer is kind of obvious.

        Having said that, the differences are not so extreme that you have to worry about significant differences in opportunities between players. As Tango said, WPA/LI treats each plate appearance equally and the relative impacts are MUCH less extreme than they are for WPA.

  5. Steven said...

    For example, currently Edwin Encarnacion has hit 58% (11 of 19) of his home runs with runners on. I would imagine this is because the Blue Jays get on a lot and EE hits a lot of HR’s (rather than him possessing some sort of men on base skill). As a result (possibly) he ranks 6th in oWAR, but 20th in WPA/LI. If he would have just hit some of those HR’s with the bases empty instead of with runners on (switch it to 11/19 with bases empty, closer to the league average of 56%) then his WPA/LI would be close to, if not in, the top 10.

    • said...

      Thanks, Steven. Good observation (assuming your math is correct). While WPA/LI treats all plate appearances equally, there may be a “mismatch” between that player’s output and the situations he sees (compared to more straightforward measures). Whether or not that is important depends on what it is you are studying.

      I definitely don’t recommend that you “just use WPA/LI to compare players” (your words).

  6. Tangotiger said...

    With bases empty, the ratio of the HR impact to the 1B impact is very large.

    With runners on base, the ratio of the HR impact to the 1B impact is only large.

    That’s what WPA/LI is capturing.

    To the extent that you can accept and appreciate that each PA is equal, whether there are runners on base or not, then the relative impact of the HR to 1B is larger with bases empty than with runners on base.

    • tz said...

      Perfect!

      Seriously Tango, this should be a standard footnote to any post on WPA/LI or its related metrics. Captures the bottom line of what it measures and why it’s relevant simply and succinctly.

      Thanks!

      • said...

        Right! Here’s how I’m starting to think of it:

        1. If you think each event should be treated equally, independent of the situation, use linear weights.
        2. If you think each event should be weighted by runners on and outs of each situation, use RE24.
        3. If you think each event should be weighted by all the game factors of each situation, use WPA
        4. If you think each situation should be treated equally, use Situational Wins

        All of these are legitimate. Your choice depends on what you are studying.

  7. Tangotiger said...

    I brought up Situational Wins a few years back, and I’ve had several threads on it. I can basically get one person converted at a time, each time I bring it up. It’s a really hard concept to follow.

    I think Studes here has done a tremendous job at converting a few people out there with this article.

    Basically, the way to sell this thing is to keep trying new and different ways to explain it. We have NO IDEA which will finally click with the saber-masses (those already hugely inclined to get on the saber-train), so we are in trial-and-error mode here.

    • tz said...

      Based on my experience as a recent convert, the key is making sure that folks understand exactly WHAT that stat is designed to do, and hammer it home with examples. Before I’d read one of Studes’ articles on Situational Wins, I just thought that WPA/LI was nothing more than an intermediate step in calculating “Clutch”.

      Taking Situational Wins to the saber-masses is probably parallel to taking WAR to the overall baseball-fan-masses. Lots of the objections to WAR come from distrust of the intricacy of the calculations (and sometimes lack of apparent transparency). Getting folks to at least know the actual INTENT of WAR (a fair scale to estimate how much more a major-leaguer contributes than a bottom-of-the-roster guy) might get them to buy in even if they don’t yet 100% grasp all the technical details.

      • Tangotiger said...

        I introduced WPA/LI (aka Situational Wins) on my blog in a very transparent way, step by step. There was nothing about “clutch” discussion there. So, I had the target audience for it, and I wasn’t able to describe it well enough.

        If you thought of the metric in terms of some intermediate step toward Clutch, then this just means you’ve been out of the original loop. Though I’m glad you are in the loop now.

        Studes was probably the most interested of all the readers, the one who was able to best understand it, and he still hasn’t grasped all the nuances. This is both a compliment to Studes and a sign of how far away we are from being able to explain the metric.

        It’s a tough, tough metric to appreciate, but you will be rewarded once you “get it” fully. It’s not something to just read once and walk away from. An equivalent would be learning an instrument, and needing to do the lessons every week until you nail that opening to Smoke on the Water as if you’ve been playing it all your life. So, you really need dedication here.

  8. tz said...

    I was definitely out of the original loop – my sabermetric reading has a huge gap running from the Bill James Baseball Abstracts until a few years ago (coinciding with my kids’ first eighteen years). Fangraphs has been the gateway for me to dive into more details, so for example it was only yesterday that I first read your original blog on WPA/LI. Better late than never….

    My next “to-do” is to bookmark your post on Game-State wOBA for more careful reading than I can do during the workday. From my lunchtime scan of that post, there’s plenty more food for thought as I grapple with this metric.

    Thanks.
