The One About Win Probability
by Dave Studeman
December 27, 2004
I talk a lot about Win Shares, because they do something I think is really valuable -- they estimate the contribution each player has made to his team's wins. This is an entirely different way of thinking about players, stats and value -- because it measures every baseball event within the context of the ultimate goal: winning games.
But Win Shares are not the only way to skin this cat. There is another process that goes by many names and has been "introduced" to the public many times. In fact, a 2003 Business Week article claimed that "a novel way to evaluate baseball players" had been invented by people who wanted to bottle, patent and sell it. Too late. It was first introduced by the Mills brothers in the early 1970s, and it's been done many times since.
As I said, it goes by many names: "Player Win Averages" (Mills brothers), "Player Game Percentage" (Bennett), "Win Probability Added" (Drinen), "Win Expectancy" (Tangotiger), "Game State Wins" (Rhoids website), "Player's Win Value" (Ed Oswalt) and WRAP (Lonergan and Polak). I'm sure there are other people who have done the same thing and given it a name that I've not acknowledged here. I apologize if you're one of those people.
For purposes of this article, I'm going to use "Win Probability," a shortened form of the Drinen name, because I think it's the most descriptive.
Here's the basic idea. An average team, at any point in a game, has a certain likelihood of winning the game. For instance, if you're leading by two runs in the ninth inning, your chances of winning the game are much greater than if you're leading by three runs in the first inning. With each change in the score, inning, number of outs, base situation or even pitch, there is a change in the average team's probability of winning the game.
Christopher Shea has invented a "Win Expectancy Finder" to look up the actual Win Probability of every base/out, inning and score combination of all Major League games from 1979 to 1990. Chris used Retrosheet data that had been compiled by Phil Birnbaum, and his WE Finder simply looks up the percent of times a team in a given situation went on to win the game during those years. Next time you watch a ballgame, use it to track the ups and downs of the game. It will change the way you watch baseball.
Here's an example: Bottom of the ninth, score tied, runner on first, no one out. The home team has a 71% chance of winning according to the Win Expectancy Finder (in this situation, the home team won 1,878 of 2,631 games between 1979 and 1990). Let's say the batter bunts the runner to second. Good idea, right? Well, after a successful bunt, with a runner on second and one out, the Win Probability actually decreases slightly to 70% (home team won 1,300 of 1,848 games), according to the WE Finder. The bunter hasn't really helped or hurt his team; his bunt was a neutral event.
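The bunt example comes down to two divisions. Here is a minimal sketch of that lookup arithmetic, using only the win and game counts quoted above; the function and variable names are my own and are not part of the WE Finder.

```python
# Win Probability as an empirical frequency: wins / games for one game state.
# The counts are the ones quoted in the text (WE Finder, 1979-1990).

def win_probability(wins: int, games: int) -> float:
    """Share of games the home team went on to win from a given state."""
    if games <= 0:
        raise ValueError("games must be positive")
    return wins / games

# Bottom of the ninth, score tied, home team batting.
runner_on_first_no_outs = win_probability(1878, 2631)
runner_on_second_one_out = win_probability(1300, 1848)

# Change in the home team's Win Probability caused by the successful bunt.
bunt_change = runner_on_second_one_out - runner_on_first_no_outs

print(f"Before bunt: {runner_on_first_no_outs:.3f}")
print(f"After bunt:  {runner_on_second_one_out:.3f}")
print(f"Change:      {bunt_change:+.3f}")
```

The change comes out at about one percentage point in the wrong direction, which is why the bunt reads as essentially neutral.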
If you're managing a team, or even following the game, you might want to know this sort of thing. Of course, the application of actual strategy (should he bunt or not?) depends on a lot of other factors, such as the skills of the batter, the pitcher and the baserunner, the following batters in the order, the game conditions and probably a number of other things. But Win Probability sets the baseline for evaluating each event on the field.
To really have fun with this system, you can take it one step further and track something Drinen calls "Win Probability Added" (WPA).
Once again, the concept is simple. Let's say our batter in the bottom of the ninth hits a single to put runners on first and third with no outs. This increases the Win Probability from 71% to 87%, for a gain of 16%. So, in a WPA system you credit the batter +.16 and debit the pitcher/fielder -.16. If you add up every positive and negative event from the beginning to the end of a game, you wind up with a total for the winning team of .5, and a total for the losing team of -.5. And the player with the most points will have contributed the most to his team's win.
By the way, that 87% with runners on first and third in the bottom of the ninth is on the low side for reasons I'll discuss in a minute.
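To make the bookkeeping concrete, here is a small sketch of a WPA ledger. The player labels, the event format and the abbreviated game path are all made up for illustration; only the 71% and 87% figures come from the example above.

```python
from collections import defaultdict

def tally_wpa(events):
    """Sum Win Probability Added per player.

    Each event is (batter, pitcher, wp_before, wp_after), with both
    probabilities stated for the batting team. The batter is credited
    with the change and the pitcher is debited the same amount.
    """
    ledger = defaultdict(float)
    for batter, pitcher, wp_before, wp_after in events:
        delta = wp_after - wp_before
        ledger[batter] += delta
        ledger[pitcher] -= delta
    return dict(ledger)

# The single from the text: the home team goes from 71% to 87%.
events = [("home_batter", "visiting_pitcher", 0.71, 0.87)]
ledger = tally_wpa(events)
print(ledger)

# Why the winner's total is +.5: the changes telescope from the starting
# value of .5 to the final value of 1. This path is a hypothetical,
# heavily abbreviated game, not real data.
home_wp_path = [0.50, 0.71, 0.87, 1.00]
home_total = sum(after - before
                 for before, after in zip(home_wp_path, home_wp_path[1:]))
print(f"Home team total: {home_total:+.2f}")
```

Every event is zero-sum between the two sides, so the losing team's total is simply the mirror image of the winner's.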
If you were to track an entire season in this manner, you would have a Win Contribution metric that is more accurate than Win Shares, because it is based on how much each event actually contributed to the team's wins. In a way, WPA is the ultimate baseball statistic. And in a way, it is not.
Like Win Shares, WPA is not a good predictive statistic because it's not necessarily a good representation of a player's true talent. If a player hits a home run in the ninth inning of a 1-0 game, he is credited with more WPA points than if he hits a home run in the first inning of a 1-0 game. The talent is the ability to hit the home run; when it happens in a game is something that is pretty random. When you are thinking of acquiring a player for your fantasy team, you should rely more on the traditional sabermetric stats, like Linear Weights, Runs Created, DIPS and so on.
Also, WPA measures the impact of an event while the game is in progress, not after the game is over. After the game is over, the score is 1-0, and it doesn't matter when the batter hit the home run. But during the game, it matters a lot. Good managerial strategies, for instance, are based on an implicit understanding of Win Probabilities. And if there is such a thing as clutch performance, WPA might unearth it.
The most interesting and useful application of Win Probability Added -- the one that Drinen, Tangotiger and others have spent a lot of time on -- is the evaluation of relief pitchers and the managers who call on them. We all know that closers are important, even though they pitch fewer than 100 innings a year, right? Why? Because they pitch the most important innings.
Using the WE Finder again, if a pitcher gives up a bases-empty home run in the first inning of a tie game, his team's Win Probability decreases about 10%. If he does the same thing in the eighth inning, it decreases about 25%, because his team has less time to come back. In this context, the eighth inning is about 2.5 times more important than the first inning. And if you apply this sort of analysis to every appearance made by a relief pitcher, you can quantify the importance of all of his innings pitched.
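Here is a rough sketch of that arithmetic. The 10% and 25% swings are the approximate figures above; the reliever's appearances are invented, and the innings-weighted average is just one simple way to roll the ratios up. It is not the formula behind any published system.

```python
def relative_importance(wp_swing: float, baseline_swing: float) -> float:
    """How many times larger a Win Probability swing is than a baseline swing."""
    return wp_swing / baseline_swing

# Approximate swings from a bases-empty home run in a tie game.
first_inning_swing = 0.10
eighth_inning_swing = 0.25

ratio = relative_importance(eighth_inning_swing, first_inning_swing)
print(f"Eighth inning vs. first inning: {ratio:.1f}x")

def average_importance(appearances):
    """Innings-weighted average importance over a list of appearances.

    Each appearance is (innings_pitched, importance), where importance is
    a ratio like the one computed above (1.0 = first-inning baseline).
    """
    total_innings = sum(innings for innings, _ in appearances)
    weighted = sum(innings * importance for innings, importance in appearances)
    return weighted / total_innings

# Hypothetical reliever: one tight eighth inning, one ordinary inning,
# and two low-stakes mop-up innings.
appearances = [(1.0, 2.5), (1.0, 1.0), (2.0, 0.5)]
reliever_importance = average_importance(appearances)
print(f"Average importance of innings pitched: {reliever_importance:.3f}")
```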
Tangotiger developed a system called "Leveraged Index" that measures and sums the potential "Win Probability Added" for each pitcher's appearance. Doug Drinen tracked a similar measure, called "P," in the Big, Bad Baseball Annual. Though the math behind each system differs, they are both constructed to measure the importance of relief innings. You might particularly enjoy Tango's Crucial Situations article and chart.
Win Shares, by the way, includes an approximation of WPA for relief pitchers, based on each pitcher's saves and holds. I've been playing with a similar system myself, and I hope to roll out some analysis in the next few weeks.
There are a couple of reasons the Win Expectancy Finder isn't the best source of Win Probabilities. First, it's based on the years 1979 through 1990, when there were fewer runs scored per game than in the past few years. A one-run lead was safer back then. Also, there are sample size issues with some of the situations. For instance, there were only 220 games with runners on first and third and no outs for the home team in the bottom of the ninth with the score tied. That's not a large enough sample to pin down the percentage precisely. So you shouldn't take the numbers in the WE Finder as "gospel."
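To put a rough number on the sample size problem, here is a sketch that treats each game as an independent trial and applies the usual normal approximation for a proportion. Both of those are simplifying assumptions on my part; the 87% and 220 figures, and the 1,878-of-2,631 comparison, come from the text.

```python
import math

def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion p over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Runners on first and third, bottom of the ninth, tie game: 87% over 220 games.
small_p, small_n = 0.87, 220
small_se = proportion_standard_error(small_p, small_n)

# Approximate 95% interval under the normal approximation.
low = small_p - 1.96 * small_se
high = small_p + 1.96 * small_se
print(f"220 games:   {small_p:.2f} +/- {1.96 * small_se:.3f} ({low:.3f} to {high:.3f})")

# For comparison: runner on first, no outs, 1,878 wins in 2,631 games.
large_n = 2631
large_p = 1878 / large_n
large_se = proportion_standard_error(large_p, large_n)
print(f"2,631 games: {large_p:.2f} +/- {1.96 * large_se:.3f}")
```

With only 220 games the standard error is a bit over two percentage points, so the 87% figure could easily be off by several points in either direction.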
The better way to build your Win Probability table is with something called "Markov Chains." I won't go into all the math here (because I doubt I can really explain it well), but suffice it to say that a proper Win Probability table is something a good mathematician can concoct, based on the probability of scoring a certain number of runs from each base/out situation.
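For a flavor of the approach, here is a deliberately coarse sketch. Instead of the full base/out chain, it steps through the game one half-inning at a time and works backward from the final out. The run distribution is invented for illustration, both teams are assumed identical, and a tie after nine innings is treated as a coin flip. A real table would be built from base/out transition probabilities.

```python
from functools import lru_cache

# Hypothetical probability of scoring exactly k runs in one half-inning.
# Illustrative numbers only; they are not estimated from real data.
RUNS_PER_HALF_INNING = {0: 0.73, 1: 0.15, 2: 0.07, 3: 0.03, 4: 0.02}

TOTAL_HALF_INNINGS = 18  # nine innings, top and bottom

@lru_cache(maxsize=None)
def home_win_probability(half_inning: int, home_lead: int) -> float:
    """Home team's chance of winning at the start of a half-inning.

    half_inning counts from 0 (top of the first) to 17 (bottom of the ninth);
    home_lead is home runs minus visitor runs at that point.
    """
    if half_inning == TOTAL_HALF_INNINGS:
        if home_lead > 0:
            return 1.0
        if home_lead < 0:
            return 0.0
        return 0.5  # assumption: extra innings are a coin flip

    home_batting = half_inning % 2 == 1
    total = 0.0
    for runs, prob in RUNS_PER_HALF_INNING.items():
        next_lead = home_lead + runs if home_batting else home_lead - runs
        total += prob * home_win_probability(half_inning + 1, next_lead)
    return total

start_of_game = home_win_probability(0, 0)
up_three_in_first = home_win_probability(0, 3)
up_two_in_ninth = home_win_probability(16, 2)

print(f"Start of game, tied:        {start_of_game:.3f}")
print(f"Up three, top of the first: {up_three_in_first:.3f}")
print(f"Up two, top of the ninth:   {up_two_in_ninth:.3f}")
```

Even this toy version reproduces the basic pattern: a two-run lead entering the ninth is worth more than a three-run lead at the start of the game.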
And there are still a lot of Win Probability issues to be resolved. For instance, Win Probability tables really should be altered based on the home park. To track WPA on a regular basis, you need play-by-play data, so you can't create it for most of baseball history. And Win Probability doesn't solve the sticky issue of splitting credit between pitching and fielding (something Lonergan and Polak admit).
Win Probability is a complicated subject, and there's so much more I could say. But I hope this article serves as a good introduction to a topic I plan to return to in the future.
References and Resources
Here's an example of a game I tracked with my own Win Probability tables. Cyril Morong has a nice review of Win Probability hitting stats on his website. If you're a Baseball Prospectus subscriber, you can find a table of Win Probabilities based on 2004 games in their stats section.
I recommend Alan Schwarz's "The Numbers Game" for a very nice history of the evolution of Win Probability (including the critical role of George Lindsey). My thanks go to Tangotiger, Doug Drinen and Jon Daly for their support and education in this subject.
Dave was called a "national treasure" by Rob Neyer. Seriously. Comments about this article can be sent to him through the miracle of e-mail.