What WPA can tell us

Win Probability Added is a fun stat. As THT’s own Dave Studenmund likes to point out, it’s the story stat. Designed decades ago, WPA calculates how each event within the game affects the chances of each team winning. It tells us the story of the game: how it feels, where the big moments of drama are. A home run in the ninth inning of a 19-2 game won’t chart at all; that game’s result already has been decided. A home run in the ninth inning of a tie game will have a huge impact, because it will drastically change the odds of a team winning.

Since WPA is designed to chart the ebbs and flows of a game, how the drama rises and falls, I figure that with a large enough sample of games, you can get a sense of where the drama in a game generally comes from. A big enough sample lets you dig into some questions. Thanks to a little bit of digging on my own part, and a really big assist from Eric Johnson, I have a sample of over 7,000 games to work with: all contests from the 2004, 2011, and 2012 seasons.

How much drama is in a typical game?

The absolute minimum WPA value of a game is 0.500. Both teams start off with a 50 percent chance of winning and each game ends with the victor at 1.000 (a 100 percent chance of victory) and the loser at zero. So if a team scores a ton of runs in the top of the first and then shuts down its opposition, the WPA will be around 0.500 (1.000 minus 0.500). There is no upper limit on a one-game WPA. You can always have another swing in fortune, and the game always can go into extra innings.
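
To make that concrete, here’s a minimal Python sketch of how a game’s WPA score could be totaled, assuming you already have one team’s win probability after each play. The function name `game_wpa` and both sample games are hypothetical, made up for illustration; they aren’t taken from Baseball-Reference.com.

```python
def game_wpa(home_win_probs):
    """Sum the absolute win-probability swings across a game.

    home_win_probs: the home team's win probability after each play,
    ending at 1.0 (home win) or 0.0 (home loss). The game is assumed
    to start at 0.5 for both teams.
    """
    total = 0.0
    previous = 0.5  # both teams start with a 50 percent chance
    for prob in home_win_probs:
        total += abs(prob - previous)  # size of the swing, ignoring direction
        previous = prob
    return total


# Hypothetical blowout: the home team's odds only ever go up.
blowout = [0.60, 0.75, 0.90, 0.97, 1.0]
# Hypothetical seesaw: the edge changes hands before the home team wins.
seesaw = [0.65, 0.40, 0.70, 0.30, 0.55, 1.0]

print(round(game_wpa(blowout), 3))
print(round(game_wpa(seesaw), 3))
```

The one-way game lands right on the 0.500 floor, while every reversal in the seesaw game adds to the total.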

So what’s a typical WPA score? Well, the 7,287 games on file have a total WPA of 19,646.965. Thus, an average game should have a WPA value of 2.696, which is about five times the lowest possible score. No game scores the ultimate low of 0.500.

The lowest I have is 0.696, which is exactly 2.000 below the average. That’s a neat bit of symmetry there. This value came when the Red Sox torched the Indians, 14-2, on May 25, 2011. Boston scored seven runs in the top of the first to seal it away. Cleveland didn’t score until the eighth, and all their base runners in the first seven innings came with two outs, which dampens the game’s WPA score.

While no game is more than 2.000 below the average WPA score, 666 games are at least 2.000 higher, with a mark over 4.696. A game can have an infinite number of twists in it. The highest score is the June 24, 2004, Texas-Seattle contest. The Rangers won, 9-7, in 18 topsy-turvy innings featuring 35 hits, 16 walks, and five double plays. Its WPA: 9.517. All by itself that compensates for four 0.696 games.

Okay, so if the higher-scoring games can affect the average more than the lower-scoring games can, what’s the median WPA result? Well, the 3,644th best score out of the 7,287 games is 2.497. One game had that score exactly: the Dodgers’ 3-1 interleague victory over the Angels on June 23, 2012. If you can’t remember that one, apparently there’s good reason. By WPA, it was a perfectly generic game.

Here’s a chart to show how often you get various WPA scores. Included at the side of the chart is how often you’d see this score per 162 games, to get a sense of how often you’ll see something like this each year:

WPA Score	Games	Per 162
Up to 1.000	 163	 4
1.001 to 1.500	 815	18
1.501 to 2.000	1249	28
2.001 to 2.500	1425	32
2.501 to 3.000	1173	26
3.001 to 3.500	 913	20
3.501 to 4.000	 632	14
4.001 to 4.500	 599	13
4.501 to 5.000	 135	 3
5.001 or more	 183	 4
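
For anyone who wants to check the right-hand column, here’s a short Python sketch of the per-162 conversion. The counts are copied from the chart; the helper name `per_162` is my own label, not anything official.

```python
# Game counts per WPA bucket, copied from the chart above.
buckets = [
    ("Up to 1.000", 163),
    ("1.001 to 1.500", 815),
    ("1.501 to 2.000", 1249),
    ("2.001 to 2.500", 1425),
    ("2.501 to 3.000", 1173),
    ("3.001 to 3.500", 913),
    ("3.501 to 4.000", 632),
    ("4.001 to 4.500", 599),
    ("4.501 to 5.000", 135),
    ("5.001 or more", 183),
]

total_games = sum(count for _, count in buckets)


def per_162(count, total):
    """Scale a bucket's share of all games to a 162-game season."""
    return round(count / total * 162)


for label, count in buckets:
    print(f"{label:<16}{count:>6}{per_162(count, total_games):>5}")
```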

Before moving on, there’s one other item I want to check on: what do the averages look like when you break a game’s score into its components? You see, Baseball-Reference.com gives each team’s positive and negative scores: the cumulative mark for every plate appearance that helps the team’s chances of winning, and the cumulative mark for every plate appearance that hurts them.

What are the average marks when you dig down? What’s the average positive swing for a winning team and its average negative swing? What’s a typical swing for the losers?

Well, a typical winning team has an average positive swing of +0.805, and its typical negative swing is –0.560. For losing squads, it’s +0.539 and –0.794.

When you dig into these details, the award for the most generic game of all is a three-way tie. As it happens, all three games featured the White Sox. Their 10-3 win over the Royals on Sept. 17, 2011, a 9-5 loss to the Blue Jays on June 5, 2012, and a 7-5 triumph over Boston on July 17, 2012, had their four components closest to the perfectly average score.

Are games more won or lost?

Since we have such a huge sample size, here’s one question: is the result of a game more due to the victorious team winning it, or is a contest’s result primarily caused by the failures of the losing team? In other words, where do you see the most WPA value pile up, with the winner or loser?

Well, I suppose this shouldn’t come as too much of a surprise, but it’s pretty evenly distributed. As noted above, the games have a WPA total of 19,646.965. Winning teams have a slight majority of that share, though: 50.6 percent (9,946.639) versus 49.4 percent for the losing team.
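
The split is simple arithmetic on the two totals quoted above. Here’s a quick Python sketch of it; the variable names are just my own labels.

```python
total_wpa = 19_646.965   # combined WPA of all games on file
winner_wpa = 9_946.639   # portion credited to the winning teams

# Whatever isn't credited to the winners belongs to the losers.
loser_wpa = total_wpa - winner_wpa

winner_share = winner_wpa / total_wpa
loser_share = loser_wpa / total_wpa

print(f"winners: {winner_share:.1%}")
print(f"losers:  {loser_share:.1%}")
```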

Winners pull away a bit more when you go game-by-game. In 3,918 contests—53.8 percent of the total—the score of winners accounts for the majority of the WPA action. That doesn’t include three games where it was perfectly even.

Again, things cluster toward the middle. In over a third of the games, both winner and loser account for more than 45 percent of the total. In another third, both teams fall between 40 and 60 percent of the game’s overall value. About once every 25 games you have a contest where one team is responsible for two-thirds of the overall drama of the game.

The most one-sided game was a 14-3 Padres win over the Marlins in July of 2011. The Padres’ WPA value was nearly nine times as high as that of the Marlins. San Diego had positive WPA swings of 0.518 and negative swings of -0.059. Florida barely did anything to impact the game, though: 0.013 positive swings and –0.055. That game was 13-0 by the end of the second inning.

The most extreme case of a team losing was when the Red Sox lost, 2-1, to the Indians on May 3, 2004. Cleveland had very modest WPA swings: +0.264 and –0.370, while Boston had huge marks: +0.999 and –1.605. What happened? Well, Boston kept getting guys on base then not scoring. Since it was a close game, getting guys on caused the Red Sox’s WPA scores to rise, only to have them deflate with every snuffed rally. Eight hits and six walks yielded just one run. They had 14 runners left on base, including five at third base and a quartet at second.

Game length and WPA

One way for a game to boost its WPA score is to go into extra innings. Those games typically do better. In fact, the 100 highest-scoring games all went into extra innings. (The best-scoring regulation game? May 17, 2012: Arizona over Colorado, 9-7, with a mark of 6.018 thanks to a flurry of late-inning action).

The average nine-inning game has a WPA score of 2.475, over 0.200 lower than the overall average. The 640 extra-inning games have a typical WPA score more than double that: 5.016. It doesn’t take too long for the numbers to really pile up. Here are the averages by game length.

Games	Innings	Average
1	  5	1.426
3	  6	1.371
7	  7	2.150
8	  8	1.881
6628	  9	2.475
297	 10	4.480
156	 11	4.887
88	 12	5.544
41	 13	6.150
32	 14	6.872
13	15+	7.567

Even at 10 innings, WPA values are well above the nine-inning average. The reasons for that are straightforward. Many regulation games are decided well before the ninth, so they don’t generate many points late. In extra-inning games, however, points are always racking up. If the game is close in the ninth, even a routine out can change the WPA score by a decent amount, especially if there’s a rally on. So the extra-inning games typically have more points through nine and then get to pile even more on after that.

In fact, the lowest-scoring extra-inning game is actually higher than the nine-inning average. On May 28, 2011, the Twins topped the Angels, 1-0, in 10 innings for a WPA of 2.554. That was actually a fantastic game, with the clubs combining for just three hits in the first nine innings, but pitchers’ duels don’t register with WPA. It prefers some back-and-forth, not a steady-as-it-goes contest.

Run differential

This should be a basic one, but let’s check it out anyway. What’s the average WPA score for a game if it’s decided by one run, or two runs, and so on up into double digits?

DIF	Games	Average
1	2087	3.654
2	1372	2.943
3	1052	2.539
4	 892	2.209
5	 587	2.002
6	 425	1.858
7	 299	1.719
8	 191	1.600
9	 155	1.506
10+	 226	1.420

Yeah, that makes sense. The closer the game, the higher the score, just as you’d suspect. What happens if you take out extra-inning games? After all, we just saw that those games have the highest WPA scores, and clearly most of them will have close final scores.

Things change a bit, but the results remain roughly the same:

DIF	Games	Average
1	1626	3.265
2	1273	2.785
3	1003	2.423
4	 873	2.151

Here’s another way of looking at it. The average score in a nine-inning game decided by one run is 3.266. Well, games decided by two runs have a score that high just one in every four times. Nine-inning games decided by three runs score 3.266 or higher 133 times out of 1,003. Barely one of every 14 games decided by four runs is as good as the average one-run game.

Yeah, that all makes sense if you think about it. And it should make sense, otherwise that means there’s something rather wrong with a stat that’s designed to gauge how the game feels.

Final scores

Let’s take the above one step further. We know close games are better than blowouts. Shocker. Now let’s ask this: what’s the most exciting final score for a game to have? What’s better, a 3-2 game, a 5-3 tally, or an 11-8 contest?

Well, the games on file have 152 different final scores, from 1-0 to 16-15. But half of those scores appear 10 times or fewer. Only 27 scores appear at least 100 times. Those account for nearly three-fourths of all the games on file. So how do those final scores rank by WPA? Here are the results (the first two columns give the final scores):

RunsW	RunsL	Games	Avg WPA
7	6	156	4.115
6	5	213	3.998
5	4	339	3.940
4	3	371	3.641
3	2	395	3.386
7	5	123	3.306
2	1	309	3.109
6	4	178	3.080
5	3	254	3.020
4	2	293	2.799
1	0	132	2.731
7	4	135	2.727
6	3	183	2.689
3	1	227	2.489
5	2	196	2.487
8	4	101	2.360
7	3	153	2.289
2	0	130	2.222
4	1	223	2.220
6	2	186	2.160
5	1	182	1.958
7	2	114	1.949
3	0	136	1.945
8	2	104	1.884
6	1	152	1.799
4	0	130	1.736
5	0	104	1.572

(Random side note: apparently the most common final score is 3-2).

Anyhow, the WPA results make sense. Closer equals better, and higher-scoring contests have more swings than lower-scoring ones, so the highest-scoring close games are at the top.

What’s interesting is that close generally trumps high-scoring. The top five games listed—and six of the top seven—are all one-run decisions. Even the 1-0 game ranks ahead of any game decided by three runs. I find it interesting that there is virtually no difference between a 2-0 game and a 4-1 contest. Ditto with 7-2 and 3-0 final scores.
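
Those observations can be checked straight from the table. Here’s a Python sketch that does so; the rows are copied from the table above, and the helper `margin` is my own naming.

```python
# (winner runs, loser runs, games, average WPA), copied from the table above.
final_scores = [
    (7, 6, 156, 4.115), (6, 5, 213, 3.998), (5, 4, 339, 3.940),
    (4, 3, 371, 3.641), (3, 2, 395, 3.386), (7, 5, 123, 3.306),
    (2, 1, 309, 3.109), (6, 4, 178, 3.080), (5, 3, 254, 3.020),
    (4, 2, 293, 2.799), (1, 0, 132, 2.731), (7, 4, 135, 2.727),
    (6, 3, 183, 2.689), (3, 1, 227, 2.489), (5, 2, 196, 2.487),
    (8, 4, 101, 2.360), (7, 3, 153, 2.289), (2, 0, 130, 2.222),
    (4, 1, 223, 2.220), (6, 2, 186, 2.160), (5, 1, 182, 1.958),
    (7, 2, 114, 1.949), (3, 0, 136, 1.945), (8, 2, 104, 1.884),
    (6, 1, 152, 1.799), (4, 0, 130, 1.736), (5, 0, 104, 1.572),
]


def margin(row):
    """Run differential for one final score."""
    return row[0] - row[1]


# Rank the final scores from highest to lowest average WPA.
ranked = sorted(final_scores, key=lambda row: row[3], reverse=True)

one_run_in_top_seven = sum(1 for row in ranked[:7] if margin(row) == 1)
best_three_run = max(row[3] for row in final_scores if margin(row) == 3)
one_nothing = next(row[3] for row in final_scores if row[:2] == (1, 0))

print("one-run scores in the top seven:", one_run_in_top_seven)
print("1-0 average:", one_nothing, "| best three-run average:", best_three_run)
```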

In general these results make sense. Then again, they’re supposed to, aren’t they?

References & Resources
All WPA info comes from Baseball-Reference.com. I logged in the 2004 games, and Eric Johnson tallied WPA scores for the 2011-12 games. He normally lists the Game of the Day in the Daily Dugout section at Baseball Think Factory.


11 Comments
studes
10 years ago

Nice job, Chris. One question, however.

Every play is a positive WPA total for one team and an equal negative total for another team.  Given this, how can you say whether a game is won or lost more often?  I don’t understand that section.

Chris Jaffe
10 years ago

Studes – I guess a better way of saying it is which team gets more credit from WPA for impacting the game.  Whose plate appearances have the biggest impact – the winning team or the losing team?  By a slight edge, it’s the winning team.

studes
10 years ago

So, when you say “whose plate appearances”, that means you’re looking at offense only? Also, are you looking at absolute WPA, or just positive or negative WPA?

BTW, I wouldn’t trust that “slight margin.”  Setting WPA tables for individual games or even seasons isn’t a perfect science.

Chris Jaffe
10 years ago

Studes, it’s positive and negative WPA.  But looking at the size of the swing, not if it’s plus or minus.  To put in Excel terms, it’s =ABS instead of =SUM.

For example, look at yesterday’s SDP-TOR game:
http://www.baseball-reference.com/boxes/SDN/SDN201306020.shtml

For Toronto, when their hitters did something that improved their chances of winning, the team’s WPA went up +1.453. When they did something that hurt their chances of winning, it adds up to -1.226.  For San Diego hitters, good plays had a WPA of +0.736 and bad plays had a WPA of -1.008.

Overall, WPA swings in that game were 4.423: 2.679 when Toronto batted and 1.744 when San Diego batted.
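
The =ABS versus =SUM distinction can be sketched in a few lines of Python. The four figures are the ones quoted in this comment; the function names `net_wpa` and `absolute_wpa` are hypothetical labels of mine, not Baseball-Reference terms.

```python
# Positive and negative batting WPA totals quoted in the comment above.
toronto = {"positive": 1.453, "negative": -1.226}
san_diego = {"positive": 0.736, "negative": -1.008}


def net_wpa(team):
    """The =SUM view: good and bad plays cancel each other out."""
    return team["positive"] + team["negative"]


def absolute_wpa(team):
    """The =ABS view: total size of the swings, whichever way they went."""
    return abs(team["positive"]) + abs(team["negative"])


game_total = absolute_wpa(toronto) + absolute_wpa(san_diego)

print(round(absolute_wpa(toronto), 3))
print(round(absolute_wpa(san_diego), 3))
print(round(game_total, 3))
```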

Shane Tourtellotte
10 years ago

Chris, another thing I might look at is absolute WPA for home and road teams.  I don’t know whether there is more potential for one or the other to rack up more WPA movement—though if I had to guess I’d say the home team—but I think it’s worth checking.

One would have to watch out for a couple confounding factors.  Winning home teams not playing the ninth would be one.  Winning teams (which you’ve shown generate a bit more absolute WPA) tending to be home teams (home field advantage) would be another.  How much correlation and causation is involved in that connection, I couldn’t guess yet.  Not without data.

studes
10 years ago

Thanks, Chris.  That’s what I figured.  But why look at only offense?

MikeS
10 years ago

I don’t know if you can say that WPA tells you that slightly more games are won by good play than lost by bad play since WPA doesn’t take defense into account.  I think a hitter gets credit for something positive when a fielder makes an error, and definitely if the fielder makes a bad defensive play, say taking a bad route to a fly ball, turning a potential out into a double.

That said, WPA doesn’t give defensive credit for good plays either and it may be reasonable to assume that a roughly equal number of games are won by good defense as lost by bad defense, but you are guessing with that assumption.

Jim
10 years ago

First thing that has to be considered is what is an exciting game?

This one http://www.retrosheet.org/boxesetc/1996/B06300COL1996.htm#?  No, not to me.

This one is a lot more exciting http://www.retrosheet.org/boxesetc/1956/B10080NYA1956.htm.

I saw them both and the first one made me go mow the grass.  I still get anxious on the second one and I know the outcome!

I believe a great game would be my team visiting and on the first pitch the leadoff batter hits a home run and both pitchers then throw nine perfect innings after that.  I’m sure WPA wouldn’t give much weight to the home run, nor to 54 straight outs until the very end of the game.  I don’t know, because I have not found an easy way to determine the changes each pitch makes to WPA.

The idea sounds fascinating, but it doesn’t care about pitching, let alone defense.  But, to me each pitch would mean something and be very exciting.

I’m sure someone else would have a different opinion.  Like you said in the first sentence, WPA is a fun stat, but not worth much more.

Chris Jaffe
10 years ago

Studes – not sure what you mean. If I looked at pitching, it would just be the inverse of what I have.  Besides, B-ref’s game logs break it down into + and – WPA for hitters, but just list cumulative WPA for pitchers (and like I said, pitcher WPA is just the inverse of batter WPA for the opposing team). 

Jim – that’s a good point.  WPA definitely has its own ideas of what exciting is, but often that does reflect what people find memorable in a game.  Late inning drama is an element in virtually every remembered game, and it accounts for big WPA scores.

studes
10 years ago

Chris, by looking at events from the offensive perspective, you associate those events with the offensive team.  The winner or loser, however the game plays out.

However, if you associate those events with the defensive team, you will get the opposite answer, right? Take the Red Sox/Indians game. You associate the big swings in WPA to the Sox because they were at bat, but what if you associated them with the defense instead? You’d say the Indians had the big swings in WPA.

I’m suggesting you can’t say whether teams “win or lose games” because you can’t say which team (offensive or defensive) was responsible for each swing in WPA. Your data is structured one way, but WPA is a reflection on both teams at the same time.

studes
10 years ago

By the way, regarding Jim’s point, you can use Leverage Index to rate games, too.  I combined WPA and LI to come up with this list of exciting games in 2012:

http://www.hardballtimes.com/main/article/must-see-mlbtv/