Lovers of great pitching are living in a golden age. The last six years, 2007 to 2012, have all had multiple no-hitters, including the first post-season no-no since Don Larsen. Even more striking has been the explosion of perfect games. In three and a half seasons, we have witnessed five of those gems (plus a sixth that was maddeningly close). The first 133 years of organized baseball produced 17 perfect games, roughly one every eight years. Right now, we’re getting them more than once a season.
Students of the game are asking why. The turning of the wheel toward a lower-scoring game is part of the equation, as is the greater number of games played in the major leagues each season, almost twice the number of 1960 and before. These seem inadequate to explain all of the current surge, though. More than one person has calculated the historic odds of an average pitcher getting 27 outs from 27 batters, multiplied it by all the games played, and concluded that things have gone haywire.
I did not aim for such a big conclusion. My curiosity started more modestly: I wanted to calculate how likely it was that the pitchers who did throw perfect games would have done so. Given their performance in the years they accomplished the deed, and throughout their careers, what were the underlying chances that they would get themselves into the record books this way?
The grungy details
The numbers are straightforward: how many batters the pitchers faced, how many batting outs they produced, then a lot of exponents. One needs to discount intentional walks and sacrifice bunts from the equations. The latter literally cannot happen in a situation where a perfect game is possible; the former practically never happens in such a situation. Thus, I do not count intentional passes as plate appearances, or sacrifice bunts as plate appearances or outs. Sacrifice flies, being glorified fly balls, count as plate appearances and outs.
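For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation as described above. The function names are my own inventions for illustration, and the inputs are assumed to be already adjusted (intentional walks and sacrifice bunts removed). It treats every batter as an independent trial at the pitcher's average out rate, which is the simplification behind the figures in this piece.

```python
def perfect_game_prob(adj_pa, reached, batters=27):
    """Chance of retiring `batters` hitters in a row at the average out rate.

    adj_pa:  plate appearances, already net of intentional walks and sac bunts
    reached: batters who reached base in those plate appearances
    """
    out_rate = (adj_pa - reached) / adj_pa
    return out_rate ** batters


def at_least_one(per_game, starts):
    """Chance of at least one perfect game in `starts` independent starts."""
    return 1 - (1 - per_game) ** starts


# Sandy Koufax, 1965 (post-season included): 44 starts, 1366 adjusted PA, 325 reached.
game = perfect_game_prob(1366, 325)
season = at_least_one(game, 44)
print(f"per game: {game:.4%}   per season: {season:.3%}")
```

Run on the Koufax 1965 line, this should land close to the 0.0652 percent and 2.827 percent shown for him in the season table.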
I also counted post-season appearances in the year and career numbers. Larsen’s perfect game came in the World Series, and Roy Halladay came one walk away from a playoff perfecto in 2010, “settling” for a no-hitter. Granted, there’s a heightened degree of difficulty when playing against a team that’s made the World Series (or earlier rounds), but they are games, and it’s not like they count less than the regular season.
For much of baseball history, all this works out fine. For five of the 22 perfect games, though, some numbers are missing from the records. Intentional walks were only regularly counted starting in 1955, and players reaching on errors were tracked from 1948 on. So for Lee Richmond and Monte Ward in 1880, Cy Young in 1904, Addie Joss in 1908, and Charlie Robertson in 1922, I had to do some guesstimating.
Even at a site like this, going into great detail would strain readers’ patience. I will note one peculiarity I discovered while doing this work. In 1948, the first year reached on error was tabulated, 33.3 percent of recorded errors resulted in a batter reaching base. The next year, this jumped to 40.8 percent, then 48.8 percent, then 52.5 percent in 1951, and it crept upward from there. (The 2011 figure was 58.4 percent.) I doubt that actual fielding competence, or scorers’ rulings, changed this rapidly. It appears that, despite lack of notation in many split lines, the ROE totals are incomplete for the first several years that the stat was tracked. Future researchers, take note.
The table below shows seasonal statistics for all 22 pitchers to throw perfect games. Season stats for Philip Humber and Matt Cain are complete through July 1, and must be adjusted accordingly to compare them with full-season numbers.
| Pitcher | Year | Starts | PA (Adj'd) | Reached | Outs/PA | Prob-Game | Prob-Season |
|---|---|---|---|---|---|---|---|
| Lee Richmond | 1880 | 66 | 2410 | 750 | 0.689 | 0.00425% | 0.280% |
| Monte Ward | 1880 | 67 | 2351 | 678 | 0.712 | 0.0102% | 0.684% |
| Cy Young | 1904 | 41 | 1433 | 394 | 0.725 | 0.0170% | 0.694% |
| Addie Joss | 1908 | 35 | 1208 | 295 | 0.756 | 0.0521% | 1.808% |
| Charlie Robertson | 1922 | 34 | 1118 | 400 | 0.642 | 0.000642% | 0.0218% |
| Don Larsen | 1956 | 22 | 795 | 254 | 0.681 | 0.00306% | 0.0674% |
| Jim Bunning | 1964 | 39 | 1123 | 322 | 0.713 | 0.0109% | 0.425% |
| Sandy Koufax | 1965 | 44 | 1366 | 325 | 0.762 | 0.0652% | 2.827% |
| Catfish Hunter | 1968 | 34 | 949 | 286 | 0.699 | 0.00623% | 0.212% |
| Len Barker | 1981 | 22 | 654 | 207 | 0.683 | 0.00345% | 0.0759% |
| Mike Witt | 1984 | 34 | 1022 | 326 | 0.681 | 0.00313% | 0.106% |
| Tom Browning | 1988 | 36 | 992 | 282 | 0.716 | 0.0120% | 0.430% |
| Dennis Martinez | 1991 | 31 | 895 | 265 | 0.704 | 0.00764% | 0.237% |
| Kenny Rogers | 1994 | 24 | 710 | 236 | 0.668 | 0.00183% | 0.0439% |
| David Wells | 1998 | 34 | 967 | 263 | 0.728 | 0.0190% | 0.643% |
| David Cone | 1999 | 33 | 877 | 287 | 0.673 | 0.00224% | 0.0742% |
| Randy Johnson | 2004 | 35 | 956 | 242 | 0.747 | 0.0378% | 1.315% |
| Mark Buehrle | 2009 | 33 | 860 | 276 | 0.679 | 0.00290% | 0.0955% |
| Dallas Braden | 2010 | 30 | 775 | 232 | 0.701 | 0.00674% | 0.202% |
| Roy Halladay | 2010 | 36 | 1066 | 294 | 0.724 | 0.0165% | 0.591% |
| Philip Humber | 2012 | 12 | 299 | 103 | 0.656 | 0.00112% | 0.0134% |
| Matt Cain | 2012 | 16 | 443 | 115 | 0.740 | 0.0299% | 0.477% |
There are only three pitchers showing a better than 1 percent chance to throw a perfect game the season they did it: Addie Joss, Sandy Koufax, and Randy Johnson. (Matt Cain might get there with a hot second half and/or some playoff starts.) Koufax’s number exceeds the rest through a convergence of factors: he had a great season with a lot of starts in a historically low run environment. Randy Johnson had a better ERA+ in his perfect-game year than Koufax (176 vs. 160), but he was pitching in a five-man rotation at the tail end of an offensive explosion. Joss’ ERA+ outstripped both, and he pitched deep in a deadball era, but a higher error rate depressed his chances.
Six pitchers had less than a 0.1 percent chance of making history that season. Larsen, Barker, and Rogers had the fewest starts of anyone on the list (except this year’s entries so far), helping explain their long odds. Cone pitched well, but yielded lots of walks. Buehrle has a reputation for getting results better than his underlying numbers, which certainly happened one day in 2009.
As for Charlie Robertson, he comes across as the biggest fluke on the list. He didn’t pitch badly that year, with a 111 ERA+, but much of that value came in suppressing extra-base hits during the early stretch of the Ruthian bat-boom. His OBP-against at .346 was just better than the suddenly inflated league average. Perfect games were next to impossible in that offensive flood, and the relatively anonymous name of the man who managed it just highlights the unlikeliness.
The perfect game from the pitcher having the worst season has a fairly unexpected source: Catfish Hunter. Hunter, still only 22, was down in ’68 from his first truly good season, and posted a mere 83 ERA+. It was still uncertain then whether Catfish was a young star in the making or just a kid who lucked into a good ERA in ’67. Getting his perfecto in the Year of the Pitcher was a mark in the “lucky break” column, but he’d provide some countervailing evidence in seasons to come.
Philip Humber’s 2012 ERA+ through July 1 is, at 71, even worse than Catfish’s, and his perfect-game chances are the worst on the list (though for only half a season: he has time to pass Robertson). As I write, he’s on the disabled list with a right elbow flexor strain. One can wonder whether the effort of producing a perfect game could have caused the injury, or at least a lesser strain that left him ineffective and grew into outright injury. To answer, I note that in his first four games this year, he threw 115, 96, 115, and 107 pitches. His perfect game was the 96. He wasn’t busting any pitch counts to get in the record books, so we probably should look elsewhere to explain his woes.
Just for some fun, I ran the numbers on three other pitchers who threw unofficial perfect games: Harvey Haddix in 1959, Pedro Martinez in 1995, and Armando Galarraga in 2010. The odds are still for throwing nine perfect innings, even though one did more and another did arguably less (or arguably more).
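| Pitcher | Year | Starts | PA (Adj'd) | Reached | Outs/PA | Prob-Game | Prob-Season |
|---|---|---|---|---|---|---|---|
| Haddix | 1959 | 29 | 875 | 240 | 0.726 | 0.0174% | 0.5035% |
| Martinez | 1995 | 30 | 776 | 237 | 0.695 | 0.00533% | 0.1597% |
| Galarraga | 2010 | 24 | 615 | 201 | 0.673 | 0.00229% | 0.0549% |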
Haddix had a fine 1959 when he pitched those twelve perfect innings. Odds-wise, he’s solidly in the upper half of our exclusive group. Martinez’s low chances seem surprising, but it would be like Koufax tossing a perfect game in 1960, or Hunter in 1968. It was a couple years before he started putting up those mind-bending numbers. Galarraga’s odds are no shock: one could argue he’s the flukiest pitcher on the list, without even being on the official list.
Career numbers, and why they’re unreliable
I have also calculated the probabilities for our perfect pitchers to have done the deed throughout their careers. It is here that I discovered an unreliability in the numbers—but an unreliability that turns out to be informative, giving some perspective on our current perfect-game boom. I will explain it soon, but for now, I must echo a myriad of sports betting-line columns and state that the following numbers are for entertainment purposes only. Again, stats for contemporary pitchers are complete through July 1.
| Pitcher | Starts | PA (Adj'd) | Reached | Outs/PA | Prob-Game | Prob-Career |
|---|---|---|---|---|---|---|
| Richmond | 179 | 6819 | 2610 | 0.617 | 0.000220% | 0.0394% |
| Ward | 262 | 10162 | 2956 | 0.709 | 0.00932% | 2.412% |
| Young | 818 | 29678 | 9277 | 0.687 | 0.00392% | 3.156% |
| Joss | 260 | 8880 | 2530 | 0.715 | 0.0117% | 2.993% |
| Robertson | 141 | 4265 | 1592 | 0.627 | 0.000332% | 0.0468% |
| Larsen | 177 | 6718 | 2261 | 0.663 | 0.00154% | 0.273% |
| Bunning | 519 | 15341 | 4673 | 0.695 | 0.00550% | 2.813% |
| Koufax | 321 | 9576 | 2708 | 0.717 | 0.0127% | 3.983% |
| Hunter | 495 | 14368 | 4199 | 0.708 | 0.00885% | 4.285% |
| Barker | 194 | 5590 | 1877 | 0.664 | 0.00159% | 0.309% |
| Witt | 301 | 8906 | 2923 | 0.672 | 0.00216% | 0.649% |
| Browning | 303 | 7990 | 2534 | 0.683 | 0.00336% | 1.014% |
| Martinez | 569 | 16740 | 5364 | 0.680 | 0.00295% | 1.666% |
| Rogers | 482 | 14317 | 4951 | 0.654 | 0.00106% | 0.508% |
| Wells | 506 | 14745 | 4674 | 0.683 | 0.00338% | 1.698% |
| Cone | 437 | 12543 | 3959 | 0.684 | 0.00357% | 1.548% |
| Johnson | 619 | 17385 | 5325 | 0.694 | 0.00515% | 3.137% |
| Buehrle | 385 | 10745 | 3465 | 0.678 | 0.00272% | 1.043% |
| Braden | 79 | 2056 | 673 | 0.673 | 0.00224% | 0.177% |
| Halladay | 368 | 10706 | 3211 | 0.700 | 0.00659% | 2.396% |
| Humber | 40 | 1200 | 385 | 0.679 | 0.00291% | 0.116% |
| Cain | 222 | 5895 | 1750 | 0.703 | 0.00741% | 1.633% |
While producing this table, I briefly had the post-season numbers for our contemporary pitchers isolated, and got an eye-opener. In four career post-season starts, Mark Buehrle works out as having had a 0.0542 percent chance of a perfect game (on 34 batters reaching in 121 plate appearances, both numbers duly adjusted). This exceeds Charlie Robertson’s perfect game chances (0.0468 percent) for his entire career of 141 starts. Matt Cain outdoes them both, working up a 0.1139 percent chance in only three post-season starts. Even with the unsteadiness of the numbers, this illuminates just how unlikely it was for a journeyman pitcher in the Era of the Babe to do what Robertson did.
I was surprised to find Hunter edging out Koufax for the highest probability. Half again as many starts, in a similarly low-offense era, helped Catfish plenty. Lee Richmond ends up with an even worse career probability than Robertson does, and with 38 more starts. It’s the errors (and maybe my projections on errors for 1880) producing this result, so you can take this with a grain of salt. This is not, however, the big problem I was talking about.
So what is that problem? You’ll get a hint of it if you look at Sandy Koufax’s season and career numbers. He’s 3.98 percent for his whole career, and 2.83 percent for 1965 alone. Was Sandy that much less likely to have thrown perfect games in his other seasons? I checked: he wasn’t. He was at 1.26 percent in 1966. Perfect game chances are multiplicative, not additive, but that still comes out as 4.05 percent for the years 1965 and 1966, compared to 3.98 percent for 1955 to 1966.
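To make the "multiplicative, not additive" step concrete, here is a small Python sketch. The helper name `combine` is hypothetical, and it assumes the seasons are independent of one another. The chance of at least one perfect game across several spans is one minus the product of the chances of missing in each span.

```python
def combine(chances):
    """Chance of at least one perfect game across independent spans."""
    miss_all = 1.0
    for p in chances:
        miss_all *= 1 - p   # must come up empty in every span
    return 1 - miss_all


koufax_1965 = 0.02827   # season figure from the table
koufax_1966 = 0.0126    # figure quoted in the text
career = 0.03983        # career figure from the table

two_years = combine([koufax_1965, koufax_1966])
print(f"1965-66 combined: {two_years:.2%}   whole career: {career:.2%}")
```

Simple addition would give 4.09 percent; the product form trims that slightly, and the two-season figure still comes out above the career figure.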
This is, of course, impossible. What’s going on here?
The hitch is that these collective numbers represent a wide diversity of individual days and situations. On some days, the pitcher will be loose and ready; the fastballs will crackle; the breaking balls will snap; the catcher won’t need to move his glove; the wind will be blowing in at Dodger Stadium. Other days, he’ll have a cold or a tight shoulder or a twinge in his elbow; he’ll never find top gear with the fastball; the curve will be limp; his precision will be gone; he’ll be pitching at Coors Field, or Wrigley with a hot wind blowing out. There are good days and bad days, and while you can average them, they don’t average out where perfect games are concerned.
Take, for example, a pitcher good enough to get outs on 70 percent of his batters faced. That is a mean figure: he’s not pitching at 70 percent efficiency every day. If the variance of his underlying efficiency is large enough, it can produce huge effects on his perfect-game chances. Say he has two lousy starts at 60 percent, followed by a locked-in day at 90 percent. The 90 percent day is good enough to produce a 5.81 percent chance of a perfect game, whereas three games at 70 percent would only rate a 0.0197 percent chance.
That’s an extreme example, but the principle holds even with much smaller variances. Staying with our 70 percent pitcher, let’s postulate one day where he’s working at a 23 outs/32 batters rate (71.9 percent) and two where he’s at 20/29 (69.0 percent). Add the three days together, and he’s at 63/90, or 70 percent on the dot. (I’m assuming he gets to face more batters on days he’s pitching better.) With no variance, his odds are, as stated above, 0.0197 percent. With the slight variance, it rises to 0.0222 percent.
And it’s not only due to the good start having a greater variance from the mean than the lesser ones. Switch the numbers, make it two 22/31 (71.0 percent) days and a 19/28 (67.9 percent) day. The collective perfect-game chance is 0.0219 percent, still significantly ahead of the 0.0197 percent for the no-variance case. The conclusion is plain: any variance from the average produces gains in perfect-game chances from the “on” days that more than offset losses from the “off” days.
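The three-start examples above can be reproduced with a few lines of Python. This is only a sketch with names of my own choosing: each day is written as an (outs, batters faced) pair, and each start is treated as an independent shot at 27 straight outs at that day's rate.

```python
def chance_over_days(days, batters=27):
    """Chance of at least one perfect game over a list of (outs, faced) days."""
    miss_all = 1.0
    for outs, faced in days:
        miss_all *= 1 - (outs / faced) ** batters
    return 1 - miss_all


flat = chance_over_days([(21, 30)] * 3)                         # 70 percent every day
one_good = chance_over_days([(23, 32), (20, 29), (20, 29)])     # one better day
two_good = chance_over_days([(22, 31), (22, 31), (19, 28)])     # two better days
locked_in = 0.9 ** 27                                           # the 90 percent day alone

for label, value in [("no variance", flat), ("one good day", one_good),
                     ("two good days", two_good), ("90 percent day", locked_in)]:
    print(f"{label:>15}: {value:.4%}")
```

Both uneven mixes total 63 outs from 90 batters, the same 70 percent as the flat case, yet both come out ahead of it. The reason is that raising a rate to the 27th power is a convex operation, so the gain from a day above average outweighs the loss from an equal day below it.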
Koufax recapitulates that on a career level. He had several years when he was an okay pitcher, followed by several years when he was The Left Arm of God. Taking a 12-year average of his performance dilutes how dominant he was in those glory years, and crashes the math on how likely he was to put together a perfect game. The same thing happens year to year, month to month, day to day.
We have no mechanism by which to gauge how good a pitcher truly is from one day to another. Hits and runs, walks and strikeouts do tell a valid story, but they are fluctuations around an unseen baseline. We can’t measure that daily baseline yet, and though the emerging science of baseball is making inroads on a dozen different levels, I don’t think we will ever have it nailed down.
Breaking down the numbers into ever-smaller groups to chase a more precise answer brings on a reductio ad absurdum. Just because a pitcher retired 27 straight batters one day doesn’t mean he was a 100 percent lock that day to pitch a perfect game. It might have been as high as a 1-2 percent chance; it might have been one in a million.
What is pretty sure is that the chances are better than they appear from taking averages for a season, a career, or the whole history of baseball. The tables above are useful for comparison between the pitchers, but they end up representing a floor on the true range of probabilities. Perfect games turn out to be more likely than straightforward calculations make them out to be, and the current surge we are witnessing is not as big a warping of the percentages as some have concluded.
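As a purely illustrative sketch of that floor, suppose a pitcher's true out rate on a given day were drawn from a Beta distribution centered on his average. That distribution is my assumption for the sake of the example, not anything measured, and the function and parameter names are hypothetical. The expected chance of 27 straight outs then has a closed form, and it can be compared with the flat calculation.

```python
def mixed_perfect_game_prob(mean, concentration, batters=27):
    """E[p ** batters] when the daily out rate p ~ Beta(a, b).

    a = mean * concentration, b = (1 - mean) * concentration, so the
    distribution is centered on `mean`; a larger `concentration` means
    less day-to-day spread. Uses the standard Beta moment formula
    E[p**n] = prod_{k=0}^{n-1} (a + k) / (a + b + k).
    """
    a = mean * concentration
    result = 1.0
    for k in range(batters):
        result *= (a + k) / (concentration + k)
    return result


flat = 0.7 ** 27   # no day-to-day variation at all
print(f"no variation: {flat:.5%}")
for c in (5000, 500, 50):   # assumed spreads, tightest to loosest
    print(f"concentration {c:>4}: {mixed_perfect_game_prob(0.7, c):.5%}")
```

Whatever spread is assumed, the mixed figure sits above the flat one and grows as the spread widens. How much it grows depends entirely on the assumed shape, which is exactly the part nobody knows.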
Might we figure out some day what the percentages really are? It would mean figuring out the true shape of the bell curve of pitchers’ performances, how the curve bends differently for aces, journeymen, and fringe players, how far and how often they stray from their average talent level. Even an approximation would be a weighty undertaking, a mountain too steep for me to climb.
But someone else might. People do keep beating the odds, and more often than you think.