The Oakland A’s are, almost by acclamation, the sabermetrics community’s favorite baseball team. The reason why is a bit of egoism masquerading as objectivity, or maybe the other way around. The A’s, led by general manager Billy Beane, use their brains. They strain to find ways to buy wins cheaply, which is almost the only way they can afford to buy them. This led them to embrace, and in large measure validate, sabermetric findings that thinkers about baseball had been shouting into the void for years.
They aren’t the only baseball team to do this now. Most teams today have analytics departments, to make them look smart if not always to use what they produce. In innovation, Oakland has probably been leapfrogged by the Tampa Bay Rays, whose owners are building on the A’s foundation. But if the Rays are still second-best in the eyes of sabermetricians, the reason can be summed up in one word: Moneyball.
Best-selling books help. Brad Pitt movies help more. Even when the Rays got their own book, Jonah Keri’s The Extra 2%, their smarts worked against them. Michael Lewis revealed all in Moneyball, which helped other teams ape Beane’s methods and thus make them less effective for him. The brain trust of the Rays let us see the method and overall philosophy, but jealously guarded the details to keep their edge. Shrewd, but not calculated to win the hearts and minds of analysts who drink in those details.
So Beane and the A’s remain the first true love of us baseball nerds. Over the last 14 full seasons, Oakland has made seven playoff appearances, far more than we’d expect from a team as thoroughly outspent as the A’s have been. It’s a heart-warming, brain-satisfying success story—except for one little detail.
Beane said it best in a certain noteworthy book: “My [methodology] doesn’t work in the playoffs.” No, he didn’t say “methodology.” Yes, I’m being a schoolmarmish prude for not spelling out the much shorter word he did use. If you’re offended by my not being offensive, consider the irony in that. Savor it. We’re so overloaded with irony these days, it’s been devalued. We don’t appreciate it properly.
Back on my long-abandoned point, seven times Beane’s A’s have made the playoffs, and six times they have washed out in their first round. The lone exception was 2006, when they stunned the world by sweeping the Minnesota Twins, then restored balance to the world by getting swept out of the ALCS by the Detroit Tigers. As much success as Beane has garnered between April and September, he’s 1-7 in playoff series, and has yet to receive the validation of an American League pennant, never mind a World Series victory.
People have asked before why this should be. Back in 2006, in Baseball Prospectus’ Baseball Between the Numbers, Nate Silver and Dayn Perry studied correlations between playoff success and a host of team statistics, from batting average and isolated power to playoff experience and team record in September and onward. The three factors they found tracked best with playoff success were pitcher strikeout rate, fielding runs, and closer win expectation. (They used measures for fielding and closer performance, FRAA and WXRL, that are obsolete a whole eight years later, but their findings stand.)
“It would be misleading,” they warned, “to suggest that this is some kind of secret sauce.” They rethought this, and for a while you could look up teams’ “Secret Sauce” rankings at the BP website. Then they re-rethought this, possibly after added data weakened the correlations, and took the sauce off the menu. Back to Square One we went.
I am not going to plow through stacks of peripheral stats to find the hidden cause of Oakland’s woes. Instead, I am going to plow through stacks of one particular stat, not all that peripheral, to see if it predicts postseason success better than our current measures do. My hunt is based on an obvious yet over-lookable principle: baseball playoffs, with one teensy-weensy, historically bizarre and hopefully never to be repeated exception, involve winning teams.
(That exception was the 1981 Kansas City Royals. In that strike-riven season, they won the second-half semi-crown of the American League West despite an overall 50-53 mark. The 2005 Padres slipped in as NL West champions at 82-80, the closest any other team has gotten to reaching the playoffs without a winning record.)
The skill set that lets you fatten up against also-rans may not be the same one that lets you hang in there with the first rank of the league. A playoff team that’s gotten there mainly by hammering the hapless could be in trouble because there are no punching bags left on the schedule. This might explain why some teams, such as Oakland, fizzle out.
There’s some precedent for this kind of analysis. Not too long ago, Vince Gennaro, president of the Society for American Baseball Research, broke down hitters by how they perform against strong pitchers and weak pitchers. The difference he found in league-wide performance against those two “buckets” of pitchers was about 180 points of OPS, but there were some batters with much wider splits, and others with much narrower ones.
In an appearance on MLB Network’s Clubhouse Confidential, he gave examples from both groups: Josh Hamilton and Derek Jeter. Hamilton feasts on poorer pitchers while struggling against the aces; Jeter doesn’t run up his numbers against the staff filler, but holds up well when facing the studs. Jeter doesn’t go so far as to reverse the split (he doesn’t actually hit better against good pitchers than bad ones), but he narrows it considerably, while Hamilton’s split is quite wide.
How this connects to postseason affairs is that good teams, the ones making the playoffs, tend to have good pitchers (that being part of what makes them good). Furthermore, with pacing concerns for a long season no longer paramount, those good pitchers will get an increased proportion of innings pitched. The dropping of a team’s fifth starter from the playoff rotation is a familiar example of this. Gennaro estimates that pitchers who throw the top 40 percent of regular-season innings pitch two-thirds of postseason innings.
With the playoffs featuring an over-sampling of good pitchers, batters with a narrow split between top and bottom hurlers would naturally get an edge, while those with wide splits would be facing a steep climb. The postseason numbers for Hamilton and Jeter show how this plays out.
|Triple-Slash Line Comparison|
(Career records through 5/30/2014)
Gennaro might have been cherry-picking his examples for effect—Jeter matching or outpacing his regular-season stats frankly isn’t normal—but the principle holds. The postseason is a good time for those who are stout against strong competition, and what holds for individuals ought to hold for entire teams. That’s the assumption I’m testing.
I gathered the records compiled against winning teams by every playoff and tiebreaker participant in major league history since 1903 and the first modern World Series. The records I found at Baseball-Reference combined teams with winning and even records: in a fit of pickiness, I combed out the games against .500 clubs.
Tiebreakers, from the 1946 NL playoff to last year’s Cleveland-Tampa Bay showdown for the AL’s second Wild Card berth, are technically considered part of the regular season. I include them here for a little added data, and because those games, by their nature, also pit winning teams against one another. This meant taking care to exclude each tiebreaker’s result from the regular-season records used to judge that tiebreaker itself; those results went back into the records for the postseason series that followed.
I did leave the 1981 Royals’ playoff series in the data, even though it defies the assumption on which I’ve predicated the study. Their playoff opponent—ironically, the Oakland A’s—had them dominated in the record splits, whichever way you slice them, and swept them out of the postseason. The effect on the findings is slight.
I’ll start with a quick look at the case that got the ball rolling.
Getting a statistically robust finding from the results of 37 baseball games is scraping close to being a fool’s errand. But that’s what I have attempted with the Oakland A’s. If their record against winners is significantly worse than their playoff opponents’, we’ll have our first piece of evidence that this may be a better way to judge postseason chances.
We do not have that first piece.
Oakland’s recent postseason record has had a remarkable consistency. In six out of seven years, they went the distance in the ALDS before losing the fifth and deciding game. Add in the sweep-and-swept 2006, and they are 15-22 in that stretch. Teasing out a pattern of success and failure from events that consistent would be nearly impossible in ideal conditions. In this case, we may delete “nearly.”
In their eight playoff series of the 2000s, the A’s have had the better overall record six times, and the better record against winning teams five times. No noteworthy difference exists there. It is interesting, if trivial, that in those years Oakland has never had either the best or the worst winning-versus-winners mark. Five of seven years they’ve been second, the other two third. In overall record, they were second six of seven times, the other being fourth and last. Ironically, that was 2006, when they actually won a round.
Going by composite records rather than rankings, the A’s had a .534 record against winning teams in those playoff years, going against teams with a composite .512 mark against over-.500 opponents. By overall records, it was the A’s at .596 and their opponents at .566. The margin is somewhat reduced for the over-.500 games, but they still outperformed their adversaries, and gave us little reason to expect the near-whitewash they suffered.
(Don’t be too surprised by the seemingly low percentages against winners. It’s a tautology, but that group is tough to beat, which is what makes them winners. From 2000 to 2013, the whole AL’s record against winning teams was .445. A .534 mark for the A’s playoff teams is in line with expectations.)
So my original hypothesis receives a decided negative: the idea doesn’t explain the A’s underperformance at all. Perhaps, though, it may hold over a much bigger sample size, for all teams with superior records against good opponents. Let’s give it a look.
A first look at postseason/tiebreaker series throughout history restores some life to my hypothesis. Whether it’s a superficial life, or the kind that matters most, is an intriguingly debatable point, which we’ll get to soon enough.
There have been 295 postseason and tiebreaker series, including single-game knockouts such as the 1948 AL playoff and the current Wild Card games. Of those, 22 have matched teams with identical won-loss records (as would always be the case with tiebreakers). Just three have pitted teams against each other holding equal marks versus winning clubs.
In series with differing overall records, the team with a better record has won 147 times and lost 126, for a mark of .538. In the case versus winning teams, the more successful club has taken 161 series and lost 131, for a percentage of .551. This isn’t a definitive result, as the margin and the sample size are not large enough in combination for that type of confidence. It does point, though, toward one’s record against winners being a better indicator of postseason prospects.
If we want more data points, we can break the series down into individual games. For overall performance, the team with the better record has won 735 postseason and tiebreaker games, and lost 628. The .539 percentage is almost identical to that for series. For success against winning teams, the better club has won 747 games and lost 663, for a .530 percentage. That is not only notably below the record for series, it now underperforms the mark dealing with overall records.
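Those four percentages fall straight out of the win-loss counts; a quick sketch, using the counts quoted above:

```python
# Win-loss records for the team with the better mark, from the text above.
records = {
    "overall, by series":     (147, 126),
    "vs. winners, by series": (161, 131),
    "overall, by games":      (735, 628),
    "vs. winners, by games":  (747, 663),
}

for label, (wins, losses) in records.items():
    print(f"{label}: {wins / (wins + losses):.3f}")
# overall, by series: 0.538
# vs. winners, by series: 0.551
# overall, by games: 0.539
# vs. winners, by games: 0.530
```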
What just happened? Performance against winning teams is the better playoff predictor if we go by series wins, but worse if we go by individual games. (The latter result is not conclusive, either, at the usual 95 percent confidence level.) Which is the better indicator, winning the battles or winning the wars?
Arguably, this is a classic clash between process and result. Just as the process of producing and preventing runs leads, if imperfectly, to the overarching goal of winning games, so does the process of winning playoff games lead to the result of winning playoff series. (This is suspended in the case of single-game tiebreakers and Wild Card playoffs.) It’s not quite the same—you can have a winning record with a negative run ratio, while three blowouts in the World Series still loses to four nailbiters—but it serves our purposes.
I could have dug deeper and calculated the runs scored and allowed in all those playoff games, but a sudden fit of sanity prevented me. I’ll try not to let it happen again.
Like any good sabermetrician, I have to come down, however reluctantly, on the side of process. Series victories are the goal, but the sheer number of games makes them a more trustworthy measure of long-term success. Of course, neither result was conclusive, and if we split the difference, it comes out to performance against winning teams being really no different than performance overall in forecasting a winner.
There is, however, another level deeper to go (without counting up all the runs scored—I’m still in the grip of sanity there). My analysis so far has just checkmarked which team has the better record, and treated each instance as equal. But sometimes the margin is paper-thin, and sometimes it’s a canyon. Might it be that, taking the magnitude of the separation into account, we could reach a different conclusion?
This requires using the log-5 formula, invented by none other than Bill James, to estimate the chances of victory between two teams of known winning percentages. If A and B are the winning percentages of the teams, Team A’s chance of winning is given as:

P(A beats B) = (A - A×B) / (A + B - 2×A×B)
(You saw a different version of this formula in Steve Staude’s recent article about his tool for calculating game and series win probabilities. They look different, but are mathematically equivalent.)
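As a sketch, the formula translates directly into code; the second example below plugs in the historical averages quoted later in this piece:

```python
def log5(a, b):
    """Bill James' log-5 estimate: the probability that a team with
    winning percentage a beats a team with winning percentage b."""
    return (a - a * b) / (a + b - 2 * a * b)

# Two equal teams are a coin flip:
print(round(log5(0.550, 0.550), 3))  # 0.5
# A .621 team against a .578 team:
print(round(log5(0.621, 0.578), 4))  # 0.5447
```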
There is one pitfall in the log-5 estimates: they don’t take home-field advantage into account. For much of baseball playoff history, this doesn’t matter so much. Home field was alternated between the American and National Leagues, and later between the Eastern and Western Divisions. In today’s playoff structure, pre-World Series match-ups have home field awarded to the team with the better record, though earlier in the Wild Card era one could see a division winner with an inferior record get the home edge over a better wild card winner.
This means that, especially over the last two decades, when the expanded playoff structure has produced roughly as many series as all the preceding decades combined, the playoff performances of teams with superior overall records should mildly outperform the log-5 estimate. We will keep this in mind when looking at those numbers.
From 1903 through 2013, the average overall winning percentage for teams with the superior record in their match-ups was .621; for the inferior team, it was .578. Putting this through the log-5 formula (noting that I did this with un-rounded numbers having plenty more decimal places), the better teams were expected to win 54.44 percent of those games. As noted earlier, they actually won 53.93 percent. The teams with better overall records undershot the log-5 estimate, even without adding the occasional home-field edge.
Doing this for records against over-.500 opponents, the better team averaged a .573 mark against winners, and the worse team posted a .511. The log-5 formula gives the leaders an expected winning rate of 56.22 percent, well over the 52.98 percent they achieved in the actual playoff games. Also, no automatic home-field advantage is ever given out for being better against winning clubs. There’s substantial overlap with having a better overall mark, so some of that advantage would carry over, but only some. There’s no closing the gap that way.
But perhaps I went about this the wrong way. I lumped all the better and poorer teams together to get collective percentages and a collective log-5, when maybe I should have done it individually for each playoff round, and combined those results. So I re-did it that way.
|Measure of Better Team|Proj. W%-Coll.|Proj. W%-Indiv.|Actual W%|
|Overall record|.5444|.544|.5393|
|Record vs. winners|.5622|.562|.5298|
It didn’t matter. The log-5 projections budged at the fourth decimal place, and that was all. The relative underperformance of powerhouse-beaters actually widened as measured against that of overall favorites. Overall, teams with superior overall records underperformed their log-5 estimates by about half a percentage point. Teams with superior records against winning clubs undershot the log-5 numbers by three and a quarter percentage points.
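The two aggregation schemes are easy to mix up, so here is a sketch of both. The winning percentages below are hypothetical, purely to show the mechanics; with values in this range, the two projections land within a fraction of a point of each other, consistent with how little the swap mattered here:

```python
def log5(a, b):
    """Bill James' log-5 win probability for a team at a versus a team at b."""
    return (a - a * b) / (a + b - 2 * a * b)

# Hypothetical (better, worse) winning percentages for three playoff rounds:
matchups = [(0.640, 0.600), (0.610, 0.550), (0.590, 0.580)]

# Collective: average each side's percentage first, then apply log-5 once.
avg_better = sum(a for a, _ in matchups) / len(matchups)
avg_worse = sum(b for _, b in matchups) / len(matchups)
collective = log5(avg_better, avg_worse)

# Individual: apply log-5 round by round, then average the projections.
individual = sum(log5(a, b) for a, b in matchups) / len(matchups)

print(round(collective, 4), round(individual, 4))
```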
Measured by itself, the real-life performance versus winning teams falls well over two standard deviations from the projection, clearing the 98 percent confidence level. And this is without factoring in the slight home-field effect that would probably widen the gap. The evidence says that not only is strong performance against good teams not an advantage in the playoffs, but, shockingly, it appears to be a drag on a team’s prospects, by a margin highly unlikely to be due to chance.
I could have accepted a lack of advantage with equanimity. I’m accustomed by now to my theories getting shot down by the evidence. A result this counter-intuitive, however—doing well against good teams being an absolute disadvantage when facing opponents selected for being good teams—is throwing me down a rabbit hole.
Before I meet any hookah-smoking caterpillars or have playing-card royalty call for my head, I’ll look at the numbers another way to try to start wriggling free of this paradox. I broke out the results for the years 1903 to 1993, which both gives us the era when records never had an effect on home-field advantage and also roughly divides the total number of games in half. I went with a collective log-5, because it’s quicker and obviously makes very little difference.
|Measure of Better Team|Projected W%|Actual W%|
The fortunes of teams with better overall records nearly match the log-5 predictions. The gap for the better teams against winners has closed, enough so that for this smaller sample it falls well short of even a 90 percent confidence level. All this means, however, is that the underperformance of projections over the last two decades, the period obviously most germane to what we can expect to see in the future, has been that much bigger.
Welcome to Wonderland. Or maybe Bizarro World.
What the &%$# Do We Make of This?
I didn’t expect to be here. I was hoping to find the flaw in the Oakland A’s that made some sense of their lousy postseason performance in the Beane Era. I was expecting that ability against good teams wouldn’t make enough difference to explain much of anything. Instead, what I discovered flies in the face of logic—though it turns out it might point toward the A’s situation after all. They’re generally better against good teams than their playoff opponents, and they almost always lose. Their experience fits with the inside-out logic.
I am forced to fall back on what scattered learning about logic I have: if the logic is sound but the conclusion is nonsensical, check your premises. My foundational premise was that excellent play against winning teams from April through September would carry over into October. If that’s not so, then what about playoff baseball makes it different from the 162 games that preceded it?
One answer, at least in recent years, is that pre-World Series playoff rounds award home-field advantage (usually) to the team that finished with the better record. This has a smaller effect on records against winners than it might seem: in the 273 postseason series where one team had the better overall record, that team also had the superior record against over-.500 teams 185 times, or 67.8 percent.
The problem is that even this attenuated home-field advantage for the contender-beaters should be raising their postseason record, when instead it falls well below expectations. Worse, the numbers took a tumble from pre-1994 to post-1994, the point at which home-field kicked in and we should have seen an uptick. So the paradox only deepens.
What other differences are there in playoff baseball? The cliche about baseball’s long season is that it’s a marathon, not a sprint. In the postseason, it is a sprint, or a string of them if you win early on. I’ve noted already that pitching staffs get handled differently come the postseason. This happens to position players, too: they, especially catchers, don’t get benched for a day to rest in the middle of a playoff series.
Is it possible that this could be a key? If you postulate that the lesser team in the match-up has an inferior record due to a lousier bench, back end of the rotation, and so on, rather than to weaker front-line players, and if you speculate that it was the winning teams that exploited the scrubs’ performance more effectively, then the scrubs’ absence might swing the odds. Those are, however, pretty big ifs.
Could it be a matter of attitude instead? The teams with robust records against winners could be complacent, while those struggling might have the sense to realize they’re on the wrong end of things, and look to make changes to give themselves a better shot. Baseball does have a way of leveling things, call it regression or what you will, and the opening of the postseason is a natural inflection point for teams to re-evaluate themselves.
Why this regression would manifest itself in records against winning teams much more than in full-season records is a difficult matter. It would imply that teams are conscious of how they do against good opponents. It’s a fairly esoteric matter for fans; would players, managers, and coaches see it differently?
I am already straining to come up with these theories. Any others I devised might well stretch credulity to the snapping point. Goodness knows that the theories other baseball commentators have offered for the original case haven’t exactly warmed sabermetricians’ hearts.
The accusation leveled against the A’s more than a decade ago was that playoff baseball was fundamentally different, requiring stratagems to manufacture runs rather than playing station-to-station ball, and that Beane’s team not only could not adapt but refused to. Adherents of baseball analytics rejected this as Neanderthal thinking, but the revival of the A’s fortunes has also brought the same familiar results in the ALDS.
One almost throws up one’s hands and says it has to be something psychological. Possibly it is: it’s humans playing these games, and humans are way more complex than mathematical equations for offensive production. But it’s unquantifiable, irreducible, effectively a black box. It may be that this comes from something we cannot yet understand, but such an explanation cannot be a satisfactory one.
So I will not force the issue. The result is what it is: teams with superior records against winning ball clubs have that strength crumble once the postseason comes along. As for the reasons, your guesses are as good as mine. Maybe even better.