So, Billy, What Does Work In the Playoffs?

by Vinay Kumar
May 11, 2004

“My s*** doesn’t work in the playoffs.”
— Billy Beane, Oakland Athletics GM, in Michael Lewis’ Moneyball

Those words are some of the most famous uttered by a general manager in the last decade. Many have repeated Beane’s words or paraphrases thereof, often to mean different things. Statheads see Beane recognizing that anything can happen in a short series (in fact, the foul-mouthed Beane continued, “My job is to get us to the playoffs. Everything after that is f****** luck”). Critics of sabermetrics interpret Beane’s statement as an admission that sabermetrics don’t apply to the post-season (which is a misunderstanding itself, since “sabermetrics” is defined as “the objective study of baseball,” and not any specific principles).

Many people have theorized about what kinds of teams win in the playoffs. Since teams skip their fifth and sometimes fourth starters in the playoffs (giving more starts and innings to the top starters), it makes sense that front-line starting pitching is more important than depth. While many people have long understood that idea, the 2001 Diamondbacks drove the point home.

With the Athletics’ well-chronicled four-year streak of losing in the first round, many pundits have talked about the importance of “small ball” and manufacturing runs. Many of these theories make sense — in theory. But I don’t know whether any data supports them; most of the time, people point to single instances to support their ideas, like the 2002 Angels or 2003 Marlins.

I’ll ignore the holy wars and agendas while trying to answer the obvious follow-up question: what has worked in the playoffs? What are the traits of teams that have been successful in the playoffs?

Major League Baseball moved to an expanded playoff system in 1995 (well, they planned to move to the system in 1994, but the strike caused the cancellation of the 1994 playoffs). Offensive levels jumped in 1993 and rose again for the next few years; the brand of baseball we’ve watched for the last decade differs noticeably from that of the previous few decades (in particular, there have been more long balls and strikeouts, and relief pitching roles have become increasingly specialized).

There has been much more talk during this past decade about baseball’s finances and supposed competitive imbalance. Combine all these factors, and it makes sense to look at the playoffs since 1995 only, as the results from 1993 and earlier aren’t necessarily relevant to today’s post-season.

So which traits do translate into post-season success? I took all 63 post-season series played since 1995 and checked which team fared better in various statistical categories over the regular season, then compared that with which team won the series. For instance, the team with more regular-season home runs has won 32 of the series while losing 31, a virtual dead heat.

While it may be tempting to infer that home run hitting is not important to playoff success, that isn’t a fair reading of the data. Sixty-three series represents a lot of baseball, but still constitutes a small sample statistically. Furthermore, there are times when the “underdog” in the category isn’t exactly a pipsqueak; the Yankees hit 230 HR last year, but counted as one of the low-HR upsets when they knocked off the 238-HR Red Sox. So I don’t think any meaningful conclusions can be drawn from such a comparison; I think this exercise is useful just to add some actual data to the discussion and to see which arguments don’t even have a leg to stand on.
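To put a number on how small the sample is, here is a minimal sketch that asks how often a record at least as lopsided would turn up if every series were a coin flip. The coin-flip model and the function name `coin_flip_p_value` are assumptions made for illustration; this is not the method used in this article.

```python
from math import comb


def coin_flip_p_value(wins: int, losses: int) -> float:
    """Two-sided chance of a record at least this lopsided,
    assuming each series is an independent 50/50 coin flip."""
    n = wins + losses
    best = max(wins, losses)
    # Count the outcomes in which one side takes `best` or more of the n series.
    tail = sum(comb(n, k) for k in range(best, n + 1))
    # Double the one-sided tail to cover both directions, capped at 1.
    return min(1.0, 2 * tail / 2 ** n)


# The 32-31 record of the teams with more regular-season home runs.
print(coin_flip_p_value(32, 31))  # 1.0
```

A value near 1 means the record is exactly what coin flips would produce, so it says nothing either way about the category.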

So on to the data; here is the won-lost record (in terms of series won and lost, not games) of the teams which fared better in each measure (the last column groups the metrics by type, separating offense from defense; it’ll be apparent why in just a bit):

Statistical Category	Series Won-Lost	Type of Statistic
Won-lost record	28-33	Overall
Run ratio (RS/RA)	34-29	Overall
Runs scored	27-36	Batting
Batting average	34-29	Batting
On-base percentage	34-29	Batting
Slugging percentage	31-32	Batting
Doubles	26-37	Batting
Triples	29-31	Batting
Home runs	32-31	Batting
Batters walks	31-31	Batting
Batters strikeouts (fewer)	37-26	Batting
Stolen bases	35-28	Base-stealing
Stolen base attempts (more)	36-27	Base-stealing
Net stolen bases (SB-2*CS)	27-35	Base-stealing
Stolen base percentage	27-36	Base-stealing
Caught stealing (fewer)	26-36	Base-stealing
Runs allowed	35-28	Pitching
ERA	34-29	Pitching
Pitchers strikeouts	38-25	Pitching
Pitchers walks (fewer)	32-30	Pitching
Hits allowed (fewer)	41-22	Pitching
Home runs allowed	37-26	Pitching
Complete games	31-28	Pitching
Pitchers shutouts	38-19	Pitching
Saves	31-28	Pitching
Saves by team leader	29-31	Pitching
Bullpen ERA *	34-29	Pitching
Errors committed (fewer)	38-24	Fielding
Defensive efficiency *	32-31	Fielding
Fielding double plays	29-31	Fielding

The wins and losses don’t always add up to 63 because of cases where both teams tied; for instance, the 2000 Mets and Cardinals each walked 675 times, so their NLCS matchup is ignored for that category. For stats such as batters’ strikeouts where it’s ambiguous whether a higher or lower total is more desirable, the “fewer” or “more” indicates which one is considered the leader.
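The tallying rule, including the way ties drop out, can be sketched as follows. The function name `tally` and the (winner, loser) pair layout are assumptions for illustration; the two sample series use only the totals quoted in this article.

```python
def tally(series, higher_is_better=True):
    """Return (won, lost) for the team that led a regular-season category.

    `series` is a list of (winner_value, loser_value) pairs, one per
    post-season series. Series where the two teams tied are skipped.
    """
    won = lost = 0
    for winner_value, loser_value in series:
        if winner_value == loser_value:
            continue  # tied category: the series is ignored
        # Did the series winner also lead the category?
        winner_led = (winner_value > loser_value) == higher_is_better
        if winner_led:
            won += 1
        else:
            lost += 1
    return won, lost


# Values listed as (series winner, series loser).
home_runs = [(230, 238)]     # 2003 Yankees over Red Sox: the HR leader lost
batter_walks = [(675, 675)]  # 2000 Mets and Cardinals: tie, so ignored
print(tally(home_runs))      # (0, 1)
print(tally(batter_walks))   # (0, 0)
```

For the "fewer" categories such as batters’ strikeouts, passing `higher_is_better=False` flips which team counts as the leader.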

Well, this is quite interesting, as some of the results are very counter-intuitive; who would’ve guessed that the higher-scoring team would go only 27-36 in the playoffs? And the biggest head-scratcher is the 28-33 record for the team with the better regular-season record (feel free to use those tidbits in bar bets; just cut me in on a percentage).

I already identified a couple of reasons why these records are unreliable, and in some cases downright misleading. One thing I noticed is that in many cases, the two teams are pretty evenly matched in a particular category (like the aforementioned Yankees and Red Sox). Cases like that show up in the won-lost records above, though they don’t really tell us anything (especially because the numbers are affected by a myriad of factors, like home parks, schedule vagaries, etc.).

So let’s throw them out; for instance, instead of looking at all cases where one team steals more bases than the other, even if it’s only by a couple of bags, let’s look only at cases where one team steals 30 more than the other. This will leave us looking at just the series where the teams had a meaningful difference in skills.

I didn’t pick the number 30 out of thin air; I went through each category and found the gap that would weed out roughly half of the series. After doing that, here is the same table, with one new column (the minimum spread between teams for the series to be included here), and now sorted by playoff success:

Statistical Category	Minimum Gap	Series Won-Lost
Hits allowed (fewer)	70	24-9
Errors committed (fewer)	12	24-10
Batters strikeouts (fewer)	65	22-10
Pitchers shutouts	2	22-11
Runs allowed	55	22-12
Home runs allowed	20	20-11
Stolen bases	30	21-12
Complete games	3	22-14
ERA	0.4	20-13
Defensive efficiency *	0.01	19-13
Pitchers strikeouts	90	18-13
Stolen base attempts (more)	35	18-13
Won-lost record	5	18-14
Saves by team leader	9	18-15
Run ratio (RS/RA)	0.1	16-15
Triples	5	17-16
Batters walks	60	16-16
On-base percentage	0.012	15-16
Bullpen ERA *	0.3	16-18
Pitchers walks (fewer)	50	15-17
Batting average	0.01	15-18
Saves	5	15-18
Fielding double plays	12	15-18
Net stolen bases (SB-2*CS)	20	15-20
Slugging percentage	0.025	12-18
Doubles	18	13-20
Stolen base percentage	0.05	14-22
Runs scored	65	12-19
Home runs	32	13-21
Caught stealing (fewer)	10	12-21
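The gap filter behind this table can be sketched as below. Using the median gap as the cutoff that removes roughly half of the series is an assumption about how to automate that choice (the gaps in the table are rounded figures). The stolen base totals here are invented purely for illustration, and the sketch covers only categories where a higher total is better.

```python
from statistics import median


def half_gap(series):
    """Gap that about half of the series meet or exceed (the median gap)."""
    return median(abs(w - l) for w, l in series)


def tally_with_gap(series, min_gap):
    """(won, lost) for the category leader, counting only series whose
    regular-season gap is at least `min_gap`. Ties are always skipped."""
    won = lost = 0
    for winner_value, loser_value in series:
        gap = abs(winner_value - loser_value)
        if gap == 0 or gap < min_gap:
            continue  # teams too evenly matched to tell us anything
        if winner_value > loser_value:
            won += 1
        else:
            lost += 1
    return won, lost


# Made-up stolen base totals as (series winner, series loser); not real data.
stolen_bases = [(120, 80), (95, 97), (60, 110), (140, 100), (88, 85), (70, 101)]
gap = half_gap(stolen_bases)
print(gap, tally_with_gap(stolen_bases, gap))  # 35.5 (2, 1)
```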

Now some of the quirks are eliminated; the team with the better regular-season record goes 18-14, about what we’d expect. But the most striking thing about this list is that it supports the old adages: you win in the post-season with pitching, fielding, and speed. Eleven of the 12 most important categories (by this crude measure) demonstrate skill on the mound, in the field and on the bases. Obviously some of those categories are inter-related (a gopherball is not just a HR allowed, but also a hit allowed, at least one run allowed and it ruins a shutout), but their dominance on this list is remarkable.

Only after these categories do we get to the measures of overall team success (won-lost record and run ratio) and then all the batting categories. Batting prowess (and power specifically) looks completely irrelevant, as the teams that score more runs and hit for more power (whether measured by home runs, doubles, or slugging percentage) have done quite poorly in the post-season.

Interestingly, the only batting category that shows up as a strong indicator of post-season success is batters’ strikeouts, the one category that sabermetricians have long called meaningless. I initially didn’t consider this alarming, because HR and K are highly correlated; the players who knock a lot of balls over the fences also whiff more than their share of the time. So strikeouts and home runs would have to balance, I thought; once you know how poorly homers show up on the list, it’s not additionally surprising that contact hitting shows up so high.

But then I looked at the data, and while strikeouts and home runs are strongly related for individuals, that’s not the case for teams; the team with more home runs than its opponent also struck out more often in only 33 of the 63 series (another way to put this: the correlation between home runs and strikeouts among playoff teams is only .091, virtually nothing). So maybe the statheads have been missing something.
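Both checks are easy to reproduce. The sketch below assumes a plain Pearson correlation over team totals (the article does not say which correlation measure was used), and the numbers in it are invented for illustration rather than taken from real playoff teams.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def same_leader_count(matchups):
    """Count series where the team with more HR also struck out more.

    `matchups` holds ((hr_a, so_a), (hr_b, so_b)) pairs, one per series.
    """
    return sum(
        1
        for (hr_a, so_a), (hr_b, so_b) in matchups
        if (hr_a - hr_b) * (so_a - so_b) > 0
    )


# Made-up team totals; not real data.
team_hr = [200, 150, 180, 160, 220]
team_so = [1100, 950, 900, 1000, 1050]
matchups = [((200, 1100), (150, 950)), ((180, 900), (160, 1000))]
print(round(pearson(team_hr, team_so), 3), same_leader_count(matchups))
```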

The other stats at the bottom are times caught stealing and stolen base percentage (which are obviously related). Teams that run themselves into outs during the regular season are winning in the post-season; so it looks like maybe speed and daring are more important than judicious decision-making.

Interestingly, a strong bullpen (as measured by bullpen ERA) and a dominant relief ace (represented by a big gap in saves by the team leader) don’t show up as important. That is surprising, since we know that the top closers pitch far more innings in the post-season than they do in the regular season, and thus should have a greater impact.

All of this runs counter to most “stathead” thinking. The usual thought process is that a run scored is just as valuable as a run saved, to close approximation. Well, maybe that approximation breaks down in the post-season. More importantly, it does appear that the stronger pitching in the playoffs neutralizes some offenses, and differences in pitching, fielding and speed show up more.

Now that we know what recent series winners have excelled at, it’s tempting to turn those into recommendations, or to grade actual teams against those criteria (note that Beane’s most recent incarnation of the Athletics placed among the league’s elite in many of the most telling categories). However, it’s important to remember that correlation does not imply causation; recent winners have had certain traits in common, but it doesn’t mean they won because of those traits.

I’m not pretending that this article answers the question posed in the title. This is only a start down the road to determine how to build a better playoff team. But I’ll follow this up with additional tools that try to answer it, a more rigorous statistical analysis of this data, and a look at how these findings make Bill James look downright prophetic.

References & Resources
Most of the statistics used here come from The Baseball Archive’s fabulous Lahman database. A few were culled from the equally-fabulous Baseball-Reference.com.

I appreciate the help from other members of The Hardball Times staff in choosing what categories to examine and how to present the data. Special thanks to Craig Burley and Studes.

Bullpen ERA: I didn’t have access to starting/relieving splits for this entire period. Instead, I approximated bullpen ERA by using the composite ERA of all pitchers who made at least 2/3 of their appearances that season in relief. This means that each team’s bullpen ERA misses a few relief appearances and catches a handful of starts, but shouldn’t impact the outcomes.
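A minimal sketch of that bullpen approximation follows. The dictionary keys `G`, `GS`, `ER` and `IPouts` are modeled on the Lahman pitching table but should be treated as assumptions, and the sample staff is invented.

```python
def bullpen_era(pitchers):
    """Composite ERA of pitchers who made at least 2/3 of their
    appearances in relief. Returns None if nobody qualifies."""
    earned_runs = outs = 0
    for p in pitchers:
        relief_games = p["G"] - p["GS"]
        # relief_games / G >= 2/3, written without floating point
        if p["G"] > 0 and 3 * relief_games >= 2 * p["G"]:
            earned_runs += p["ER"]
            outs += p["IPouts"]
    if outs == 0:
        return None
    # ERA = 9 * ER / IP, and IP = outs / 3
    return 27 * earned_runs / outs


# Made-up staff; not real data.
staff = [
    {"G": 33, "GS": 33, "ER": 90, "IPouts": 630},  # starter: excluded
    {"G": 30, "GS": 12, "ER": 55, "IPouts": 360},  # swingman: 60% relief, excluded
    {"G": 60, "GS": 0, "ER": 20, "IPouts": 210},   # reliever
    {"G": 45, "GS": 3, "ER": 30, "IPouts": 240},   # reliever with a few spot starts
]
print(bullpen_era(staff))  # 3.0
```

The last pitcher shows the "handful of starts" problem: his three starts are counted in the bullpen total because he clears the 2/3 threshold.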

Defensive efficiency rating (DER) measures how effectively a defensive unit turns balls in play into outs; basically it’s (non-strikeout) outs divided by balls in play. I didn’t have access to ball-in-play data, so I approximated DER as (IP*3-K)/(IP*3+H-HR-K-DP). Since I used the same approximation for every team, and we’re only interested in the relative differences between teams, it’s unlikely that this approximation affected the results at all. It’s impossible to completely separate the impacts of pitchers from fielders; DER is an attempt to isolate the fielders’ impact, while K, BB and HR allowed measure fielding-independent outcomes.
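The DER approximation translates directly into code. The function name `approx_der` and the team totals below are made up for illustration.

```python
def approx_der(ip, k, h, hr, dp):
    """Approximate DER as (IP*3 - K) / (IP*3 + H - HR - K - DP)."""
    outs = ip * 3
    non_strikeout_outs = outs - k
    # Estimated balls in play, following the formula in the text.
    balls_in_play = outs + h - hr - k - dp
    return non_strikeout_outs / balls_in_play


# Made-up team totals; not real data.
print(round(approx_der(ip=1450, k=1000, h=1400, hr=150, dp=150), 3))  # 0.753
```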
