Whither the Closer? Part Two
by Steve Treder
May 10, 2005
All righty then. We’ve boldly raised several questions over the past few weeks, and meekly failed to provide any definitive answers. So this week it’s put-up time.
So what exactly were all those questions, again?
Regarding the LOOGY:
Scourge of the modern era?
Bizarre mutation metastasizing in today’s bloated, hyper-specialized bullpen?
Okay, that’s really one question, which boils down to this: is the modern usage of LOOGYs dumb?
And this question:
Or perhaps … clever innovation by progressive managers? Sensible adaptation of resources to meet contemporary challenges?
Boils down to: is the modern usage of LOOGYs smart?
Our other questions were regarding the Closer:
1. Is this usage pattern a sensible, optimal way to deploy what is almost always the most highly skilled (and certainly highest-paid) pitcher in the bullpen?
2. Will we be seeing a similar stat line from the key reliever in the average bullpen 10 years from now, or 20 years from now? If not, how might the usage pattern differ from this one?
Asking the Questions is the Easy Part ...
All right. To determine some answers, let’s start with the interrelationship of LOOGYs and Closers. The reason we’ve intertwined the LOOGY series and the Closer series is hopefully clear: it’s necessary to consider both phenomena together, as two elements within the single dynamic of modern bullpen usage. Here’s why:
- The dramatic change in bullpen deployment from the mid-1980s to the early 1990s consisted of severely cutting back on the number of innings thrown by each team’s best reliever (formerly called the Ace, now called the Closer).
- This development exposed many more innings needing to be covered by the rest of the bullpen (since starters weren’t going any longer, and indeed were being worked less and less deeply into games).
- This increased innings load for the bullpen was not filled by having the next-best reliever or relievers (in modern parlance, Setup Men) handle longer stints (indeed their stints have also gotten significantly shorter since the 1980s).
- Instead, the increased innings load has been filled by adding more relief pitchers to the bullpen, and working all the pitchers in a high-appearance, short-stint mode. These additional relievers, the new commodity that hadn’t previously existed, have been embodied primarily in the form of LOOGYs.
The Zero-Sum Game of Roster Spots
Carrying additional relievers in the bullpen necessitates carrying fewer position players on the bench, because rosters haven’t gotten any bigger. This is a significant issue. Advocates of the modern bullpen are often quick to dismiss the importance of removing a couple of bats and gloves from the manager’s tool box (who needs another weak-hitting backup catcher? or words to that effect), but to do so is to exhibit unimaginative thinking.
The fact that teams might not make good use of a deeper bench doesn’t mean they cannot. It seems clear that there's often more to be gained than lost in striving to do so, or at the very least not eliminating a good deep bench as a possibility. History is filled with examples of teams that made very creative, productive use of 7-man (or even larger) benches, not only with useful extra bats but also with extensive use of defensive-replacement specialists. Taking a quick look through the years just before bullpens began to grow at the expense of benches, we can see lots of teams demonstrating complex multi-position arrangements that are effectively impossible to operate under the modern 11- or 12-man pitching staff protocol: just as examples, the 1979 Orioles, the 1980 Yankees, the 1984 Tigers, the 1984 Phillies, the 1985 Twins, the 1985 Reds, and the 1987 Giants.
Looking back in earlier periods, examples abound as well: among many, the 1971 Pirates, the 1969 Mets, the 1964 Phillies, the 1961 Dodgers, the 1958 Indians, and the 1953 Yankees.
Obviously not every bench is as good as these, but the point is that a shorter bench removes a range of choices from the manager’s disposal. It’s easier to platoon at multiple positions with a deeper bench, because multiple pinch-hit substitutions can occur without depleting the bench of the essential reserves that must be kept on hand in case of extra innings and/or injury (always a catcher, and often a middle infielder as well). Bear in mind that platooning at just one position is likely to gain the left-right platoon advantage in at least 350 or 400 more plate appearances than going with a right-handed-hitting regular; bear in mind, too, that the typical LOOGY enjoys the platoon advantage in significantly fewer than 100 plate appearances a year.
In addition to the left-right platooning issue, with a deeper bench it’s easier to pinch-hit for weak-hitting good-glove regulars, and it’s easier to deploy late-inning defensive replacements, and it’s easier to pinch-run in key late-inning high-leverage situations. Whether the cost of all this is judged to be great or small, it does exist, and this cost is paid by every modern team.
The cost is expended in exchange for what the teams obviously presume to be a better bullpen. Clearly it’s a deeper bullpen, but does it in fact tend to actually be better than the earlier styles of bullpen, and sufficiently better to justify its cost? To answer this we should consider the initial motivations for the introduction of the Closer, and the dramatic increase in the deployment of the LOOGY, in the late ‘80s and early ‘90s.
Pitchcounts, Workloads, and Injuries
The Closer model was not developed because there was a sudden general aha! that lopping 30+ innings off the contribution of each of the best relievers in the ‘pen, and handing those innings off to lesser-quality pitchers, would in itself enhance run prevention, at least in the short run. The opposite result -- run prevention degradation -- is the clearly logical likely outcome, again at least in the short run. No, the Closer was derived as a means of limiting Ace Reliever workload in the belief that doing so would result in injury prevention of this important, and increasingly expensive, resource.
Remember that it was the injury/fatigue issues encountered by Bruce Sutter in 1977-78 that prompted the modification of his usage pattern into something anticipating the Closer. Bear in mind as well that the mid-to-late 1980s was the period in which the workloads of top starting pitchers were substantially reduced as well, a development we examined at length here and here. The final step from the Sutter-inspired 90-to-100-inning pattern to the full-blown 70-to-80-inning Closer pattern took place at the same time, and was obviously driven by the same objective: the preservation of the health of a team’s best pitchers. The logic was that the run prevention lost by limiting seasonal innings of both top starters and top relievers would be repaid by the run prevention gained by reducing injuries (and/or fatigue-driven ineffectiveness).
This is all entirely sensible, theoretically. However, empirically, there are two problems with it. First, nearly two decades into the strict-limitation-on-top-pitcher-workload era, we have yet to discover any compelling evidence that the rate of injuries to pitchers has actually been reduced, as we discussed here ("The Actual Injury Rate Question," about two-thirds of the way down). Second, as has been argued by Bill James and Don Malcolm, it may well be the case that unduly limiting the workload of fully mature pitchers actually increases their susceptibility to injury, instead of reducing it. No one knows, of course, but at any rate the demonstration of injury-reduction efficacy through the modern limitation on top pitcher workloads has been anything but clear.
The LOOGY as Second-Order Effect
But it is clear that it was the quest for injury prevention (which is, of course, an economic consideration at least as much as an on-field tactical one) that drove the Closer innovation. Such was not the case with LOOGYs, who weren’t highly-paid star attractions, but were typically either last-phase-of-the-career veterans or marginal talents. The LOOGY explosion occurred as a consequence of the Closer revolution: fewer innings from the top reliever (and top starters) required more innings from the rest of the bullpen, and additional LOOGYs were invoked as a major part of the effort to fill the gap. If secondary arms are going to be pitching the sixth, seventh, and eighth innings anyway, so the logic would seem to be, we might as well try to gain the platoon advantage as much as possible. (The notion that transferring players from the dugout to the bullpen to fill these LOOGY roster spots likely results in a net reduction of overall platoon-advantaged plate appearances seems not to have been a concern.)
The Modern Offensive Boom: Cause or Consequence?
One argument that’s often heard in defending the modern pitching staff usage model is that it’s been a response, a rational reactive maneuver to counter the very high-scoring conditions that have been in place since the 1993 season. While there is no reason to doubt that high run volume serves as a brake on high-inning workloads – more runs means more batters, more pitches, and more high-stress pitches from the stretch – the problem with this argument is that it’s factually incorrect: the revolution in pitching staff management preceded the mid-1990s offensive boom. By any metric – the workloads of top starters, the usage patterns of top relievers, or the deployment of LOOGYs – not only the trend, but to a great extent the full fruition of the complete new model, was put in place in 1988-1992, a low-scoring period, before the offensive explosion that began in 1993. This is further evidence that the primary driving factor has been the pursuit of economic risk avoidance regarding big-star pitcher contracts (whether successfully achieved or not), more than a tactical response to conditions on the field.
The offensive bonanza that has occurred since 1993 must therefore be considered, at least potentially and partially, as itself a consequence of the historically low proportion of innings worked by the game’s best pitchers (both starters and relievers), as much as a cause. Obviously this is chicken-and-egg territory, but to disregard this point would seem to be ignoring a very logical possibility.
Here's another fact to consider in this regard, and it gets back to the issue of roster construction. Twenty years ago, there were 26 major league teams, and the typical roster configuration included 10 pitchers and 15 position players; that added up to 260 pitchers and 390 position players. Today there are 30 major league teams, and the typical roster configuration is 12 pitchers and 13 position players -- that adds up to 360 pitchers, but still only 390 position players.
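For anyone who wants to check that arithmetic, here is a minimal sketch in Python. It uses only the team counts and roster splits quoted above; the function and variable names are just illustrative.

```python
# Roster counts exactly as stated in the paragraph above; no new data.
def league_headcount(teams, pitchers_per_team, hitters_per_team):
    """Return (total pitchers, total position players) across the league."""
    return teams * pitchers_per_team, teams * hitters_per_team

# Twenty years ago: 26 teams, 10 pitchers and 15 position players apiece.
then = league_headcount(teams=26, pitchers_per_team=10, hitters_per_team=15)
# Today: 30 teams, 12 pitchers and 13 position players apiece.
now = league_headcount(teams=30, pitchers_per_team=12, hitters_per_team=13)

added_pitchers = now[0] - then[0]
added_hitters = now[1] - then[1]

print("then:", then)                     # then: (260, 390)
print("now: ", now)                      # now:  (360, 390)
print("added pitchers:", added_pitchers) # added pitchers: 100
print("added hitters: ", added_hitters)  # added hitters:  0
```

All 100 of the jobs created by four expansion teams have, on this accounting, gone to pitchers.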
I'm as strong an advocate as anyone that the talent pool feeding major league baseball is larger and more vigorous today than ever before; I have little doubt about the talent pool's 20-year growth capacity to accommodate the staffing of four new teams with 100 players without a loss in overall quality of play. However, I'm quite doubtful that every single one of the best of these 100 new players is a pitcher. I'm quite doubtful that adding something close to 100 new pitchers to the major league mix, and something close to zero new position players, has been achieved while keeping the relative quality of pitching and hitting in perfect balance. It seems very plausible that the extremely imbalanced infusion of additional talent into the game has served, to some degree, to render the hitting talent in MLB the most exclusive and high-caliber it has ever been, while the pitching talent, being significantly less exclusive, has struggled to keep up. In this way, the mainstream's lament that expansion has "watered down pitching staffs," while generally not a very well-informed or well-thought-out explanation of events, ironically has an element of truth to it -- though it isn't expansion per se that has been doing the watering down.
ERA+, and Other Rate Stats
We didn't focus much attention on it last week, but we did include average ERA+ performance in our examination of top Save and top Save Plus Win producers since 1960. And one clear trend in the ERA+ data we presented is that over time, as average stint lengths have gotten shorter, the average ERA+ of top relievers has gotten better.
This is a good thing, isn't it? Well, of course it is -- as far as it goes, and understanding what it means. It is, unambiguously, a demonstration of the principle that it's easier to pitch effectively in shorter stints. On a per-inning basis, there's no reason to doubt that Closers are the most effective group of pitchers yet assembled. That's the good news.
Here's the not-so-good news: effectiveness on a per-inning basis isn't the end of the story. Baseball games are nine innings long, no matter what portion the Closer handles. In a 162-game season, a pitching staff has to handle something close to 1,450 innings, no matter what portion the Closer handles. No matter how effectively he pitches his innings, those that he doesn't pitch will be pitched by someone else. Therefore the improvement in rate-stat effectiveness in the transformation from Ace Reliever to Closer has to be balanced against the significant reduction in innings contribution.
And here's something important to consider in that light: ERA+ is another one of those zero-sum games. By definition, the league ERA+ every year balances out to exactly 100. So, since the average ERA+ of the key reliever in the bullpen has improved, it's inescapable, it's a mathematical necessity, that the average ERA+ of other pitchers on the staff has declined, to an exactly proportional degree. A cost of coddling the Closer, protecting him in a usage pattern in which he can happily optimize his rate stats, is that other, lesser pitchers are required to pick up the slack. Those lesser pitchers are, of course, the pitchers who weren't in the major leagues 20 years ago -- the 11th and 12th pitchers on each staff, a high proportion of them LOOGYs -- as well as the one and only member of the pitching staff who is extended into a greater innings workload than he typically was 20 years ago: the fifth (otherwise known as the "worst") starter. Thus it isn't at all clear that improved ERA+ performance from the top reliever in the bullpen, given that his innings contribution is severely reduced, yields a net improvement in overall pitching staff performance.
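To put rough numbers on the rate-versus-volume tradeoff, here is a hedged sketch in Python. Every input is hypothetical and chosen only for illustration: a 4.50 league ERA, an Ace Reliever working 110 innings at a 150 ERA+, and a Closer working 75 innings at a 180 ERA+. It also assumes a simplified ERA+ with no park adjustment (ERA+ = 100 x league ERA / ERA), and it measures value as earned runs saved relative to a league-average pitcher working the same innings.

```python
# All inputs below are hypothetical illustrations, not figures from the article.
LEAGUE_ERA = 4.50  # assumed league ERA

def era_from_era_plus(league_era, era_plus):
    """Simplified ERA+ (no park adjustment): ERA+ = 100 * league ERA / ERA."""
    return 100.0 * league_era / era_plus

def runs_saved(league_era, era_plus, innings):
    """Earned runs saved versus a league-average pitcher over the same innings."""
    era = era_from_era_plus(league_era, era_plus)
    return (league_era - era) * innings / 9.0

# Hypothetical Ace Reliever: more innings, lesser rate.
ace_saved = runs_saved(LEAGUE_ERA, era_plus=150, innings=110)
# Hypothetical Closer: fewer innings, better rate.
closer_saved = runs_saved(LEAGUE_ERA, era_plus=180, innings=75)

print(f"Ace Reliever: {ace_saved:.2f} runs saved")  # 18.33
print(f"Closer:       {closer_saved:.2f} runs saved")  # 16.67
```

With these made-up inputs the better rate stat does not make up for the lost innings. Different inputs could tip the balance the other way, which is the point: the rate stat alone doesn't settle the question.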
Another point that’s frequently brought forth as a point in favor of the Closer model is that while the Closer’s overall innings load is limited, those innings he does work yield an extremely high leverage toward the outcome of close games. This is another argument that would be persuasive if it only had, you know, the benefit of factuality behind it.
The fact, as Studes’ graphically-presented data so vividly illustrates here, is that the Closer usage pattern actually doesn’t deliver especially efficient P value (a measure conceptually very similar to Tangotiger’s Leverage Index). As Studes also shows (and his Bullpen Book documents), in the modern bullpen other relievers, including LOOGYs, often wind up with P values comparable to the Closer's.
Sometimes they even exceed it. In the 2004 Cincinnati Reds' bullpen, Closer Danny Graves (41 Saves) had a P value of .081, while Setup Man Todd Jones' (1 Save) was .090. For the 2003 Kansas City Royals, Closer Mike MacDougal (27 Saves) had a P value of .058, and Setup Man Jason Grimsley (0 Saves) came in at .073. In 2002, the P value of New York Mets' Closer Armando Benitez (33 Saves) was .079, while Setup Men David Weathers (0 Saves) and Scott Strickland (2 Saves) were at .088 and .079 respectively, and LOOGY Mark Guthrie (1 Save) had a P value of .072.
One-run, one-inning Save opportunities are extremely high-leverage situations, and Closers are appropriately used there. That’s the good news. The bad news comes in two categories: first, late-inning tie games are nearly as high-leverage as one-run Save opportunities, and Closers are rarely used then; and second, two- and three-run lead Save opportunities aren’t nearly as high-leverage, and Closers are used then all the time. It’s easy to imagine a usage pattern for the top reliever that would yield more leverage than the Closer mode, and simultaneously take a lot of the high-leverage appearances away from LOOGYs and other varieties of less-than-imposing Setup Men, and there's no getting around the fact that such a usage pattern would resemble that of the Ace Reliever or Fireman.
Answer Time: LOOGYs
So, let’s get to offering some opinions, already.
Question: Is the modern usage of LOOGYs dumb, or is the modern usage of LOOGYs smart?
Answer: It isn’t completely dumb, but it has more dumb than smart in it. I certainly wouldn’t say that no team should ever deploy a LOOGY; one can easily conceive of a combination of circumstances (skill profile of the particular southpaw, skill profile of the particular bullpen) in which using a lefty as a short-stint specialist makes sense. What doesn’t make sense is allowing LOOGYs to consume two slots in the same bullpen. What doesn’t make sense is failing to appreciate the cost of roster spots. What doesn’t make sense is me-too orthodoxy, the mindless conformity of the pell-mell rush to deploy ever-more LOOGYs in ever-more Hard-Core intensity that has occurred over the past decade.
An anecdote from my favorite team, the San Francisco Giants, perhaps best illustrates the folly of unexamined LOOGYism. At the outset of the 2003 National League Division Series, the Giants determined their 25-man roster to face the Florida Marlins. Despite the fact that the Marlins were an almost entirely right-handed hitting team, with nothing resembling a lefty slugger, the Giants:
- Went with a 12-man pitching staff and a 5-man bench
- Decided that not one but two of the relievers on the staff needed to be LOOGYs
This, despite the fact that neither of these LOOGYs – Scott Eyre and Jason Christiansen – was an especially good pitcher. (Neither would prove to be a factor in the series, working a combined total of one-third of an inning.) This, despite the fact that in constructing the roster this way, the Giants opted not to carry Eric Young – a pretty good hitter, versatile defensively, a fast and accomplished baserunner, who could add value in the field, at the plate, or on the bases.
In Game Four of that series, the Giants were trailing, but rallied in the ninth inning. A win in this game would tie the Series at two games apiece, and send the Giants back home to San Francisco for the deciding contest, with Jason Schmidt (who had tossed a dominating 3-hit shutout in the opening game) all set to go. In that ninth inning, the tying run for the Giants was thrown out at home plate in a close play with two outs, ending the series in defeat. The runner thrown out was J.T. Snow, a slow-footed first baseman; a runner with good speed – oh, say, a pinch-runner, someone like Eric Young – would almost certainly have scored the run.
Answer Time: Closers
Question: Is this usage pattern a sensible, optimal way to deploy what is almost always the most highly skilled (and certainly highest-paid) pitcher in the bullpen?
If there were good reason to believe that the strictly limited innings workload of the Closer pattern actually does have a significant effect on injury reduction, then that would be a good argument in support of the Closer, at least on economic grounds. However, there is no clear evidence that injury rates of Closers are meaningfully lower than those of relievers used in the more demanding modes of the past. Given that, the only case for the Closer becomes an in-game tactical case, and I really don’t see a good one. Transferring a large number of innings from the best pitchers in the bullpen to the worst – indeed, adding extra marginal pitchers to the bullpen to accommodate the shift – is just not a compelling blueprint for bullpen effectiveness.
The issue isn’t that no team should ever deploy a reliever in the Closer pattern. As with LOOGYs, it’s certainly possible to conceive of a combination of player/team circumstances that render it a reasonable alternative. The issue is that such a combination doesn’t apply to all teams all the time (or even many teams much of the time), and yet the Closer is, in the current era, essentially deployed by all teams all the time. The problem isn’t so much the Closer pattern itself – the minuses generally outweigh the plusses, but there are some plusses – as the blind fealty to it.
The horror bordering on hysteria that swept the baseball world when the Boston Red Sox toyed with some deviation from the Closer orthodoxy in 2003 – apparently among insiders as well as in the popular media – was vastly more emotional than rational. It was a good indicator of the degree to which rigorous adherence to the Closer model passed the point of being a carefully reasoned choice somewhere around a decade ago, and became instead an article of faith: faith in the causal link between the tightly limited modern workload and a reduced rate of injury (despite no data to support it), and faith in the crucial importance of “the Closer mentality” (despite the fact that such a concept was unheard of over about 120 or so years of major league baseball history).
Crystal Ball Time
Question: Will we be seeing a similar stat line from the key reliever in the average bullpen 10 years from now, or 20 years from now? If not, how might the usage pattern differ from this one?
As we saw last week, the Closer pattern has been the most stable, and the most universally applied, of all top reliever modes throughout history. This point would suggest that it’s here to stay. In a discussion on this topic on BTF a few weeks ago, a well-informed and reasonable poster opined that, all things considered, he doesn’t expect to see significant change in the Closer model in the coming decade or two.
I respectfully disagree. Here’s why.
Let’s review the stat trends we lined up at the end of last week: overall major league Saves per game, the percentage of those Saves garnered by each team's top Save producer, overall major league Complete Games per game, and overall major league Runs per game.
Year   Sv/G    Sv/T Sv   CG/G    R/G
1986   23.9%   58.3%     13.8%   4.41
1987   23.1%   53.3%     13.3%   4.72
1988   25.0%   63.9%     14.8%   4.14
1989   25.4%   66.6%     11.5%   4.13
1990   26.4%   62.6%     10.2%   4.26
1991   26.9%   61.6%      8.7%   4.31
1992   26.3%   65.5%      9.9%   4.12
1993   26.3%   72.2%      8.2%   4.60
1994   24.3%   64.5%      8.0%   4.92
1995   24.9%   73.5%      6.8%   4.85
1996   24.6%   74.4%      6.4%   5.04
1997   25.1%   70.4%      5.9%   4.77
1998   26.0%   72.6%      6.2%   4.79
1999   25.1%   73.2%      4.9%   5.08
2000   24.2%   73.1%      4.8%   5.14
2001   24.9%   75.0%      4.1%   4.78
2002   25.2%   80.6%      4.4%   4.62
2003   24.7%   70.1%      4.3%   4.73
2004   25.3%   75.2%      3.1%   4.81
What do we see here? Closers have been used more and more, garnering an ever-greater proportion of team Saves – but Saves overall aren’t being produced at a meaningfully improved rate, if at an improved rate at all. This, despite the fact that Complete Games are dramatically plummeting – providing more opportunities to produce Saves than ever before. And as we examined above, the trends in pitching staff deployment led the mid-1990s offensive boom, rather than responding to it, and the continuing changes appear to have had little, if any, influence on scoring levels.
The continuing changes are the key here. There are obviously ways in which the Closer mode has been quite stable since the early 1990s. But there are other ways in which overall pitching staff deployment patterns have been anything but stable in this period: just since 1993, eight new all-time records have been set for fewest major league Complete Games, including yet another in 2004. Major league starters are less than half as likely to go nine innings now than they were just a decade ago. The idea that we’ve reached stasis in pitching staff deployment, that major elements are no longer in flux, is just not borne out by the facts.
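The readings taken from the table can be checked with a short Python sketch. It uses only the Sv/G and CG/G columns as printed, so it inherits their rounding; a "record" here means the lowest value among the seasons shown, on the assumption that no season before 1986 was lower. The helper names are illustrative.

```python
# Sv/G and CG/G columns copied from the table above (percent of games).
years = list(range(1986, 2005))
saves_pct = dict(zip(years, [
    23.9, 23.1, 25.0, 25.4, 26.4, 26.9, 26.3, 26.3, 24.3, 24.9,
    24.6, 25.1, 26.0, 25.1, 24.2, 24.9, 25.2, 24.7, 25.3,
]))
cg_pct = dict(zip(years, [
    13.8, 13.3, 14.8, 11.5, 10.2, 8.7, 9.9, 8.2, 8.0, 6.8,
    6.4, 5.9, 6.2, 4.9, 4.8, 4.1, 4.4, 4.3, 3.1,
]))

def record_lows(series, after_year):
    """Years later than after_year whose value undercuts every earlier year shown."""
    lows = []
    for year in sorted(series):
        earlier = [series[y] for y in series if y < year]
        if year > after_year and earlier and series[year] < min(earlier):
            lows.append(year)
    return lows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

cg_record_years = record_lows(cg_pct, after_year=1993)
cg_ratio_2004_vs_1994 = cg_pct[2004] / cg_pct[1994]
saves_1988_92 = mean(saves_pct[y] for y in range(1988, 1993))
saves_2000_04 = mean(saves_pct[y] for y in range(2000, 2005))

print("new CG lows after 1993:", cg_record_years, len(cg_record_years))
print("2004 CG rate as a share of 1994:", round(cg_ratio_2004_vs_1994, 3))
print("avg Sv/G 1988-92 vs 2000-04:", round(saves_1988_92, 2), round(saves_2000_04, 2))
```

On the rounded table figures, eight seasons after 1993 undercut every earlier season shown, and the 2004 Complete Game rate is well under half the 1994 rate. The average Saves rate for 2000-04, meanwhile, is no higher than it was for 1988-92.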
Given that pitching staff usage hasn't stabilized, and given the authentic shortcomings of the Closer/LOOGY paradigm which we've now had more than a decade to observe, the likelihood of the continuous change process suddenly coming to a standstill at this precise juncture must be considered exceedingly low.
Here’s another thing: the bullpen deployment of three prominently “sabermetric” teams over the past few years doesn’t indicate satisfaction with the Closer status quo. We’re all familiar with Boston’s egregiously mis-named and appallingly misperceived “Bullpen by Committee” experience of early 2003; what the mainstream media doesn’t appear to have noticed is that the Red Sox have yet to retreat to the garden-variety Closer model. The usage patterns of both Byung-Hyun Kim in 2003 and Keith Foulke in 2004 demonstrate a longer average stint length and a less-strict Save focus than the modern norm.
The Oakland Athletics used Billy Koch in 2002, Foulke in 2003, and Octavio Dotel in 2004 in modes yielding slightly more innings per game (1.12, 1.20, and 1.13 respectively) and dramatically more Wins (11, 9, and 6 in a partial season) than seen in the average Closer.
The Los Angeles Dodgers in 2004 used consecutive-Save king Eric Gagne in a noticeably different manner than that of previous seasons, garnering him sharp upticks in innings per game (1.18) and Wins (7).
These three teams may well be on the vanguard of a broader impending move away from the Closer straitjacket. Or, of course, they in particular may not, but in any case their willingness to “touch the third rail” of bullpen convention, if only tentatively, suggests that the long era of remarkable conformity may be reaching its culmination. History strongly indicates that, sooner or later, the only constant is change. There are numerous reasons to expect that this eternal pattern will reveal itself here again.
To assert this is not to say that what will likely happen is a return to the precise Ace Reliever pattern, or that that’s what should happen. Far more probable, and far more reasonable, is the innovative development of a new pattern of bullpen deployment, something we haven’t quite seen yet. As such, we can’t describe it today in detail. But to say that a new development of some kind won’t likely occur is to make a statement that would have been proven wrong at every previous point in history. Perhaps this is the historian’s bias, but I don’t think we’ve yet reached the point at which the wheels of change are no longer turning.
Steve Treder can often be found spending way too much time talking baseball at Baseball Primer. He welcomes your questions and comments via e-mail.