Over the past two months, I’ve been delving into manager ejections. In February, I looked at the most and least-often ejected managers, and invented the Eject+ metric, normalizing ejection frequency to league rates. In March, I measured ejection rates against on-field success and longevity in the dugout, finding that the fires of outrage do dim with time and that Joe Torre had more similarity to Billy Martin than just success in pinstripes.
Today, in the final installment, I’ll be looking at the effect of manager ejections on the games themselves. How much likelier are teams to be losing, and to lose, when their manager gets run out? Do teams out-perform or under-perform their expectations once the manager is gone? And are there managers who give their teams a boost by getting the heave-ho?
How Things Go When They’re Going Badly
The list of managers I studied in this part of the series isn’t as star-heavy as the one in the previous chapter, because I had matters of balance to consider. Picking all famous managers would mean picking mostly winning managers. I worried that this would skew the results and make getting ejected look like more of a winning strategy than it is.
I found another way to adjust for this later on, which I’ll detail below. My starting solution, though, was to balance the managers by winning percentage, games and ejections. For each big-name winning manager, I would find someone (usually more than one) who lost by a similar margin in about as many games with about as many thumbs. The method wasn’t perfect, but it worked out pretty well.
I studied 25 managers, eight with winning records, 16 with losing records. (Remember, losing managers don’t last as long; you need more of them to balance the winners.) The 25th was a gift. With 3,060 regular-season games in his career through the 2013 season, Bruce Bochy has a lifetime record of 1,530-1,530. No way was I going to neglect more than 3,000 games and 60 ejections that came pre-balanced.
As I used Win Expectancy in part of the study, they had to be relatively recent managers to fall into Baseball-Reference’s coverage of that statistic. The earliest any of them managed was 1960. This is a pity in one sense, as I’d love to see how John McGraw or Leo Durocher shook out. On the other hand, the group I have is probably more indicative of current trends, and thus tells you more about the game today.
These are the managers I used, listed from most games managed to least. The short-timers, usually losing managers, were picked mainly to balance long-time managers, not generally for their own traits. Most do have a pretty healthy ejection rate.
Bobby Cox, Gene Mauch, Lou Piniella, Bruce Bochy, Dick Williams, Earl Weaver, Mike Hargrove, Billy Martin, Phil Garner, Jim Riggleman, Jeff Torborg, Don Baylor, Buddy Bell, Hal McRae, Larry Bowa, Frank Lucchesi, Larry Rothschild, Alan Trammell, Dave Trembley, Brad Mills, Terry Bevington, Dale Sveum, Sam Perlozzo, A.J. Hinch, Tim Johnson.
This group’s overall record is 19,210 wins and 19,210 losses, as balanced as you can get, and they had 834 career ejections. In those 834 games, their teams went 293-541, for a winning percentage of .3513. I expected a losing mark in ejection games, but this was pretty harsh.
As the last installment showed, ejections are not always evenly distributed over a manager’s career. I ran the numbers again, counting the manager’s season winning percentage for each ejection, to catch any deviation from .500. I found a slight one, upward. Were their ejections perfectly aligned with their seasonal performance, we would see a record for the ejection games of 420.694 wins and 413.306 losses, a .5044 win percentage. Whichever baseline we use, it’s usually a bad day when a manager gets tossed.
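As a rough sketch of that season-weighted baseline: each ejection game contributes the manager's winning percentage for that season as a fractional expected win. The function name and the sample percentages below are illustrative assumptions, not figures from the study.

```python
def expected_record(season_win_pcts):
    """Return (expected_wins, expected_losses) for a set of ejection games.

    season_win_pcts holds one entry per ejection: the manager's winning
    percentage in the season when that ejection happened.
    """
    expected_wins = sum(season_win_pcts)
    expected_losses = len(season_win_pcts) - expected_wins
    return expected_wins, expected_losses


# Hypothetical manager: two ejections in a .600 season, one in a .450 season.
wins, losses = expected_record([0.600, 0.600, 0.450])
print(f"{wins:.3f} expected wins, {losses:.3f} expected losses")

# The article's aggregate: 420.694 expected wins across 834 ejection games.
baseline_pct = 420.694 / 834
print(f"baseline winning percentage: {baseline_pct:.4f}")
```

Summing percentages this way weights each season by how many ejections it held, which is what lets a manager who gets tossed more in bad years show a lower baseline than his lifetime mark.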
There are obvious selection-bias reasons for this to be true. An ejection will very often, though not always, follow an adverse event for the ejected manager’s team that lowers its WE. Also, a manager’s temper is more likely to be inflamed when the game is going poorly. From this, there follows a chance that ejections happen more against superior opponents, which would lower the expected record in these games. I did not pursue proof, but the possibility is there.
The above chart plots the winning differential against total ejections by the managers. (I used modified win percentage expectations here: going by lifetime numbers wouldn’t change things markedly.) Observe how the data points converge toward the mean differential of -.153 as games rise. The graph counting by games managed rather than ejections looks much the same, with less compression toward the left. The outlier way at upper left is Tim Johnson, who won the lone game he was ejected from.
(Johnson, for those who don’t recall, managed the Toronto Blue Jays for one year, going 88-74 in 1998. He was then fired for having falsely claimed, among other things, to have served in combat in the Vietnam War. It is ironic that a Canadian baseball club fired him for lying about American military service. It takes all kinds to make my numbers balance.)
The manager whose team went south fastest in games he didn’t finish was Don Baylor, who was .476 lifetime (.455 modified) but just 2-10 in ejection games. A close second, not surprisingly, is Billy Martin. He was 12-34 in ejections against an overall .553 record, which dropped to .535 modified. You will recall that Martin strongly tended to get thrown out more in years when he struggled, so this is a natural result for him. Baylor had fairly strong tendencies that way also.
Four managers out of the 25 had better records when they were ejected, but in samples too small to prove very much. The largest one was Alan Trammell’s: he went 5-7 in ejection games while managing the Detroit Tigers to a galling three-year record of .383. If we use the modified manager records, though, that better record disappears. The others (whose results survive the modification) were Jeff Torborg for nine ejections, Brad Mills for seven, and Tim Johnson for one.
Vital statistics for this part of the study are in the chart below. There you can marvel at how lousy Jim Riggleman’s teams were when he got heaved, or at how Jeff Torborg small-sampled his way to a winning mark in ejection games. You can sniff at how my modifications never add up even to one win’s difference in expectations. Or you can skip it. I’ll be waiting for you at the end.
|Manager Records and Expectations in Ejection Games|
Better Off Without You?
Weaver’s Tenth Law: The job of arguing with the umpire belongs to the manager, because it won’t hurt the team if he gets thrown out of the game.
—Earl Weaver with Terry Pluto, “Weaver on Strategy”
The above quotation by Weaver, and the discussion of his umpire confrontations surrounding it in the book, was one of the spurs to my wishing to dig into the matter of manager ejections. By other things they write, we can see Weaver exaggerates a bit here for clarity, but he’s firm that losing a manager to ejection is far less injurious to a team’s prospects than losing a player. It was the example of another manager, Lou Piniella, that got me wondering whether his departure could actually be helpful.
In Chris Jaffe’s Evaluating Baseball’s Managers, there is an anecdote about an ejection Piniella had in 2007, his only one that season. His Chicago Cubs had been underperforming badly, and were about to lose their sixth straight. On a close call at third, Piniella went calculatedly ballistic, putting on a classic Sweet Lou show. The histrionics served to loosen up his players, and though they lost that game, they tore off a 29-13 streak immediately afterward, and made the playoffs.
Piniella’s act broke clubhouse tension, drew media scrutiny to himself rather than his flailing team, and gave his players an example of on-field passion to emulate. The first two are longer-term effects, but the third could plausibly spark a team during the game in which the manager gets tossed. It may be a cliche that some emotional event can turn a game around, but cliches get their start from somewhere.
So I looked at whether a manager’s team does better or worse than expected after he is thrown out. This means totaling up the Win Expectancies of the team at the point of ejection and comparing the sum to the wins and losses actually achieved. Of course, it also involves more.
There’s an obvious pitfall with using Win Expectancy as I intend to here. Win Expectancy is based on averages: a certain situation of scores, bases, and outs will produce an average likelihood of winning a game, and a specific play result will cause an average change that leads to a new average probability. This is necessary so the system isn’t overburdened with complexity, but there is a cost.
Teams, and the players comprising them, are not idealized average constructs. A 2013 game between the A’s and the Astros was not a 50-50 proposition going in, but according to WE it was. A bases-loaded, one-out situation with the cleanup hitter coming to bat is not the same as one with the eight-hitter due up. I could ramble on, but the point is made: real teams don’t behave the same as WE assumes they do.
So when a game log says that, for sake of argument, Earl Weaver’s Orioles had a 50 percent chance of winning a tied game when he was ejected after the fifth inning, the real probability was something different. It was very likely higher, since Weaver had a .583 lifetime winning percentage. To do this study properly, I had to make adjustments similar to the seasonal record tweaks above in order to discover, or at least approximate, the real percentage.
I created an adjustment to Win Expectancy at the moment of ejection based on two things, the manager’s record with the team that season and the progress of the game. I took the margin of the manager’s winning percentage above or below .500, and multiplied it by the proportion of outs remaining in the game when he was ejected, N divided by 54. I considered extra innings to be a repeat of the ninth.
This gave me a factor to represent the presumed influence of team ability on the game. I combined that with the Win Expectancy at the point of ejection, with greater or lesser effect as game odds were closer to or farther from 50-50. Without that scaling, one could get modified WEs above 100 percent or below 0, which is obviously wrong. The result is a modified, “expected” WE for the team.
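A minimal sketch of this adjustment follows. The factor itself (margin over .500 times outs remaining over 54) comes straight from the description above. The scaling toward 50-50 is not spelled out in the text, so the `2 * min(we, 1 - we)` term is an assumption chosen only because it equals 1 in an even game, fades to 0 in a blowout, and keeps the result between 0 and 1. Function names are hypothetical.

```python
TOTAL_OUTS = 54  # outs for both teams combined in a nine-inning game


def we_modifier(season_win_pct, outs_remaining):
    """Team-strength factor: margin over .500 times the share of the game left.

    For extra innings, pass the outs remaining as if the game were at the
    same point of the ninth inning.
    """
    return (season_win_pct - 0.5) * (outs_remaining / TOTAL_OUTS)


def adjusted_we(we, season_win_pct, outs_remaining):
    """Shift a raw Win Expectancy (on a 0-to-1 scale) by the team-strength factor.

    ASSUMPTION: the factor is scaled by 2 * min(we, 1 - we). This is one
    possible scaling, not necessarily the one used in the study.
    """
    scale = 2 * min(we, 1 - we)
    return we + we_modifier(season_win_pct, outs_remaining) * scale


# Tied game (WE = 0.50) after five full innings leaves 24 of 54 outs.
# Weaver's lifetime .583 stands in for a season record here.
print(f"{adjusted_we(0.50, 0.583, 24):.4f}")
```

Under this scaling a .583 team in a tied game after five innings moves from 50 percent to a little under 54 percent, while the same team with a raw WE of 2 percent barely moves at all.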
(Had I infinite time, I could have included adjustments for the strength of the opposing team, or for home-field advantage, or figured whether each out really does move the game a fixed proportion closer to its end. Infinite time is overrated. Besides, I’d probably waste it all by insulting everybody on Earth in alphabetical order.)
So I went to the ejection records at Retrosheet, and discovered one small hiccup and one large one.
Some ejections happen in the midst of plate appearances rather than between them. The balls and strikes on a batter shift winning probabilities, but Win Expectancy does not reflect this. In these situations, I had to use the WE of the last resolved play before the ejection. This was the small hiccup.
Most games in the time range I examined have the point at which a manager was ejected specifically recorded in Retrosheet’s recap of the game. But many, sometimes quite a lot, did not. Often, the recap would open with a bold-type note saying “Manager X was ejected sometime in the game.” Sometimes, that note would come later, indicating a specific inning but giving no specific point in that inning. On rare occasion, the play-by-play would say nothing at all.
These results were useless to me. For a while, I considered using the inning-only notices, taking the range of Win Expectancies during that inning. For some action-light innings during blowouts, this could have worked. In other, tighter contests, the WE could range from 11 to 83 percent in that frame alone. (This happened with Bobby Cox on June 3, 1978, in a Braves-Cubs game. The page shows it 17 to 89 percent, as it’s giving the Cubs’ WE.) I reluctantly had to give up this salvage method.
For a few of the managers I looked at, about half of their ejection games had fuzzy, or worse, ejection times. Mike Hargrove was missing 25 of his 50 dismissals, Hal McRae 10 of his 20, and Larry Rothschild 7 of his 15. I had to leave them out of my final results, for fear that some bias controlling which ejections weren’t reported would warp the numbers. For the managers who stayed, I had to tolerate drop-out rates of up to one-quarter, or I simply would have had to throw out vital portions of my data and start blindly searching for replacements with lower miss rates.
I did have the resources to recover some of the missing information. I will explain how as prologue to a bit of special pleading at the bottom of this article. For now, enough of my travails. Time to show you what this method looks like in practice.
Dale Sveum managed the Chicago Cubs for two seasons (plus 12 games with the Brewers in 2008), and suffered 10 ejections. His Cubs went 2-8 in those games. That’s obviously bad, but if the team was consistently on the short end when those ejections came, it would look less as though his ejections deflated his players rather than energizing them. This is how those ejections look.
(WE given in percentage for the manager’s team. Inning column has innings and outs. T-top; B-bottom; M-middle; E-end. WE-Mod is the modifier I add for season win percentage and game stage; WE-Adj is the resulting adjusted WE.)
|Dale Sveum’s Ejection Games|
Sveum, who went .377 in 2012 and .407 last year, gets some benefit of the doubt from those poor teams, for what little it’s worth. Even after adjustment, the Cubs still gave up close to two wins out of 10 against Win Expectancy in his absence. That underperformance rate is unmatched in this survey, though admittedly for a limited sample.
All the modifications are negative because Sveum’s season records were always under .500. A winning manager would get positive modifications.
I won’t cram this page full of game-by-game rundowns like that. The chart for overall results from the 22 managers I could use will be big enough, as you’re about to see.
Managers are listed in order of total ejections. Next to that is the number of ejection games I was able to use. WE Wins are the total wins we’d expect by adding up Win Expectancies, the next column adjusts them for winning percentage and game progress, and Real Wins are the games the team did win. I then show how many standard deviations the performances are above or below expectations, for unadjusted and then adjusted WE Wins.
|Ejection Game Records vs. Win Expectancies|
Early on, when I found Earl Weaver went 38-42 in (usable) ejection games when WE expected just 32.5, I thought I had discovered fresh proof of Weaver’s genius. Probability is not as easily impressed as I was. Two standard deviations are considered necessary for a statistically significant result. Weaver managed 1.25 SD unadjusted, and once I took the overall great record of his teams into account, it dropped well below one SD.
Indeed, none of the managers has a result greater than two SD, either positive or negative. This is actually somewhat unexpected. Two standard deviations cover about 95 percent of an expected range, so we would expect roughly one out of 20 managers to fall outside it. Instead, zero out of 22 do.
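For readers who want to reproduce a figure like those standard-deviation columns, here is one way it could be computed. This is a sketch under a stated assumption: each ejection game is treated as an independent trial whose chance of a win equals its Win Expectancy, so the variance of the win total is the sum of p(1 - p) over the games. The article does not say exactly how its standard deviations were derived, and the sample numbers are made up.

```python
import math


def sd_above_expectation(win_probs, real_wins):
    """Standard deviations by which real wins exceed the summed win probabilities.

    ASSUMPTION: games are independent trials, each won with the given
    probability (raw or adjusted WE on a 0-to-1 scale).
    """
    expected_wins = sum(win_probs)
    variance = sum(p * (1 - p) for p in win_probs)
    if variance == 0:
        raise ValueError("every game was a certainty; no spread to measure")
    return (real_wins - expected_wins) / math.sqrt(variance)


# Hypothetical manager: four ejections in dead-even games, three of them won.
print(sd_above_expectation([0.5, 0.5, 0.5, 0.5], 3))
```

Because lopsided games add little to the variance, a manager who is usually tossed during blowouts needs a smaller win surplus to reach a given number of standard deviations than one who is usually tossed in close games.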
The closest we get to an outlier is the king of ejections, Bobby Cox. Post-ejection, his teams win more than 10 games above their expectations by base WE, and more than eight and a half after adjusting for record and such. He doesn’t quite get to two standard deviations, even unadjusted, but he still has a bigger differential than all the rest, including underachievers like Sveum. If there’s any one manager who had the knack for energizing his team by getting tossed, Cox was the man, but the case isn’t, quite, proven.
Taken as a whole, this group of managers went 229-382 in ejection games I could study. We would expect 220.46 wins by unadjusted WE, and 222.34 wins by adjusted values. Ejections thus produce slightly better game results than expected, but not by a wide enough margin for statistical significance. With a bigger study, I might have enough data to find that significance—or the margin might regress and disappear. If I ever find enough hours in the day, I may try it.
Tangents: protection and experience
In Weaver on Strategy, Earl Weaver states that his ejections often had a view to keeping a player from arguing himself into getting ejected. Weaver would interpose himself, do the shouting that the player wanted to but must not, and take the ejection that otherwise would have deprived his team of a key contributor.
However, when I was poring over Weaver’s ejection games, I discovered that time and time again, Baltimore players were being ejected at the same point of the game that Weaver was. Over his 94 ejections, a player went with him 10 times. None were instances of being ejected due to warnings to the pitchers about throwing at hitters. (No huge surprise: Weaver also wrote that he deplored the beanball.)
So all this talk belied the actions. He didn’t practice what he preached! At least, that’s what I thought until I applied the same test to the other managers.
|TIMES PLAYER(S) EJECTED WITH MANAGER|
In this case, I felt safe in counting all the manager ejections, not just ones for which specific times are given. Players are far more sure to have their ejections marked down to the exact instant. If a manager went with them, that would be recorded along with theirs.
Weaver’s rate of getting thrown out with a player is actually one of the lowest of the 25. Only three managers did it less often, none with more than 10 ejections. If this group is at all representative, Weaver was effective in keeping angry players in the game.
This probably didn’t boost his career record much. He prevented around eight ejections against the average by getting ejected himself, and that wouldn’t total up to even one added win. Then again, he had to have made some interventions that didn’t require his taking one for the team, so his exertions would have accomplished more than appears on the surface.
The Piniella anecdote I offered earlier comes from late in his managerial career. He wasn’t getting tossed often then, and it seems probable that he had picked his spot to give the Cubs a boost. In this spirit, I checked for the possibility that experience as a manager might increase his team’s winning chances after he gets ejected, that he was learning better when to get heaved. I measure success by standard deviations over modified Win Expectancies.
There is a hint that overall experience helps, but only a hint. The R^2 number indicates virtually no correlation, though I was not expecting a great one with relatively few managers in the survey. Including the three I previously excluded would raise it by about 0.01, which doesn’t help much.
But am I measuring against the wrong thing? I’m inquiring about how experience helps you pick better situations for an ejection, so perhaps I should plot against ejections rather than total games.
Now the correlation is much stronger: still moderately low by statistical standards, but it certainly exists. The Bobby Cox data point at upper right hikes the trend line upward, but I cannot regard that as an outlier, not when he shows the strongest deviation from the mean over the longest career in the survey. Piniella, by the way, is third from the right, hovering just over the trendline.
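The R^2 values quoted for these two plots can be computed along the following lines. This is a plain-Python sketch of the squared Pearson correlation for a straight-line fit; the data points are invented for illustration and are not the survey's numbers.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical managers: career ejections vs. SD over modified Win Expectancy.
ejections = [5, 20, 40, 80, 150]
sd_over_expectation = [-0.4, 0.1, -0.2, 0.6, 1.5]
print(f"R^2 = {r_squared(ejections, sd_over_expectation):.3f}")
```

Swapping the x-values from total games managed to total ejections is all it takes to rerun the comparison between the two plots.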
So while I cannot definitively state, much as I’d like to, that any individual manager can spur his team to a better chance at victory with a well-gauged ejection, I will back the general case that more experienced managers can learn when to draw the umpires’ ire for the benefit of their squads, or at least figure out when getting the thumb would be counter-productive.
The Future of Ejections
I began this series with some ruminations on the possible future of manager ejections, as the new expanded replay system came into the game. That system has now been working for a couple of weeks, though it’s still in the offing as I write this. You readers, in my future, have the first practical data on this great experiment, which I back in the past currently lack.
Despite this disadvantage, I’m going to give my prediction as to what is now happening, and will happen this season and onward. Ejections are going to fall, and substantially, but the bottom will not drop out. The decrease will be less than 50 percent, possibly as little as 25 percent. I’ll predict an ejection rate in 2014 of 1.25 percent, against a 1.75 percent rate in 2013.
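As a quick arithmetic check, the predicted rates can be turned into a percentage decrease to confirm they fall inside the range just stated. The function name is only illustrative.

```python
def percent_decrease(old_rate, new_rate):
    """Percentage drop from old_rate to new_rate."""
    return (old_rate - new_rate) / old_rate * 100


# Ejection rates from the prediction: 1.75 percent in 2013, 1.25 percent in 2014.
drop = percent_decrease(1.75, 1.25)
print(f"{drop:.1f} percent fewer ejections")
```

That works out to a decline of a bit under 29 percent, toward the low end of the 25-to-50 percent window.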
Why don’t I think the rate will truly plummet, with the biggest point of contention between manager and umpire taken away? Primarily because not all of that contention is being removed, and partly because there are so many causes of ejection that don’t involve someone seeing a play differently from the umpires.
The replay regime has been greatly expanded, but it isn’t anywhere near universal. Many plays still may not be reviewed. The “neighborhood” play at second base can’t be touched, likely from players-union fears that enforcing a strict touching of second on the pivot will endanger fielders from take-out slides. Balls and strikes I mentioned two articles ago, and they are a constant source of manager-umpire friction. Other judgment calls by umpires are untouched by replay, like balks, check-swings, and the infield fly. (Remember the 2012 National League Wild Card game?)
In doing my research for this installment, I went through a lot of records for a lot of ejections, and saw the full breadth of causes for them. Balks were a constant irritant: it sometimes seemed that an umpire couldn’t call a balk without having to chuck out a manager as part of the play. Sniping over balls and strikes got plenty of managers tossed, sometimes in the middle of plate appearances. (I mentioned how that gave me problems with the WE.) There are plenty who get ejected along with their pitchers after purpose-pitch warnings. Replay isn’t doing away with those.
Managers have gotten ejected for arguing enforcement of the catcher’s box. They’ve been ejected for smoking in the dugout. Terry Bevington got tossed for putting Phil Garner in a headlock during an on-field brawl. Some managers contrive to get themselves thrown out before the first pitch, and no futuristic replay room at MLB Headquarters is going to forestall it.
Arguments and ejections are one of the spices of baseball: one not to everybody’s taste, but then what is? Those who find it unpalatable can be pleased that there will be less of it. Those for whom a good rhubarb is a perhaps guilty pleasure may rest assured that the sight of a vexed umpire running a manager out of a game will remain with us, probably as long as there are still humans doing the umpiring.
But that is an argument for another day.
Coda: An Appeal for Crowdsourcing
Earlier, I explained how gaps in the game rundowns at Retrosheet cost me some of my data. However, I was not helpless against the blank spaces.
Given a manager on the right team, or with the right opponent, I could sift through on-line newspaper archives and pin down some of the ejections for myself. Through my local library, I have access to back issues of The New York Times, covering the Yankees and Mets. Through the Google News Archive, introduced to me by friend of the site Paul Golba, I found two Milwaukee newspapers (later merged into one) that let me chase down Brewers games.
Using this narrow set of resources, I was able to get satisfactory answers for over two dozen “missing” ejections, and narrow down a few others to an inning or half-inning. I have since sent those details along to David Vincent at Retrosheet, who has recently been running a volunteer research effort on ejections. Any future researcher pursuing this admittedly arcane line will find fewer holes in the record at Retrosheet.
But it would be nice to do better, and that’s where you can come in.
Retrosheet is a volunteer organization. Much of the deep pool of data it offers comes from the time and brainpower of ordinary people chipping in to help uncover and record it all. I probably have a lot of such people to thank for what completeness in the ejection record there is, and throwing my little weight onto the scale just starts to balance the books.
The more we can do to fill out the record, the fewer gaps will trouble future (and present) researchers. For readers of The Hardball Times with the resources at their fingertips, such as access to the back issues of a big-league city’s newspaper, the opportunity is obvious.
I make this appeal with an eye to the ejection records—my first lead to potential researchers is that the 1990s, along with the first couple years of the 2000s, are surprisingly patchy on manager ejections compared to surrounding periods—but there is plenty of other work to do. I won’t presume to push anybody into the area of ejections, especially since this series is done and I won’t be profiting directly from it. If you can help Retrosheet with anything, it’s worthwhile.
Bill James defined sabermetrics as the search for objective knowledge about baseball. That begins with full and accurate information, and every little bit helps.
References and Resources
Despite the nitpicking I had to do, Retrosheet was utterly indispensable for this work, as were the Win Expectancy figures at Baseball-Reference. In fact, whenever you see me doing a statistical piece here at THT, you can take as given that I’ve used both.
Weaver on Strategy, by Weaver and Terry Pluto, is an excellent look into the mind of one of baseball’s best managers, and it may be inspiring me more in times to come. You already know what I think of Chris Jaffe’s Evaluating Baseball’s Managers, but I acknowledge it here one more time.