The manager bump

by Adam Guttridge
June 26, 2009

Jim Tracy is off to an incredible start as manager of the Rockies. As of this writing the Rockies are 19-7 since Tracy took the helm, including a stretch of 18 wins in 19 games. That is quite a contrast to Clint Hurdle’s 18-28 start to the season. Now, being a lifelong Rockies fan (despite growing up in Florida and never seeing Colorado until that love was a decade old) and even briefly a Rockies employee, I had seen this movie before. In fact, the recently deposed Hurdle was once the new savior. After taking over for Buddy Bell, who had started 6-16, Hurdle led the Rockies to a 21-9 record over his first 30 games in his first managing gig.

That’s what got me thinking … is this normal? Does getting rid of the old ball and chain rejuvenate and reinvigorate players? Does a new style of management help players relax? Does the desire to please your new boss make you reach back for a little something extra? Does clearing out old clubhouse tensions and biases create a new, more productive environment? And thus, are there real energies here that are exceptionally managed by the Leylands and Torres of the world? We can’t specifically tackle any of these questions, but we can approach them in the aggregate.

This has been studied before, probably by several people. In fact, David Gassko studied this question on these pages over a year ago. Gassko found that, overall, changing managers in the middle of a season generally has no impact, but the events in Colorado made me curious. So I studied it, too.

Starting with the year 2000, I looked at all in-season managerial firings. I then tracked the record of the new manager for his first 30 games, and compared it to the record of the previous manager in the same season (regardless of how long that was, whether he managed 25 or 105 games before being fired).

There are a couple of exceptions. Mike Hargrove left the Mariners during the ’07 season for personal reasons, which I regard as outside the scope of the question here. Also, the Royals twice fired a manager, then had an interim manager for maybe two weeks, then hired a permanent manager. The “who’s the boss” situation there muddles things a bit too much for my liking, so I left those out. In all, I have 22 “firing seasons” worth of data. Here are the results:

                  W    L     %
Old Manager      593  837  .415
New Manager      297  352  .458

Wow. What’s going on here? The new guy steps in and starts winning at a much higher clip! Was Gassko wrong?

Well, hold the horses just a moment. For starters, although it seems like a huge difference in the baseball world, we’re talking about a winning percentage 4.3 percentage points higher (.458 versus .415), or an extra 1.3 wins in those first 30 contests. (Which is one thing I’ve always found fascinating about baseball: even the cellar dwellers win 45% of their games.)
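For anyone who wants to check the arithmetic, here is a minimal Python sketch using the win and loss totals from the table above. The function and variable names are my own, chosen only for illustration.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction of games played."""
    return wins / (wins + losses)

old = win_pct(593, 837)   # old managers, same season, before the firing
new = win_pct(297, 352)   # new managers, first 30 games

diff = new - old                 # gap in winning percentage
extra_wins_per_30 = diff * 30    # what that gap is worth over 30 games

print(f"old {old:.3f}  new {new:.3f}  gap {diff:.3f}")
print(f"extra wins per 30 games: {extra_wins_per_30:.1f}")
```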

Also, the 800-pound gorilla in the room is selection bias. Firings aren’t random; they don’t happen when things are going well or up to par, they happen when things are going poorly. But isn’t that just a further indictment of the previous manager, and a confirmation that a new manager has made things go … less poorly?

Not exactly. As those in the statistical community are well aware, random fluctuation can take things all over the map. You can attribute certain things to skill, and certain things to luck, and you will find plenty of arguments along that edge.

Bear with this example for a moment: if you draw a line between the top half and bottom half of hitters in a given season (using whatever statistic(s) you’d like to make that determination), the upper half has been collectively lucky over the course of that season (and the bottom half unlucky), even though offensive production is predominantly skill-driven. If you use batting average as that barometer, and the median turns out to be .265, then the hitters above that threshold have, taken as a group, been lucky.

Sure, there were .315 hitters who only hit .300 due to bad luck. But there are also a lot of .259 hitters who hit .271, and .271 hitters who hit .259. If the sample size (at-bats) were infinite, and randomness thus thrown out of the equation, every hitter would eventually settle at his baseline and luck would be equal for the top and bottom. But in the aggregate, if .265 is truly the median of the population, then there were more lucky hitters in the top half than the bottom half, and more unlucky hitters in the bottom than the top.
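A quick simulation makes the point concrete. This is only a sketch under assumptions of my own: true talent is drawn from a bell curve centered on .265 with a spread of 20 points, every hitter gets 500 at-bats, and each at-bat is an independent coin flip weighted by his true average. None of those numbers come from real data.

```python
import random
import statistics

def simulate_luck(n_hitters=2000, at_bats=500, seed=1):
    """Return the mean luck (observed minus true average) of the
    top and bottom halves, split at the median OBSERVED average."""
    rng = random.Random(seed)
    hitters = []
    for _ in range(n_hitters):
        true_avg = rng.gauss(0.265, 0.020)   # assumed talent distribution
        hits = sum(rng.random() < true_avg for _ in range(at_bats))
        hitters.append((true_avg, hits / at_bats))

    median_obs = statistics.median(obs for _, obs in hitters)
    top = [obs - true for true, obs in hitters if obs > median_obs]
    bottom = [obs - true for true, obs in hitters if obs <= median_obs]
    return statistics.mean(top), statistics.mean(bottom)

top_luck, bottom_luck = simulate_luck()
print(f"top half luck:    {top_luck:+.4f}")
print(f"bottom half luck: {bottom_luck:+.4f}")
```

Under these assumptions the top half comes out with positive average luck and the bottom half with negative, even though every hitter's results were generated from nothing but his own talent and chance.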

Well, managers from that top half don’t get fired. They don’t get fired when luck has recently been on their side; only when it has not been. And it is not at all unusual for teams to have positively massive swings in their performance, even over relatively large spans. This was the essence of Gassko’s analysis.

For example, here are 6 recent teams with seasonal records equal to the “new manager” winning clip of .458 (about a 74-88 record … and, so as not to distort things, none of these are from firing seasons). Since the average period that the ‘old managers’ still held the wheel was 65 games, I calculated the highest and lowest winning percentage of each 65-game stretch in that season. The amount of fluctuation was even greater than I anticipated.

Team        Overall    Worst 65   Best 65
05 Giants    75-87        .371      .532
03 Pirates   75-87        .381      .524
08 Reds      74-88        .397      .540
07 Nationals 73-89        .413      .508
08 Tigers    74-88        .371      .597
07 Astros    73-89        .365      .492

            Mean          .385      .532
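For readers who want to run the same kind of check on a season of their choosing, here is one way the best and worst stretches could be computed. It is a sketch, not the exact procedure used for the table: it assumes you have a team's results as a list of 1s (wins) and 0s (losses) in game order, and the function name is hypothetical.

```python
import random

def best_and_worst_stretch(results, window=65):
    """Return (worst, best) winning percentage over every run of
    `window` consecutive games. `results` holds 1 for a win and
    0 for a loss, in game order."""
    if len(results) < window:
        raise ValueError("season is shorter than the window")
    wins = sum(results[:window])
    best = worst = wins
    # Slide the window one game at a time: add the new game and
    # drop the one that just fell out the back.
    for i in range(window, len(results)):
        wins += results[i] - results[i - window]
        best = max(best, wins)
        worst = min(worst, wins)
    return worst / window, best / window

# Toy example with a 4-game window (made-up results, not a real team).
toy = [1, 0, 0, 1, 1, 1, 0, 0, 0, 0]
toy_worst, toy_best = best_and_worst_stretch(toy, window=4)

# A simulated 162-game season for a team that wins each game
# with probability .458, independent of everything else.
rng = random.Random(2009)
season = [1 if rng.random() < 0.458 else 0 for _ in range(162)]
sim_worst, sim_best = best_and_worst_stretch(season)
print(f"simulated season: worst {sim_worst:.3f}, best {sim_best:.3f}")
```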

The ’08 Tigers played .371 ball for 65 games in ’08, and also .597 ball for 65 games. So, for 40% of the season they were the ’08 Nationals, and for a different 40% of the season they were the ’08 Rays. Incredible. And if you assume that managers are most likely to be fired when their teams are in the midst of a poor 65-game stretch, you can see that a team’s performance might improve after he was fired not because he was fired, but just because.
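That "just because" can be put to a rough test with a simulation. The sketch below is built on assumptions of my own, not on the firing data or on Gassko's method: every team is a true .458 club all season long, every game is an independent coin flip, and a manager is "fired" after 65 games only if his winning percentage is at or below .415. The new manager changes nothing about the team.

```python
import random

def simulate_firings(n_seasons=10000, talent=0.458, fire_game=65,
                     fire_at_or_below=0.415, follow_up=30, seed=7):
    """Return (old manager pct, new manager pct) across all simulated
    firings, with team talent held constant the whole time."""
    rng = random.Random(seed)
    before_w = before_g = after_w = after_g = 0
    for _ in range(n_seasons):
        wins = sum(rng.random() < talent for _ in range(fire_game))
        if wins / fire_game > fire_at_or_below:
            continue   # record is good enough; manager keeps his job
        before_w += wins
        before_g += fire_game
        # The next 30 games are played by the very same .458 team.
        after_w += sum(rng.random() < talent for _ in range(follow_up))
        after_g += follow_up
    return before_w / before_g, after_w / after_g

old_pct, new_pct = simulate_firings()
print(f"old manager: {old_pct:.3f}")
print(f"new manager: {new_pct:.3f}")
```

In this setup the "new manager" record lands near .458 and the "old manager" record sits below .415, so a bump appears even though the managers have no effect at all. The firing rule is deliberately crude, but it shows how selection alone can produce the pattern.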

Does that mean there’s nothing to managers’ ability to maximize performance via ‘chemistry’ functions? I mean, the sour lemons were replaced by handpicked fresh faces specifically for that purpose. If there was a place it would show up, it would be here, no?

Well, I can’t exactly say that. To paraphrase the well-known professional skeptic James Randi, in his dealings with the likes of Uri Geller and Sylvia Browne: I can’t prove that these abilities do not exist, but I can’t find any hard evidence that they do, at least to any meaningful degree.
