In a recent Freakonomics blog post, Steven Levitt wondered whether “our sabermetric friends have done any research on whether firing a manager mid-season helps a baseball team?” Since Dr. Levitt is a mini-hero of mine, and since I suppose that in turn I might count myself as one of those “sabermetric friends” (though I have never personally corresponded with him), I thought I’d look into the question.
The simple approach would be to look at every team with exactly two managers in a season and see whether it improved under the second manager. There have been 326 such teams in baseball history, and they have averaged 72.7 wins per 162 games under their first manager and 75.8 under the second.
Problem solved, time to go home: Firing your manager improves your record. Okay, not quite. The problem, as Dr. Levitt points out, is that “the only teams that will fire their managers are those that have been performing worse than expected; as such, they might improve simply because of mean reversion.”
Indeed, statistics tell us that a team’s record is part talent and part luck, and that teams with bad records generally got there in part because they were unlucky. So if you pick any group of below-.500 teams at any point during the season, you’ll find that they perform closer to .500 the rest of the way. In statistics, this is known as “regression to the mean.”
So what can we do to combat that issue? Well, we can add a certain number of games of average performance to a team’s record with its first manager to estimate its true talent by regressing to the mean. The more information (that is, games played) we have, the less the team’s record gets regressed.
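As a minimal sketch of that idea, here is the padding step in Python. The function name, the two example teams, and the choice of 50 added games are all hypothetical illustrations, not figures from the data behind this article.

```python
def regress_record(wins, games, added_games):
    """Estimate true-talent winning percentage by padding a record
    with `added_games` games of .500 (average) play."""
    return (wins + 0.5 * added_games) / (games + added_games)


# Two hypothetical teams with the same raw .400 winning percentage,
# each padded with an illustrative 50 games of average play.
short_sample = regress_record(8, 20, 50)    # 8-12 after 20 games
long_sample = regress_record(40, 100, 50)   # 40-60 after 100 games

print(round(short_sample, 3))
print(round(long_sample, 3))
```

The team with only 20 games played gets pulled much closer to .500 than the team with 100 games, which is the "more information, less regression" behavior described above.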
But how much do we regress? We can use something called the binomial distribution to find out. The binomial distribution tells us what kind of spread we expect in a statistic when there are only two types of outcomes, say wins and losses.
So in this case, given that the average season in baseball history has lasted 135 games, we can estimate that the standard deviation in winning percentage due to random chance would be .043; in other words, if baseball games were decided only by luck, about two-thirds of all teams would have a winning percentage between .457 and .543. Instead, the observed standard deviation is .088, meaning that there is a lot more than luck involved in deciding the outcome of baseball games.
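The luck figure can be checked with a few lines of Python. This is a sketch under two assumptions that are mine rather than the article's: every game is treated as a fair coin flip, and the observed variance is treated as the sum of a talent part and a luck part. The function name is hypothetical.

```python
import math


def luck_sd(games, p=0.5):
    """Binomial standard deviation of winning percentage over `games`
    games when each game is won with probability p."""
    return math.sqrt(p * (1 - p) / games)


sd_luck = luck_sd(135)
print(round(sd_luck, 3))  # 0.043

# Assumed decomposition: observed variance = talent variance + luck variance.
sd_observed = 0.088
sd_talent = math.sqrt(sd_observed ** 2 - sd_luck ** 2)
print(round(sd_talent, 3))
```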
For the variance (which is the standard deviation squared) due to luck to equal half of the overall variance, baseball teams would need to play just 33 games a year. Therefore, we need to add 33 games’ worth of average play to each team’s record to regress it to the mean. When we do that, the average team with two managers in a season averages 75.4 wins per 162 games before firing the first, just .4 fewer than it does after. Seems like a pretty small effect.
But can we be sure that we are regressing to the mean correctly? Tom Tango, for example, has suggested regressing to the mean by adding 69 games of average performance to a team’s record. If we do that, we find that the average team wins 76.7 games per 162 before firing its manager, .9 better than it does after.
The problem is that the spread in talent varies markedly from year to year. In 1872, one standard deviation in “true” winning percentage (that is, taking the randomness out of it) was .278; in 2007, it was .042. Therefore, we need to regress 2007 teams to the mean a lot more than teams from 1872; to be exact, we add on 145 games of average performance to the records of 2007 teams, but just three to 1872 teams.
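One way to sketch the year-by-year calculation is to find the number of games at which the luck variance, .25 divided by games, equals that year's talent variance. The function name is hypothetical, and the inputs are the rounded standard deviations quoted above, so the 2007 result lands near 142 rather than exactly 145; I assume the gap comes from rounding in the published figures.

```python
def games_to_add(talent_sd, p=0.5):
    """Number of games n at which luck variance p(1-p)/n equals the
    talent variance, i.e. the games of average play to add."""
    return p * (1 - p) / talent_sd ** 2


print(round(games_to_add(0.278)))  # 3   (1872)
print(round(games_to_add(0.042)))  # 142 (2007, from the rounded SD)
```

The smaller the spread in true talent, the more a given record is explained by luck, and the more games of average play get added.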
So let’s repeat that process for every year and re-calculate the regressed records. What happens? We find that, regressed, teams average 76.1 wins per 162 games before firing their managers, versus 75.8 after. That difference is not significant.
In other words, teams don’t seem to benefit at all from hiring a new manager mid-season; you can count me unsurprised.