Very Brief Background
Last year I wrote some articles for Baseball Think Factory attempting to determine who were the best and worst managers of all time. The articles piggybacked on a terrific database put together for a SABR presentation in Toronto by Phil Birnbaum. In an attempt to quantify “luck,” Phil developed a spreadsheet looking at five different ways teams can under/overachieve. First, Phil created an algorithm based on how hitters performed in their preceding two seasons and subsequent two campaigns to determine how one should’ve done in a given year based on runs created. Second, he created a similar algorithm for pitchers based on component ERA.
Third, he compared a team’s actual runs scored versus how many runs they should’ve scored based on a runs created estimate using their actual team-wide offensive stats. Fourth, he did the same for runs allowed using Bill James’s Component ERA. Finally, he noted the difference between a team’s actual win-loss record and their Pythagoras record.
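For readers who want the fifth component spelled out, here is a minimal sketch of a Pythagorean deviation. The exponent of 2 is the classic Bill James form and is an assumption on my part, since nothing above says which exponent the database uses; the function names and the sample team are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is an assumption (the classic form); the database
    described in the article may use a different value.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_deviation(wins, losses, runs_scored, runs_allowed):
    """Actual wins minus expected wins; positive means the team overachieved."""
    games = wins + losses
    expected_wins = games * pythagorean_win_pct(runs_scored, runs_allowed)
    return wins - expected_wins


# Hypothetical team: 90-72 with 750 runs scored and 700 allowed.
deviation = pythagorean_deviation(90, 72, 750, 700)
print(round(deviation, 1))
```

A positive result means the team won more games than its run totals suggest, which is the kind of gap the fifth component tracks.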
However, while Phil initially called it his “Luck Database,” I hypothesized, and Phil agrees, luck was not the only factor involved, and that managers had an impact. In the first two components, their ability to coach and manage their players would cause the athletes to improve or worsen. Their in-game tactical decisions impacted the second pair of components. Though I wasn’t sure if managers affected their teams’ Pythagoras win-loss marks, I did believe it was possible. As research by Dave Studeman has shown, a team with a consistent offense should exceed its Pythagoras record. Managers should have some say on their team’s consistency.
Well, I ran the numbers, and sure enough, the Birnbaum database provided a good statistical record for managerial ability from 1896 to 2001. (Birnbaum built his database from another source that only had stats through 2003. Since one would need 2004 stats to see how players under/overachieved in 2002, the database ends at 2001.) Admittedly, the mathematical proof I use won’t pass the most rigorous standards for quantitative brilliance, but it passes the smell test. There’s a loose end from those articles I’d like to address: what the Birnbaum database says about managerial aging patterns.
The notion that a manager’s ability rises and falls with age shouldn’t be controversial. I don’t think anyone believes Connie Mack was at the top of his game in his last decade on the job, and everyone concedes that Paul Richards’ sad return to managing (where he distinguished himself by falling asleep on the bench) showed the old Wizard of Waxahachie had slipped. Though not usually that extreme, managerial talent is not constant across time. A manager has to be old enough to get players’ respect but still young enough to relate to his charges and have the energy to last for six months on the job.
Problems and False Starts
The Birnbaum database tells me how teams under or overachieved in five different areas. All I have to do is put in the rest of the managers, put in the ages for all of them, and, voila, I can present Definitive Information on Managerial Aging Patterns. Would it were that easy. Two problems crop up: selection sample bias, and changes in managerial aging patterns over the generations. By selection sample bias, I mean the ability of the types of men hired to manage at a certain age can warp the data. Let me use an example. Think for a second who you think is the worst manager in baseball history. Maury Wills is a popular choice for this title.
Wills was 47 when hired. Most of the worst-ever managers are going to be around that age. They’ll rarely be the first ever hired. Many times the youngest managers turn out to be the best. John McGraw was a young manager once, just as Tony La Russa was in the 1970s and Ozzie Guillen is now. Conversely, the worst-ever managers will never be among the profession’s graybeards because they’ve been weeded out. Hardly anyone in their 60s with no managerial experience gets hired, and no one will hire a 63-year-old Terry Bevington.
The further you get from the average managerial ages, the higher the overall level of raw capability should rise. In many ways games managed at a given age is the best indicator of managerial ability. That tells you what the market thinks of managerial aging, and contrary to popular opinion around these parts, I do think most baseball executives have a good handle on their jobs. In theory, aging patterns should mirror games managed.
Complicating the selection sample problem is a second concern—managerial aging patterns have changed over the years. As the game has gotten older, so have managers. An average manager nowadays would be one of the older ones for most of baseball history. Hell, do you realize John McGraw won over 1,100 games before turning 40? Player-managers used to be very common, and even the non-playing leaders were younger than they currently are. Connie Mack was younger when he won his first pennant than Ozzie Guillen was in 2005. And that came almost a decade after the Tall Tactician first filled out a lineup card.
It would make sense that managers would have gotten older because the nation’s life expectancy has grown so much. When McGraw and Mack started, the average American lived to the age of 55. By that standard, Dusty Baker would have died in mid-2004. (Insert your own punchline, Cubs fans.) Using the full Birnbaum database then complicates matters by lumping together different eras of managers who rise and fall at different times. The results from the full century of data come back scattershot, as they claim men in their early 30s make the best managers. There haven’t been any of them in a half-century.
I can solve this dilemma by narrowing the years looked at. By the Kennedy administration player-managers were a novelty as gray hairs dominated the profession. Thus to minimize the problems, those are the years on which I’ll focus.
Fun Stuff: The Results, 1960-2001
In terms of games the big years are ages 44-51. Over 40% of all games managed come from those years. If there’s anything to this aging pattern study, those ages should be good years for managers.
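The games share quoted above can be checked against the G column of the table at the end of the article. This is only a sketch: the dictionary copies that column, while the function name is my own.

```python
# Games managed at each age, 1960-2001 (G column of the table at the end).
GAMES_BY_AGE = {
    33: 161, 34: 762, 35: 1236, 36: 2201, 37: 2481, 38: 3907, 39: 4428,
    40: 4274, 41: 5603, 42: 6592, 43: 8174, 44: 9026, 45: 9262, 46: 9386,
    47: 8738, 48: 8122, 49: 9202, 50: 9012, 51: 9148, 52: 8087, 53: 7648,
    54: 6529, 55: 5283, 56: 4117, 57: 3822, 58: 3434, 59: 2739, 60: 2244,
    61: 1898, 62: 1253, 63: 1126, 64: 1442, 65: 909, 66: 568, 67: 679,
    68: 292, 69: 316,
}


def share_of_games(first_age, last_age, games_by_age=GAMES_BY_AGE):
    """Fraction of all games managed by men aged first_age..last_age inclusive."""
    total = sum(games_by_age.values())
    in_range = sum(g for age, g in games_by_age.items()
                   if first_age <= age <= last_age)
    return in_range / total


print(f"Ages 44-51: {share_of_games(44, 51):.1%} of all games")
```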
Plugging it in, the information isn’t as clear as I would’ve hoped. For a given component managers will score fine a few years in a row, fall completely apart, then rise back up just as suddenly. At times things will veer back and forth. Generally there are patterns and evident trends, but with seasons thrown in that make no damn sense. It reminds me of a criticism I’ve seen of park factors where one year’s data isn’t seen as nearly enough of a sample size. I went to an elongated version in which I combined five years’ worth of data for the managers for each component. That smooths things out considerably, but I don’t know if that’s the right way of doing it.
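The five-year combination can be sketched as a rolling sum over consecutive ages. Two assumptions to flag: I am treating "combined" as a plain sum of run totals, and the figures are the overall (All) column from the table at the end of the article. The helper name is hypothetical.

```python
# Overall runs above/below expectation by age (All column of the table).
ALL_RUNS_BY_AGE = {
    33: -38.3, 34: -152.8, 35: -579.9, 36: 105.0, 37: -44.3, 38: -276.4,
    39: -60.0, 40: -57.0, 41: 21.2, 42: 265.8, 43: 779.3, 44: -451.7,
    45: 386.0, 46: 281.9, 47: -191.8, 48: 988.7, 49: 661.9, 50: -337.3,
    51: 198.0, 52: 903.1, 53: -81.4, 54: -488.3, 55: -1116.0, 56: -37.7,
    57: 457.1, 58: -292.1, 59: -113.5, 60: 50.3, 61: 3.5, 62: 62.0,
    63: -146.1, 64: -345.2, 65: -230.7, 66: 9.0, 67: -134.7, 68: 104.7,
    69: 149.5,
}


def rolling_sums(values_by_age, window=5):
    """Sum each run of `window` consecutive ages, keyed by (first_age, last_age)."""
    ages = sorted(values_by_age)
    sums = {}
    for i in range(len(ages) - window + 1):
        span = ages[i:i + window]
        sums[(span[0], span[-1])] = round(sum(values_by_age[a] for a in span), 1)
    return sums


five_year = rolling_sums(ALL_RUNS_BY_AGE)
for span, total in five_year.items():
    print(f"ages {span[0]}-{span[1]}: {total:+.1f}")
```

Summing keeps the units in runs; dividing each window by its games would be another reasonable way to do it.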
The five-year sample could just be me trying to fit a round peg into a square hole. Ultimately I’ll focus on the single-year info with occasional reference to the larger bunchings. So much for the qualifiers—now for the main event. According to the Birnbaum database, managers peak at—drum roll, please—age 48.
Their teams exceeded expectations by +988.72 runs (which you can see below if you scroll down to the end of the article). It comes in the middle of the stretch when men are most likely to manage. Its only competition for optimal year is age 52 (+903.08), as there are only two other years over 500 runs. Like I said, these zigs and zags from year to year are a bit much. The worst? Age 55 at –1116.07, which is nearly twice as bad as the runner-up.
That zag has zags on its zig. In general, there are some clear trends. Up through age 40, managers score negatively every year except one. From ages 41 to 52, managers score positively three-fourths of the time and never have back-to-back negative years. Altogether, they are +3505 runs in 100,000 games, which works out to about a half-game better than average per season. Nothing extraordinary, but then again those dozen years account for well over half the games managed in this sample. Mighty tough to be substantially better than average under those circumstances when you’re most of the average. Then managers fall below average in six of the next seven years from 53 to 59. They’re barely above average for three straight years in the early 60s and terrible after that.
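The half-game figure for ages 41 to 52 can be reproduced with a little arithmetic. The run and game totals come from the All and G columns of the table at the end of the article; the 162-game season and the rule of thumb of roughly 10 runs per win are my assumptions, not something the database specifies.

```python
# (overall runs vs. expectation, games managed) for ages 41-52, from the table.
PRIME_YEARS = {
    41: (21.2, 5603), 42: (265.8, 6592), 43: (779.3, 8174),
    44: (-451.7, 9026), 45: (386.0, 9262), 46: (281.9, 9386),
    47: (-191.8, 8738), 48: (988.7, 8122), 49: (661.9, 9202),
    50: (-337.3, 9012), 51: (198.0, 9148), 52: (903.1, 8087),
}

SEASON_GAMES = 162   # assumption: modern schedule length
RUNS_PER_WIN = 10.0  # assumption: common rule-of-thumb conversion

total_runs = sum(runs for runs, _ in PRIME_YEARS.values())
total_games = sum(games for _, games in PRIME_YEARS.values())

# Scale the per-game edge up to a full season, then convert runs to wins.
runs_per_season = total_runs / total_games * SEASON_GAMES
wins_per_season = runs_per_season / RUNS_PER_WIN

print(f"{total_runs:+.1f} runs in {total_games} games")
print(f"{runs_per_season:+.2f} runs, or {wins_per_season:+.2f} wins, per season")
```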
Going by five-year stretches, managers are good in every stretch from ages 39-43 to 50-54 and are almost never good outside it. Managers are at their best when they’re most likely to be hired. Not particularly shocking, but nice to know. Of course, one can easily make too much of this. According to this information, the New York Yankees made an absolutely abysmal hiring all those years ago when they plucked a 58-year-old Casey Stengel to manage their club. That worked pretty well for them. Managerial aging patterns aren’t nearly as rigid as player aging ones because the mind doesn’t gain or lose abilities as regularly as the body. That’s the broad picture; how about the individual section? Well . . .
With individual hitters, managers peak a little later than they do in the overall results. They never have back-to-back positive seasons throughout their 40s. From ages 39 to 49 they are –905.27 runs. The worst single season comes at age 42, when they’re –388 runs. Things start to improve in their late 40s. After scoring +70.70 at age 49, they barely score negatively with a –15 at age 50. They then rattle off three consecutive positive seasons, including their best two at ages 51 (+408) and 53 (+400). That begins a stretch of 8 positive years out of a dozen. From 51 to 62 they are +1080 runs. Managers peak in their 50s with individual hitters. I have no idea why, but according to the Birnbaum database they hit their strides late here.
Individual pitching is the most erratic component of all. Managers follow their best year with pitchers (an exceptionally high +753 at age 52) with a 1,000-point drop. Elsewhere, their worst year (–459 at age 44) comes immediately after their third-best age. Overall, as was the case with individual hitters, they get off to a slow start. They fare poorly four out of five times from 35 to 39. They alternate low and high seasons throughout their 40s, but the lows are more extreme than the highs. In their mid-to-late 40s things begin to change. After ages 44-45 they stop having back-to-back negative seasons. Their cumulative score drifts upward after reaching its ultimate nadir of –1006 runs after the age-47 season. From ages 48 to 57 they are positive seven times while amassing over +1100 runs. They then falter and only have two positive seasons after that. Managers peak at ages 48-57 with individual pitchers. Going by the Birnbaum database, managers do a better job coaching as they age.
The team offense component is more consistent than the coaching ones. Managers have four straight negative years early on, but beginning at age 38 they score positively eight times in nine years. They’re +920 runs in that period. Remarkably, their four best ages are in the same five-year stretch: ages 42, 44, 46 and 43. Then, in a sudden zag, they have three consecutive negative seasons. Skippers score weakly in their 50s. Twice they have back-to-back positive seasons in that decade, but both times are reliant on a year that’s less than +10 runs. In general, scores are more subdued here than in the coaching areas.
Their worst season (age 49) is only –199.80, while their best season is under +250. It makes sense their aging scores for the strategy element would be less extreme because their overall scores from the previous articles were less striking as well. Managers are at their best running the offense from ages 38-46.
With team pitching, managers don’t have as youthful a peak as with the team offense, but it still comes younger than in the coaching areas. From ages 33 to 42, they are below average every year but one for a cumulative –567 runs in that period. Again, it’s less drastic than with coaching. Then managers score positively 11 times in the ensuing 15 years. Two of those negative years (ages 46-47) are stuck between the two best seasons of all: +344 at age 45 and +357 at age 48. Altogether they’re +1326 runs for that era. They then drastically fall off a log in their late 50s, scoring negatively 10 straight years (Ouch!). Managers do their best handling the staff in their late 40s and early 50s.
Going by this database, they initially are better at handling the strategic elements, and gain on the soft-skills side of the job as their tactical abilities wane.
I’ve never been certain if managers actually have an impact on Pythagoras deviations in this study. The numbers say so, and I’ve gone along with it, but I wasn’t fully sure of it. However, the aging patterns provide good evidence that managers do have some impact. Though there’s still the occasional zig, the aging patterns for this are the most consistent of any of the five components. I sure as hell didn’t expect that. Managers start poorly, scoring negatively four of the first five years.
Then, from ages 39 to 49, they’re positive all but two times. Also, managers score higher in their best seasons than they ever do in the coaching areas. Their third-best Pythagoras season, +372 at age 49, beats any year for either of the previous two components. In this 11-year stretch, they’re almost +2300 runs overall. They follow this up with nine consecutive negative seasons that add up to –2126 runs. Conveniently, the two worst seasons are back-to-back (–405 at age 53 and –429 at 54). It zig-zags a little after that stretch, but not much. In fact, they never have successive positive seasons until their late 60s, when the sample size becomes minuscule. Managers peak in this regard in their 40s and decline considerably afterwards. It’s the single most striking aging pattern of them all. I’ll be damned.
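The long Pythagoras slump described above is easy to pull out of the data mechanically. Here is a rough sketch that finds the longest run of consecutive negative ages in the Pyth column of the table at the end of the article; the helper name is hypothetical.

```python
# Pythagoras component by age (Pyth column of the table).
PYTH_BY_AGE = {
    33: -14.1, 34: -82.7, 35: -38.7, 36: 171.8, 37: -79.0, 38: -367.5,
    39: 230.4, 40: 30.4, 41: -192.0, 42: 555.3, 43: 93.2, 44: -132.9,
    45: 308.4, 46: 227.3, 47: 169.8, 48: 636.1, 49: 371.6, 50: -238.2,
    51: -288.3, 52: -122.8, 53: -404.9, 54: -429.1, 55: -225.4, 56: -233.6,
    57: -103.4, 58: -80.9, 59: 110.5, 60: -96.8, 61: -30.1, 62: 79.5,
    63: -53.5, 64: -93.6, 65: 42.5, 66: -40.1, 67: 16.9, 68: 18.3,
    69: 63.2,
}


def longest_streak(values_by_age, is_member):
    """Return the ages in the longest unbroken run whose values satisfy is_member."""
    best, current = [], []
    for age in sorted(values_by_age):
        if is_member(values_by_age[age]):
            current.append(age)
            if len(current) > len(best):
                best = list(current)
        else:
            current = []
    return best


slump = longest_streak(PYTH_BY_AGE, lambda runs: runs < 0)
slump_total = sum(PYTH_BY_AGE[age] for age in slump)
print(f"ages {slump[0]}-{slump[-1]}: {len(slump)} seasons, {slump_total:+.1f} runs")
```

Passing `lambda runs: runs > 0` instead gives the longest positive stretch for the same column.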
Here are the actual numbers for those of you interested in seeing them for yourself. Enjoy:
Age     IndH     IndP    TeamH    TeamP     Pyth       All      G
 33    -21.4     16.5      7.9    -27.1    -14.1     -38.3    161
 34      5.0     19.5    -61.9    -32.6    -82.7    -152.8    762
 35    -58.1   -256.8    -86.7   -139.7    -38.7    -579.9   1236
 36    153.6    -56.0    -56.2   -108.3    171.8     105.0   2201
 37    240.3   -192.5    -12.5     -0.7    -79.0     -44.3   2481
 38   -121.4     54.5     98.0     59.9   -367.5    -276.4   3907
 39   -115.9   -181.8    155.1   -147.8    230.4     -60.0   4428
 40    -76.5     26.2     11.6    -48.7     30.4     -57.0   4274
 41    212.5    159.0   -149.8     -8.5   -192.0      21.2   5603
 42   -388.1    -17.0    229.6   -114.0    555.3     265.8   6592
 43     37.6    226.8    155.9    265.8     93.2     779.3   8174
 44    -82.8   -459.4    213.5      9.9   -132.9    -451.7   9026
 45    -27.4   -272.2     33.1    344.2    308.4     386.0   9262
 46   -173.2    102.6    173.1    -47.9    227.3     281.9   9386
 47     89.2   -175.5    -98.9   -176.5    169.8    -191.8   8738
 48   -259.3    548.4   -115.1    178.7    636.1     988.7   8122
 49     70.7     61.7   -199.8    357.7    371.6     661.9   9202
 50    -15.7    -77.0     51.5    -57.9   -238.2    -337.3   9012
 51    408.1     63.9    -70.9     85.3   -288.3     198.0   9148
 52    137.4    753.6    -25.9    160.7   -122.8     903.1   8087
 53    399.9   -255.8      5.5    174.0   -404.9     -81.4   7648
 54   -342.6    199.9     74.8      8.8   -429.1    -488.3   6529
 55   -125.6   -476.0   -158.2   -130.9   -225.4   -1116.0   5283
 56    110.4    109.7   -128.0    103.9   -233.6     -37.7   4117
 57    356.4    183.4    -29.5     50.2   -103.4     457.1   3822
 58    -44.1    -89.9     59.6   -136.8    -80.9    -292.1   3434
 59    -89.5    -58.9      7.3    -82.9    110.5    -113.5   2739
 60    142.3    141.4    -17.6   -119.1    -96.8      50.3   2244
 61     40.9    -20.3     44.0    -31.0    -30.1       3.5   1898
 62     87.2    -84.3     39.8    -60.2     79.5      62.0   1253
 63   -195.8    221.9    -31.8    -86.9    -53.5    -146.1   1126
 64   -130.3    -58.8    -50.3    -12.2    -93.6    -345.2   1442
 65    -96.2    -77.5    -65.7    -33.9     42.5    -230.7    909
 66      1.2     -1.6     56.8     -7.4    -40.1       9.0    568
 67      8.2    -47.1    -19.8    -92.9     16.9    -134.7    679
 68    -33.3    105.7      2.3     11.8     18.3     104.7    292
 69     38.4     39.3    -27.5     36.1     63.2     149.5    316
References & Resources
By far the most important source for this article was Phil Birnbaum’s “luck” database. A description of it can be found at his website. Click on the PowerPoint display for his SABR 35 presentation in Toronto, “Were the 1994 Expos Just Lucky?”
Dave Studeman’s column Runs Per Game helped me understand how managers could have an impact on Pythagoras differentials.
My own previous articles that serve as the basis for this article can be found at the Primate Studies section of Baseball Think Factory: Part One and Part Two. Though it’s based on a different source, those interested in a study of contemporary managers may wish to look at Part Three as well.