In my last article, I used the “delta method” to analyze the relative performance of ballplayers as they age. Let’s address a major issue in this approach and attempt to correct for it. The major issue in question is known as…
As many of you know, and JC points out in one of his responses, the “delta method” suffers from something called “survivor bias.” What is “survivor bias” and how does it affect the aging curve and peak age of offensive performance? At every age, there is a class of players who play so badly in any one season that they don’t play at all (or very little, which I call “partial survivor bias”) the following year. (Obviously there are also players who don’t play poorly in Year I but also don’t play in Year II.)
These players tend to be unlucky (and usually untalented of course, at least at that point in their careers) in Year I. Therefore the rest of the players who do play in Year I and the following year tend to have been lucky in Year I. This is survivor bias. Any player who “survives” to play the following year, especially if he racks up lots of plate appearances, whether they are good, bad, or indifferent players, true-talent-wise, will tend to have gotten lucky in Year I. In Year II, they will revert to their true talent level and will thus show a “false decline” from the one year to the next.
Obviously players who do not survive from one year to the next are not in our sample. If they were (if they were allowed to play the next year), those players would tend to show a “false” improvement and thus would “balance out” those players showing a false decline.
If you are confused, here is a simplification of the process that should help you understand the principle. Say there are 100 marginal players in MLB at any one age, and their true talent batting average is .220. Let’s say that they get 200 at-bats in Year I and half of them end up with an average of .180, and the other half, .260. In 200 AB, their “sample” BA will be all over the place (the standard deviation of batting average in 200 AB is 29 points) but centered on .220, so this is a plausible, albeit simplified, scenario. And let’s say that only those 50 players who hit .260 were allowed to play the next year. After all, if a marginal (or old) player, talent-wise, has a very unlucky season, he is often benched or retires the following year.
So now we have 50 players remaining who hit .260 in Year I and are allowed to play and amass another 200 AB in Year II. What will they hit, on the average, in Year II, if they neither improved nor declined in true talent (say they were around age 27)? .220. After all, we already said that they were true talent .220 hitters. But what would the delta method say about them? It would say that they declined by 40 points (.260 in Year I and .220 in Year II)!
If all 100 players were allowed to play the next year, and they all hit .220, as they should, then half would decline by 40 points, and half would improve by 40 points (.180 in Year I and .220 in Year II), and the net change for all 100 players according to the delta method would be zero, as it should be. That is why survivor bias produces more decline (and less improvement) than it should at every age interval.
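The toy example above can be checked with a few lines of Python. This is only a sketch of the simplified scenario (exactly half the players at .180 and half at .260, everyone at .220 in Year II); the names `mean_delta_points`, `all_players` and `survivors` are made up for illustration.

```python
import math

TRUE_BA = 0.220
AT_BATS = 200

# Binomial standard deviation of batting average over 200 AB.
sd = math.sqrt(TRUE_BA * (1 - TRUE_BA) / AT_BATS)

# Simplified Year I outcomes: 50 unlucky hitters at .180, 50 lucky ones at .260.
year1 = [0.180] * 50 + [0.260] * 50
# With unchanged true talent, everyone is expected to hit .220 in Year II.
year2 = [TRUE_BA] * 100


def mean_delta_points(pairs):
    """Average (Year II minus Year I) change, in points of batting average."""
    pairs = list(pairs)
    return 1000 * sum(y2 - y1 for y1, y2 in pairs) / len(pairs)


all_players = list(zip(year1, year2))
# Only the hitters who beat their true talent in Year I get a Year II.
survivors = [(y1, y2) for y1, y2 in all_players if y1 > TRUE_BA]

print(round(sd * 1000))                        # SD in points: 29
print(round(mean_delta_points(all_players)))   # all 100 players: 0
print(round(mean_delta_points(survivors)))     # survivors only: -40
```

The 40-point "decline" appears only because the unlucky half was dropped from the sample, not because anyone's talent changed.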
Also keep in mind that survivor bias is merely a subset of “partial survivor bias,” whereby some players who are unlucky in one year get limited playing time the following year. When we weight by the lesser of the two PA or the average of the two PA, partial survivor bias will also create a “false decline” for all age groups. In the above example, if the true .220 batters who hit .180 in Year I were allowed to get 50 PA in Year II while the remaining batters were allowed to rack up 200 PA in Year II, we would have the following average “delta,” if we used the average of the two PA for the weighting:
((-40 x 200) + (40 x 125)) / 325
That is a nine point “false” decline.
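As a rough check of that arithmetic, here is a small sketch that applies both weightings mentioned above to the same two groups. It assumes the two groups are the same size (50 players each), so group-level weights are enough; the function and variable names are illustrative only.

```python
# Each group: (delta in BA points, Year I PA, Year II PA).
groups = [
    (-40, 200, 200),  # lucky in Year I (.260), full playing time in Year II
    (+40, 200, 50),   # unlucky in Year I (.180), only 50 PA in Year II
]


def average_pa(pa1, pa2):
    return (pa1 + pa2) / 2


def lesser_pa(pa1, pa2):
    return min(pa1, pa2)


def weighted_delta(groups, weight_fn):
    """Weighted average delta across equally sized groups."""
    total_weight = sum(weight_fn(pa1, pa2) for _, pa1, pa2 in groups)
    weighted_sum = sum(d * weight_fn(pa1, pa2) for d, pa1, pa2 in groups)
    return weighted_sum / total_weight


print(round(weighted_delta(groups, average_pa), 1))  # -9.2
print(round(weighted_delta(groups, lesser_pa), 1))   # -24.0
```

The lesser-of-the-two-PA figure is not worked out in the text; under the same assumptions it comes to a 24-point false decline, so in this toy case that weighting is hit harder by partial survivor bias.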
Any kind of correlation between performance and future playing time, either within a season or from season to season, is going to cause problems with our results when using the delta method to compute aging curves. The most dangerous and prevalent kind of bias is partial or full survivor bias, which causes a false decline at all age intervals.
How large is the class of players who do not survive from one season to another? From age 20 to 24, for every hundred players who play back-to-back seasons (and are included in our sample), there are seven more who play for one year and don’t come back at all the following season (and are not in our sample). From age 25 to 28, it is 11 players who do not survive for every 100 who do. From age 28 to 35, it is 16/100 and after age 35, it is 30/100! Almost 25 percent of all players after the age of 35 have played their last year (or at least don’t play the very next year) in MLB.
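The "almost 25 percent" figure follows from converting "non-survivors per 100 survivors" into a share of all players at that age. A quick sketch, with the age-group labels chosen here only for display:

```python
# Non-survivors per 100 survivors, as quoted in the text.
non_survivors_per_100 = {"20-24": 7, "25-28": 11, "28-35": 16, "after 35": 30}


def share_not_returning(per_100):
    """Fraction of all players at that age who do not play the next year."""
    return per_100 / (100 + per_100)


for ages, n in non_survivors_per_100.items():
    print(ages, f"{share_not_returning(n):.1%}")
# 20-24 6.5%
# 25-28 9.9%
# 28-35 13.8%
# after 35 23.1%
```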
How bad is their performance in Year I, such that they are not permitted to play the following year? From age 20 to 24, these non-survivors average -34 runs per 500 PA in Year I. It is no wonder that there is no Year II for these players. Of course if there were a Year II, they would likely hit quite a bit better than -34, due to regression towards the mean. A player who is allowed to get at least a few MLB PA is likely not a true -34 hitter.
At age 32-35, players averaged around -18 runs in their likely final years. Obviously these were probably very good players in their prime and in their careers (otherwise they would not have made it that far). Again, if they were allowed to play one more year, even at such an advanced age, on the average (not everyone) they probably would have played a little better in Year II than in Year I (although probably not so much as with the younger players).
Keep in mind that it is not the absence of these players that is causing a problem. It is the fact that the remaining players—all the players in our sample—tend to be (on the average) slightly lucky in Year I and thus will show a false decline in Year II, over and above the real aging-related increase or decrease from one year to the next.
Here are some data on all players who played at age X but not at age X+1. For each interval, for example, 25/26, the player played at the first age but not at the second.
Table IV: Players who played in Year I but not in Year II (1950 to 2008)
Age      Plyrs  Year 1  Yr 1 Lwts  Career  Last 3   Career    Last 3 Yrs
Couplet         PA      /500 PA    PA      Yrs PA   Lwts/500  Lwts/500
20/21    19     45      -32        46      46       -32       -32
21/22    24     14      -42        26      26       -34       -34
22/23    53     43      -29        76      76       -27       -27
23/24    90     49      -37        84      84       -32       -33
24/25    178    41      -25        94      88       -25       -25
25/26    231    55      -28        130     117      -23       -23
26/27    321    52      -26        183     144      -20       -21
27/28    324    63      -26        315     208      -18       -20
28/29    339    62      -25        405     235      -15       -17
29/30    320    71      -23        637     259      -12       -16
30/31    283    80      -20        961     360      -11       -14
31/32    274    77      -25        1351    373      -10       -15
32/33    251    92      -19        1724    433      -8        -13
33/34    246    88      -20        2153    449      -8        -12
34/35    232    102     -18        2930    505      -5        -11
35/36    210    106     -14        3323    508      -1        -8
36/37    150    107     -17        4077    553      -1        -7
37/38    127    119     -12        4597    586      3         -4
38/39    97     148     -13        5398    664      2         -4
39/40    69     128     -11        5329    594      3         -5
As you can see, in almost every case, even for the younger players, their last year (Year I) performance is worse than the average of their last three years (including Year I). Presumably if they were to play the following year, their numbers would improve, after regressing their last three years toward some mean (sort of a rough Marcel projection).
So how can we account for the fact that these “unlucky” players would have been in our sample save for the fact they weren’t allowed (or able) to play the following year? We can assume that they got some playing time in Year II, pencil in a conservative projection for them, and then re-do (with these players now included in our sample) one of our delta methods.
I say a conservative projection because the fact that they were not allowed to play in Year II, for whatever reason, implies that they were not very good players, true talent-wise, and we certainly don’t want to regress their career or last three-year numbers toward that of a league average player. Clearly these players do not belong to the population of the typical MLB player, per se (even though technically they are MLB players of course).
Here is the same chart as above. This time I added a column, which is the average Year II projection for the pool of players at each age group. The projection is their last three years lwts per 500 PA, weighted by year (3/4/5) added to 500 PA of league average lwts for that age minus five. In other words, I am regressing their last three years lwts (weighted) toward five runs lower than a league average player for that age.
That “minus five runs” is the downgrade I used to generate a conservative projection. The final step in the projection is to add in an age adjustment. I realize that we don’t know the exact adjustment until we re-run the data, but it is not that critical to get it right.
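A minimal sketch of a projection along these lines follows. Several details are assumptions, not taken from the text: that the 3/4/5 weights multiply each season's PA with the most recent season weighted 5 (as in a Marcel-style projection), that rates are lwts per 500 PA, and that the age adjustment is simply added at the end. The function name and parameters are hypothetical.

```python
def project_year2(seasons, league_avg, age_adjustment=0.0,
                  penalty=5.0, regression_pa=500.0):
    """Conservative Year II projection in lwts per 500 PA.

    seasons: up to three (pa, lwts_per_500) tuples, oldest first.
    league_avg: league-average lwts per 500 PA for the player's age.
    """
    seasons = list(seasons)[-3:]
    # Assumption: most recent season gets weight 5, then 4, then 3.
    weights = (3, 4, 5)[-len(seasons):] if seasons else ()
    weighted_pa = sum(w * pa for w, (pa, _) in zip(weights, seasons))
    weighted_runs = sum(w * pa * rate for w, (pa, rate) in zip(weights, seasons))
    # Regress toward five runs below league average for that age.
    target = league_avg - penalty
    regressed = (weighted_runs + regression_pa * target) / (weighted_pa + regression_pa)
    return regressed + age_adjustment


# A made-up marginal player with three short seasons.
example = [(100, -30.0), (100, -20.0), (100, -10.0)]
print(round(project_year2(example, league_avg=0.0), 1))  # -14.4
```

With very little playing time the projection sits close to the regression target (league average minus five, plus the age adjustment), which is the intended conservative behavior.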
Table V: Players who played in Year I but not in Year II, including their projection for Year II
Age      Plyrs  Year 1  Yr 1 Lwts  Career  Last 3   Career    Last 3 Yrs  Yr II
Couplet         PA      /500 PA    PA      Yrs PA   Lwts/500  Lwts/500    Projection
20/21    19     45      -32        46      46       -32       -32         -20
21/22    24     14      -42        26      26       -34       -34         -13
22/23    53     43      -29        76      76       -27       -27         -14
23/24    90     49      -37        84      84       -32       -33         -14
24/25    178    41      -25        94      88       -25       -25         -10
25/26    231    55      -28        130     117      -23       -23         -15
26/27    321    52      -26        183     144      -20       -21         -13
27/28    324    63      -26        315     208      -18       -20         -12
28/29    339    62      -25        405     235      -15       -17         -13
29/30    320    71      -23        637     259      -12       -16         -13
30/31    283    80      -20        961     360      -11       -14         -15
31/32    274    77      -25        1351    373      -10       -15         -16
32/33    251    92      -19        1724    433      -8        -13         -13
33/34    246    88      -20        2153    449      -8        -12         -15
34/35    232    102     -18        2930    505      -5        -11         -14
35/36    210    106     -14        3323    508      -1        -8          -12
36/37    150    107     -17        4077    553      -1        -7          -12
37/38    127    119     -12        4597    586      3         -4          -8
38/39    97     148     -13        5398    664      2         -4          -12
39/40    69     128     -11        5329    594      3         -5          -13
Even though the projections are in all cases better than their last year (Year I) of performance, they are still close to replacement level (with an exception at 37/38 for some reason), so you can see why MLB teams were not too enthusiastic about allowing them to play (at the major league level) anymore.
So now that we have taken care of our survivor bias problem, all we have to do is redo our aging curve by using one of our delta methods. Before we look at those numbers, you may be wondering, “What am I going to use for the number of PA in Year II for those players who didn’t actually play in Year II?” I am going to use their Year I PA, though not for any compelling reason. It generally is going to be a small number, as you can see from column 3 (the average number of PA in Year I) in the above chart.
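The correction can be sketched as follows: give each non-survivor a "phantom" Year II equal to his projection, with Year II PA set to his Year I PA, then run the delta method weighted by the average of the two PA. The record layout, function names and the two sample players are invented for illustration; real input would be one record per player per age couplet.

```python
def add_phantom_year2(player, projection):
    """Give a non-survivor a phantom Year II: projected rate, Year I PA."""
    filled = dict(player)
    if filled["pa2"] == 0:
        filled["lwts2"] = projection
        filled["pa2"] = filled["pa1"]
    return filled


def delta_avg_pa(players):
    """Delta method, weighted by the average of the two seasons' PA."""
    players = [p for p in players if p["pa2"] > 0]  # need both seasons
    weights = [(p["pa1"] + p["pa2"]) / 2 for p in players]
    deltas = [p["lwts2"] - p["lwts1"] for p in players]
    return sum(w * d for w, d in zip(weights, deltas)) / sum(weights)


players = [
    {"pa1": 400, "lwts1": 5.0, "pa2": 400, "lwts2": 0.0},   # survivor
    {"pa1": 50, "lwts1": -30.0, "pa2": 0, "lwts2": None},   # non-survivor
]

uncorrected = delta_avg_pa(players)
corrected = delta_avg_pa([add_phantom_year2(p, projection=-14.0) for p in players])
print(round(uncorrected, 1))  # -5.0
print(round(corrected, 1))    # -2.7
```

Because non-survivors usually have few Year I PA, their phantom seasons get small weights, which is consistent with the correction moving the curve only modestly.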
Table VI: Aging data, including non-survivors (1950 to 2008)
Age               Average Change  Cumulative
Couplet  Players  in LW           Difference
20/21    161      -4.1            -32.3
21/22    390      14.2            -18.1
22/23    780      5.3             -12.8
23/24    1314     6.1             -6.7
24/25    1954     2.8             -3.9
25/26    2335     2.1             -1.1
26/27    2461     1.1             0.0
27/28    2412     0.0             0.0
28/29    2292     -1.9            -1.9
29/30    2095     -1.3            -3.2
30/31    1866     -1.8            -5.0
31/32    1631     -2.3            -7.3
32/33    1393     -2.3            -9.6
33/34    1169     -3.8            -13.4
34/35    965      -4.1            -17.5
35/36    757      -4.4            -21.9
36/37    540      -4.7            -26.6
37/38    389      -3.9            -30.5
38/39    268      -6.8            -37.3
39/40    170      -4.9            -42.2
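The cumulative column is a running sum of the average changes, expressed relative to the peak. As a small illustration, this sketch chains the post-peak changes from Table VI (28/29 onward); the variable names are illustrative.

```python
from itertools import accumulate

# Average change in LW for each couplet after the 27/28 peak (Table VI).
post_peak_changes = {
    "28/29": -1.9, "29/30": -1.3, "30/31": -1.8, "31/32": -2.3,
    "32/33": -2.3, "33/34": -3.8, "34/35": -4.1, "35/36": -4.4,
    "36/37": -4.7, "37/38": -3.9, "38/39": -6.8, "39/40": -4.9,
}

# Running total of the changes, i.e. runs lost relative to the peak.
cumulative = dict(zip(
    post_peak_changes,
    (round(total, 1) for total in accumulate(post_peak_changes.values())),
))

print(cumulative["33/34"])  # -13.4
print(cumulative["39/40"])  # -42.2
```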
In the graph below, you can see the difference between the aging curves when we do not include the non-survivors, and when we do; i.e., with and without a correction for survivor bias.
Chart III: Aging curve with and without non-survivors (1950-2008)
Once we account for survivor bias by creating a “phantom” Year II for the non-survivors in each of the age intervals, the average aging curve shifts slightly to the right and the decline after the peak is a little flatter. There isn’t a large difference between the two aging curves, however.
Comparing eras after correcting for survivor bias
Let’s compare again the pre- and post-1980 eras, after accounting for survivor bias. Without that adjustment, we found a slightly later peak and a much more gradual post-peak decline in the modern era.
Chart IV: Comparing the aging curves (with non-survivors) in two eras (1950-1979 and 1980-2008)
We find the same thing when comparing the two eras after adjusting for survivor bias—the curve is shifted slightly to the right, and the post-peak decline is much less steep in the modern era. So it appears that advances in medicine, better training, higher salaries and perhaps PED use in the modern era change peak age only slightly, but significantly affect post-peak decline, keeping players in the major leagues far longer than in previous eras.
In the post-1980 era, a player at age 38 is expected to have lost 23 runs off his peak. Prior to that, players on average lost 46 runs off their peak by the time they reached the age of 38. Clearly, in the old days only superstars remained in baseball into their mid to late 30s and beyond.
Players with at least 10 years and 5,000 career PA
What if we do the exact same thing as above (use the delta method to construct an aging curve, and adjust for survivor bias), but we use only players who have played at least 10 years in the majors with at least 5,000 career PA (10/5,000), similar to the sample that JC used in his study? We might expect a later peak age and perhaps a more gradual post-peak decline.
Presumably, when we use the delta method with this kind of a sample (players with long careers), our results are more representative of the “average aging curve” for this type of player, as we are not “piecing together” careers of various lengths, as we are when we include all players. In other words, our results should look something like JC’s.
Table VII: Players who have played for at least 10 years with at least 5,000 career PA (1950-2008)
Age               Average Change  Cumulative
Couplet  Players  in LW           Difference
20/21    65       -0.8            -42.8
21/22    146      18.9            -23.9
22/23    247      7.1             -16.8
23/24    344      6.7             -10.1
24/25    424      3.1             -7.0
25/26    452      3.1             -3.9
26/27    465      3.1             -0.8
27/28    476      0.8             0.0
28/29    479      -0.6            -0.6
29/30    478      0.1             -0.5
30/31    478      -0.7            -1.2
31/32    472      -0.5            -1.7
32/33    459      -1.6            -3.3
33/34    439      -3.0            -6.3
34/35    411      -3.4            -9.7
35/36    367      -2.9            -12.5
36/37    306      -4.2            -16.7
37/38    236      -3.3            -20.0
38/39    172      -5.7            -25.7
39/40    116      -3.4            -29.1
Let’s compare the aging curve for these players to that of all players, again as determined by the delta method, weighted by the average of the two PA.
Chart V: Comparing the aging curves of all players and only those with at least 10 years and 5,000 PA (1950-2008)
Indeed, once we restrict our players to those with at least 10 years and 5,000 PA, as in JC’s sample, the aging curve looks quite different. While technically the peak age is 28, there is a plateau from 27 to 30. After that, there is a slight decline until age 32 or 33 and a gradual decline thereafter.
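One way to make "peak" and "plateau" concrete is to read them off the cumulative column of Table VII. The sketch below assumes two things not stated in the text: that each couplet's cumulative value describes the second age of the couplet, and that "plateau" means within one run of the peak. The `plateau` helper is hypothetical.

```python
# Cumulative difference from peak (Table VII), keyed by the second age
# of each couplet, e.g. the 27/28 row is stored under age 28.
cumulative_by_age = {
    26: -3.9, 27: -0.8, 28: 0.0, 29: -0.6,
    30: -0.5, 31: -1.2, 32: -1.7, 33: -3.3,
}


def plateau(cumulative, tolerance=1.0):
    """Return the peak age and all ages within `tolerance` runs of the peak."""
    peak_age = max(cumulative, key=cumulative.get)
    peak = cumulative[peak_age]
    ages = sorted(age for age, value in cumulative.items()
                  if peak - value <= tolerance)
    return peak_age, ages


peak_age, plateau_ages = plateau(cumulative_by_age)
print(peak_age)      # 28
print(plateau_ages)  # [27, 28, 29, 30]
```

A different tolerance would widen or narrow the plateau, so the 27-to-30 range is a judgment call rather than a sharp boundary.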
Comparing eras for players with 10 years and 5,000 PA
What if we split our sample again into the two eras, pre- and post-1980? We are looking only at players with at least 10 years and 5,000 PA.
Chart VI: Comparing the aging curves of two eras, for only those players with at least 10 years and 5,000 PA (1950-1979 and 1980-2008)
In the pre-1980 era, for players who have at least 10 years and 5,000 PA in MLB, the aging curve is pretty symmetrical around a plateau stretching from around 27 or 28 to 32. From 21 to 27 or 28 is almost a mirror image of 32 to 38. In the modern era, the player with a long and prosperous career peaks at 30, stays relatively stable until age 33, declines gradually (around two or three runs per year) after that until age 38, and then declines by around five runs per year after that.
Players with a minimum of 1,000 career PA
Finally, what if we reduce the requirements to 1,000 minimum career PA with no minimum number of seasons? We essentially are eliminating all those players who come up for a cup of coffee or two, or are career September call-ups or ultra part-timers only. JC mentions that when he does this, he comes up with the same results as with the more restrictive sample (10/5,000).
The following chart compares all players to those with 1,000 career PA or more to those with a minimum of 10 years and 5,000 career PA.
Chart VII: Comparing the aging curves of all players, those with at least 10 years and 5,000 PA, and those with 1,000 career PA or more (1950-2008)
The aging curve of the sample of players who have at least 1,000 career PA is almost exactly the same as that of all players. This is not too surprising, as even a part-time player needs only four or five years to accumulate 1,000 PA. Given the data above, I am skeptical of JC’s claim that he got the same results (using his “least squares” method) when he increased his sample size by requiring only 1,000 career PA (and no minimum number of years).
Of course, it is likely that as you increase your requirements and include only those players with longer and longer careers and more total PA, you will see the aging curves shift to the right and flatten out after their peaks. In fact, if I bump the requirement to at least five years and 2,000 PA, the curve moves slightly to the right (with a definite peak at 28) with a more gradual decline after age 31.
Summary and conclusions
JC Bradbury, a Ph.D. econometrician and college professor who has written several books on the economics of baseball and hosts a website/blog of the same name, recently published a scholarly paper in which he concluded that a certain sample of players peak offensively at age 29, and decline gradually after that—among other things. He appears to have generalized that conclusion, at least implicitly, to all players and not just those whose careers are similar to those in his research sample. He goes on to claim that when he reduced the requirements to only 1,000 career PA and no minimum number of seasons, his results were the same. I am skeptical of that claim based on the research that I have done using the “delta method” corrected for survivor bias.
The delta method is one way to construct an aging curve for offensive performance (JC uses another method: plotting each player’s career trajectory and then using a least squares/best fit model to come up with a composite trajectory), although it is debatable which way is best and which method answers what questions. The question, “What does the aging curve of the average MLB player look like and what is the peak age of that player?” is not an unambiguous question.
It is also not clear what weighting to use for the performance differentials when using the delta method. I have explained three methods—weighting everyone equally, using the “lesser of the two PA,” and using the “average of the two PA.” I think that the latter two are the best methods, and I prefer the last one, although there is very little difference between the two.
I have also explained the problem of survivor bias, an inherent defect in the delta method, which is that the pool of players who see the light of day at the end of a season (and live to play another day the following year) tend to have gotten lucky in Year I and will see a “false” drop in Year II even if their true talent were to remain the same. This survivor bias will tend to push down the overall peak age and magnify the decrease in performance (or mitigate the increase) at all age intervals.
One way to account for and overcome survivor bias is to imagine that these players actually played another year and that their performance in Year II was at their true talent level. We arrive at that true talent level by doing a basic Marcel-like projection. Since these players are truly marginal players (and/or at the end of their careers), when we do the regression toward the mean, we use a very conservative mean (five runs worse than we would ordinarily use).
When all is said and done, and we use the delta method and weight each interval by the average of the two PA, and we account for survivor bias, we pretty much find that which we already knew—players peak at around age 27-28, decline gradually until around age 35, and then decline a little more rapidly after that.
We also find that if we split the data up into two eras, pre- and post-1980, there is a slightly later peak in the latter era (28, as opposed to 27-28 prior to 1980), similar declines until the mid to late 30s, and then a much shallower decline after that in the modern era. That should not be all that surprising given the advances in medical care, better training, higher salaries and perhaps the use of PEDs.
When we use the same methods, but only for players who played at least 10 years in the majors, we see, not surprisingly, significantly different trajectories. The aging curve is shifted to the right, there is a plateau from age 27-30, and a more gradual decline after that. My results are presumably not unlike those of JC. However, when I use the delta method (accounting for survivor bias) on a sample of players with a minimum of 1,000 career PA only, the resulting curve is not much different from that of all players—perhaps a slight shift to the right.
So while I think that we have shed some more light on the subject of peak age and trajectories for the average MLB player (as well as some subsets), I think the issue still contains a lot of muddy water. We don’t really know what the question is or even what it means once we articulate it.
Practically speaking, we are usually interested in the (estimated) peak age and trajectory for individual players and subsets of players in MLB, for projection and salary purposes. To that end, it is probably more useful to frame these questions in a much more specific fashion, such as, “Given that a player has already played five years full-time in the majors, and is a fast and wiry player, and given his trajectory thus far, what is his likely peak age, and what will his future trajectory look like?”
Those are the questions we really need to answer, and we probably shouldn’t concern ourselves so much with the nebulous concept of the “average MLB player’s aging trajectory”—whatever that even means.