In my last article, I looked at how players regress to the mean, and how players on the borderline between the minors and majors might not have a readily identifiable mean to regress to.

But what if we add in AA players?

It seems that adding AA players does not give us a trimodal distribution; the means for the two leagues (or at least their major league equivalencies) are pretty close to each other. It’s possible that this is a selection bias from the way the MLEs are computed, of course. But this also squares pretty well with what we think we know about the difference in league quality, so even if there is such a bias I’m not sure it’s large enough to give us a truly bimodal distribution between AA and AAA players.

But while that shifts our second mode over to the left a bit, it also gives us a much larger population of players in the minors than the majors. Here’s a helpful illustration, at about 110 PAs:

I almost wonder if there’s something I’m missing here, though, with my assumptions. If pressed, I would guess that in real life the right-hand sides of the two distributions line up a lot better than what I’m showing here. These are estimated standard deviations, and so maybe the observed SDs for minor league talent are larger than what I’m showing. I’ll have to check into that.
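To make the population-size point concrete, here is a minimal sketch (not the code behind the illustration above) that treats the combined talent pool as a mixture of two normal curves and counts how many peaks the combined curve has. Every number in it is a made-up assumption for illustration: talent is measured in standard deviation units, both groups are given the same SD, and the gap and population shares are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """components: list of (weight, mean, sd); weights should sum to 1."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)

def count_modes(components, lo, hi, steps=2000):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, components) for x in xs]
    # Strict rise followed by a non-rise, so a flat-topped peak counts once.
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] >= ys[i + 1])

# Hypothetical setup: minors centered at 0, majors 2.5 SDs higher, same SD.
GAP = 2.5
equal_pools = [(0.50, 0.0, 1.0), (0.50, GAP, 1.0)]
minors_heavy = [(0.95, 0.0, 1.0), (0.05, GAP, 1.0)]

print("equal pools:", count_modes(equal_pools, -5.0, 8.0), "mode(s)")
print("95% minors: ", count_modes(minors_heavy, -5.0, 8.0), "mode(s)")
```

With the same gap between the two means, the second peak can disappear into the shoulder of the first once one group heavily outnumbers the other. Giving the minor league group a larger SD, as speculated above, would be a one-number change to the third element of its tuple.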

The Rabbit said...

What does the curve look like for AA only vs. MLE? If it’s here somewhere, I’ve missed it.

A THT report earlier this week cited the differences in age at the AAA vs AA levels.

The variation was greatest (not surprisingly) for contending teams, which use AAA as an expanded ML roster for injuries, meltdowns, etc., and whose AAA teams are therefore populated with older, more experienced players. These players might actually be in the majors if signed with another team.

I would expect a combination of AA/AAA to be slightly skewed given the “philosophical” differences in the purpose of the AAA teams.

Colin Wyers said...

You want to see the AA and AAA teams on separate curves?

Frankly, while I can (and have) graphed that, it’s really not interesting. I’d say somewhere between 85 and 95 percent of the two curves overlap. That’s why I feel comfortable graphing it the way I did: we really don’t need to know whether a guy is in AA or AAA to regress him to the proper mean, so long as what we are looking at are his translated stats, rather than his actual stats.

Now, of course, this relies on the assumption that the DTs are capturing the correct translated means for each league. If the difference in league quality is substantially larger than what the DTs are saying, then this is wrong.
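For a sense of how sensitive that overlap is to the gap between the two translated means, here is a minimal sketch. It assumes both curves are normal with the same SD, which is a simplifying assumption and not something taken from the DTs; in that case the shared area is 2·Φ(−d/2), where d is the gap between the means in SD units. The function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def overlap(gap_in_sds):
    """Shared area of two equal-SD normal curves whose means differ by gap_in_sds.

    Assumes equal SDs; with unequal SDs the curves cross twice and this
    closed form no longer applies.
    """
    return 2.0 * normal_cdf(-abs(gap_in_sds) / 2.0)

# Hypothetical gaps between the AA and AAA translated means, in SD units.
for gap in (0.0, 0.125, 0.25, 0.378, 0.5, 1.0):
    print(f"gap = {gap:5.3f} SD -> overlap = {overlap(gap):.1%}")
```

Under these assumptions, an overlap between 85 and 95 percent corresponds to a gap of roughly 0.13 to 0.38 SD between the two means, and the overlap falls off quickly if the true difference in league quality is larger than that.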

The Rabbit said...

Thanks for your response.

Nope, didn’t particularly want to see the curves, but assumed you must have done it.

I’m always curious about the nature of the underlying data…It’s a curse from my career (thankfully, retired) in financial analysis.