Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

Wednesday, June 15, 2011

Regressing or reverting?

Posted by Jonathan Halket at 5:27am

Here’s the punchline: “Regressing to the mean” is different from “Mean reversion”. Lots of experts say “regression” when they should say “reversion”. Does it matter? Well, lots of experts aren’t very clear about the numbers either, so, yeah, I’d worry that they’re messing up the whole thing.

“Regressing to the mean” is a handy thing for accounting for small sample sizes and uncertainty. If you’re just listening and not doing your own projections, you hardly need to bother with this concept (but I’ll get back to it at the end).

“Mean reversion” is an absolutely essential idea to grasp. One way to think about it is “Luck fades”.

Consider these questions about Michael Morse, whose current batting average is .300:

A) Suppose you know his true talent indicates that he’s a .280 hitter. What would you expect him to bat for the rest of the year?

Answer: You expect him to bat .280. Morse has gotten lucky with his .300 average. By definition, luck is unpredictable (for the statisticians out there: higher-order statistics are predictable). Therefore your expectation for the rest of the year is simply his true talent, .280.

B) Suppose you know his true talent indicates that he’s a .280 hitter. What average would you expect him to have at the end of the year?

Answer: Morse should probably finish with about a .290 average: half a season at .300 and half a season at .280, assuming roughly the same number of at-bats in each half.

What gets my goat is that I often hear experts say something like “There’s no way Morse is a .300 hitter. He’s getting lucky with his BABIP. He’s gonna regress to the mean and be batting .280 at the end of the season.”

The meaningful problem with this statement is that I have no idea what this expert thinks Morse is going to bat from this point forward. Does he think Morse will bat .280 from today until the end of the season? Or does he think Morse will bat about .260 (so that his average at the end of the season will be .280)? If you’re looking for expert opinions, there’s a big difference here.
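The arithmetic behind questions A and B, and behind the two readings of the expert's statement, can be sketched in a few lines of Python. This is only an illustration: the function names and the figure of 250 at-bats per half season are assumptions made up for the example, not numbers from Morse's actual season.

```python
def season_average(avg_so_far, ab_so_far, avg_rest, ab_rest):
    """Full-season average, weighting each stretch by its at-bats."""
    hits = avg_so_far * ab_so_far + avg_rest * ab_rest
    return hits / (ab_so_far + ab_rest)


def required_rest_average(avg_so_far, ab_so_far, target_final, ab_rest):
    """Average needed over the remaining at-bats to finish at target_final."""
    total_ab = ab_so_far + ab_rest
    hits_needed = target_final * total_ab - avg_so_far * ab_so_far
    return hits_needed / ab_rest


AB_HALF = 250  # assumed at-bats in each half of the season (hypothetical)

# Mean reversion: he bats his true talent (.280) from here on.
final_avg = season_average(0.300, AB_HALF, 0.280, AB_HALF)

# The other reading: he finishes the season AT .280, so he must bat lower.
rest_avg = required_rest_average(0.300, AB_HALF, 0.280, AB_HALF)

print(f"Finishes at {final_avg:.3f} if he bats .280 the rest of the way")
print(f"Must bat {rest_avg:.3f} the rest of the way to finish at .280")
```

With equal at-bats in each half, the first case lands at .290 and the second requires a .260 second half, which is the gap between the two interpretations.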

The semantic mistake, saying “regression to the mean” when he should be saying “mean reversion”, isn’t substantive. But combined with the expert’s statistical ambiguity, it should raise serious red flags.

Lastly consider this question:

C) You have no idea what Michael Morse’s true talent is. But in about 1,000 plate appearances, he’s batted just above .290. What is a good guess for his true talent?

Answer: Now you’d use “regression to the mean,” which discounts Morse’s personal performance according to how much we’ve seen him play. With few plate appearances, our best guess would be that he’s something like “league average” and we’d care very little about the numbers he has personally put up. With lots of plate appearances, we’d have a lot of faith that his personal numbers were more indicative of his talent than a league average player’s numbers. If he’s somewhere in between on appearances, we’d weigh the two (his versus the league average) by regressing to the mean.
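One common way to do this kind of weighting is a simple shrinkage formula: blend the player's observed average with the league average, giving the player's own numbers more weight as the sample grows. The sketch below is an illustration only. The league average of .255 and the constant k = 500 (the sample size at which the two get equal weight) are hypothetical values chosen for the example, and the post itself does not specify a formula.

```python
def regress_to_mean(observed_avg, n_obs, league_avg, k):
    """Blend a player's observed average with the league average.

    k is the sample size at which the player's own numbers and the
    league average get equal weight (a hypothetical tuning constant).
    """
    weight = n_obs / (n_obs + k)  # 0 with no data, approaches 1 with lots
    return weight * observed_avg + (1 - weight) * league_avg


LEAGUE_AVG = 0.255  # assumed league batting average (hypothetical)
K = 500             # assumed regression constant (hypothetical)

# The same .290 observed average is trusted more as the sample grows.
for n in (50, 1000, 5000):
    estimate = regress_to_mean(0.290, n, LEAGUE_AVG, K)
    print(f"{n:>5} appearances: estimated true talent {estimate:.3f}")
```

With very few appearances the estimate sits near the league average, and with many it approaches the player's own .290, matching the three cases described above.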



chuck said...

so, in other words, you just don’t really know…

Posted 06/15  at  09:06 AM
Chicago Mark said...

My head is spinning!  Most “experts” would predict a .280 average ROS.  I really don’t read anywhere that would expect him to be at .280 year end.  And I agree with Chuck above.  You and me and most others have little idea.  Otherwise we’d all have predicted .300 correctly.  I sure hope he regresses/reverts as I don’t have him in any league.

Posted 06/15  at  10:07 AM
Dave Studeman said...

Thank you, Jonathan.  Everytime I hear someone say that a player will “regress to the mean” I grind my teeth a bit.  You’ve expressed exactly why.

Regression is something you do in your analysis.  Performance reverts to expected levels.  Good distinction.

Posted 06/16  at  05:43 PM
Andrew said...

An important distinction, but you made a couple grammatical mistakes which detracted from your point. Specifically, you should have used ‘whose’ instead of ‘who’s’ and ‘gets’ rather than ‘get’s.’

Posted 06/17  at  08:14 AM