Ah, the Saber-sphere is all abuzz with talk of regression to the mean. Regression to the mean is a fairly simple concept. If, over the past four years, you have a player who has had HR/PA rates of 2.8%, 1.9%, 2.3%, and 2.4%, then suddenly, his rate goes to 7.3%, what should you expect in the next year? (The correct answer is 2.6%, at least that’s what Brady Anderson did in 1997.)

Why not expect 7% again? Baseball fans (and a few front office folk) are remarkably good at coming up with justifications for why one should expect 7%. They might say, “That year, Brady developed a new swing/changed his routine/changed his diet/began dating Madonna. *That* must be the reason for his sudden power outburst!” (The more cynical among you might suggest more nefarious reasons*.) How about another explanation? Brady Anderson got insanely lucky in 1996. It’s not often that fate smiles that kindly on one man for such a short period of time, but… how to explain this without referring to Kevin Federline… let’s just say it doesn’t happen very often.

After a few years’ worth of data points from 1992-1995, we have a decent idea that in reality Brady Anderson is the kind of guy who hits a home run once every 40 trips to the plate (2.5%). In other words, we can be pretty sure *that’s* Brady’s true talent level. When he outshot that true talent level in 1996, it made sense that he was due to come back down to earth the next year (which he did). Or in fancy statistical terms, he regressed to his own mean. His performance regressed (got worse) because he was playing over his head the year before, and the next year, he went back to doing what he usually does.

Exactly how to incorporate regression to the mean is the great knuckleball of Sabermetrics. There are as many theories on how to do so as there are Sabermetricians who have looked at the question. This is because what folks are really talking about is not “how do I regress to the mean mathematically?” That’s actually really easy. The real question is “How do we estimate a player’s true talent level?” In other words, what do I regress back to? What is this player *really* capable of?

Colin Wyers wrote a bit on true score theory in a recent THT article. In the piece, he said that a player’s performance is a function of his true talent level, random error (aka luck), and bias in measurement. He made me happy by including measurement bias in his conceptualization (although he then politely dismissed it). I still think there’s one extra missing piece that he hadn’t considered. Colin began to hint at that missing piece when he talked about Ichiro, who gets a hit in roughly 30% of his at-bats.

“Moreover, based on all those factors–and of course many others–a player’s true talent level changes from moment-to-moment. Ichiro may have a 30 percent chance of getting a hit in one at-bat, but if his jock strap starts to itch, perhaps that goes down to 29 percent the next. On the other hand, if someone in the dugout makes a funny joke (auth note: in Japanese? – P.C.) that puts Ichiro in a good mood, his true talent could go up to 31 percent so long as that good mood lasts.”

The actual equation should look like: Observed performance = true talent + measurement bias + **contextual factors** + luck/random error.

If there is a great sin of Sabermetrics, it’s that we (and I happily include myself in that pronoun) have treated players as though they were Strat-o-matic cards. That is to say that they don’t respond in the least to what’s going on around them, which doesn’t make common sense (although common sense is not a proof of anything…) We act as if it’s just a matter of finding the right algorithm based on last year’s stats plus this year’s stats times prime rate minus the square of blah blah blah… After that, we know what a player has the probability to do. And he’ll do it no matter what situation he is in.

Or will he? Colin correctly points out that we won’t be able to know everything. (I frankly don’t want to know if Ichiro’s jock strap starts to itch.) But there are some things that we *can* know, and know them rather easily, that might make a big difference. Let’s take a truism in life. It’s a lot easier to do your job when you are in a good mood than when you’re in a bad mood, and overall, you’re probably better at the job in a good mood. Does it apply in baseball? Let’s take the simplest rough proxy for a good mood that there is: is my team winning?

**Warning: This is the nerdy part.**

I took the 2008 season, and found all plate appearances in which a batter who had at least 250 PA squared off against a pitcher who faced at least 250 batters, and the score was not tied. (It left about 78,000 plate appearances.) I classified whether the plate appearance ended in an on-base event (I included ROE), or not. To control for batter and pitcher matchup, I took the batter’s seasonal OBP (including ROE) and the pitcher’s OBP against. This is nice because we can use hindsight to get an idea of what each player’s overall talent level was during the 2008 season. Because OBP is stated as a probability (a .350 OBP = a 35% chance of reaching base), we can convert the percentages into odds ratios with the formula OR = p / (1 – p).

Once we have that, we can figure out what the expected outcome is of this matchup with the formula: (batter OR / league OR) * (pitcher OR / league OR) = (expected OR / league OR). Figure out the expected OR and take the natural log of that number (more on this step in a moment.)
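For readers who want to try this step themselves, here is a minimal sketch of the odds-ratio formula above. The function names and the sample OBP values are made up for illustration; only the two formulas come from the text.

```python
import math

def to_odds(p):
    """Convert a probability (e.g. an OBP of .350) to odds: p / (1 - p)."""
    return p / (1.0 - p)

def to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

def expected_log_odds(batter_obp, pitcher_obp, league_obp):
    """Natural log of the expected odds for one batter-pitcher matchup.

    Rearranges (batter OR / league OR) * (pitcher OR / league OR)
             = (expected OR / league OR)
    to solve for the expected OR, then takes the natural log.
    """
    league = to_odds(league_obp)
    expected = to_odds(batter_obp) * to_odds(pitcher_obp) / league
    return math.log(expected)

# Hypothetical matchup: a .380 OBP batter against a pitcher who allows .320,
# in a league where the average is .339.
log_odds = expected_log_odds(0.380, 0.320, 0.339)
expected_obp = to_prob(math.exp(log_odds))
print(round(expected_obp, 3))
```

A handy sanity check: when both the batter and the pitcher are exactly league average, the formula hands back the league average.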

I then put the logged-odds-ratio values into a binary-logit regression equation. Binary logit deals with outcomes that are binary (yes/no). Either the batter was safe or he was out. Binary logit models the probability that the answer will be yes based on whatever factors are entered in. It does this by modeling the outcome on the scale of the… wait for it… natural log of the odds ratio.

Slip in the natural log of the odds ratio based on the expected outcome from the batter and pitcher, and you’ve controlled for the batter-pitcher matchup. (The coefficient on that factor should be very, very near 1.00 when you get your output.) Now, enter any other predictor you want… including the dummy variable of whether or not the batting team is winning. Because we have 78,000 cases, we’ve got plenty of statistical power to check for significance above and beyond the effects of the batter and pitcher matchup. (Want more power? Add additional years!)
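To make the setup concrete, here is a rough sketch of a binary logit fit with two predictors: the logged expected odds ratio and a winning/losing dummy. It is not the analysis behind this post. The data are simulated, the "true" winning effect of 0.03 is invented, and the fitting routine is a plain Newton-Raphson loop written out by hand so the example needs nothing beyond the standard library. Any stats package with a logit routine would do the same job.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve_linear(a, b):
    """Solve a small system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def fit_logit(rows, outcomes, iterations=10):
    """Fit logistic regression coefficients by Newton-Raphson."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X'WX, the information matrix
        for x, y in zip(rows, outcomes):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    info[i][j] += w * x[i] * x[j]
        step = solve_linear(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Simulated plate appearances (all values below are invented).
random.seed(0)
LEAGUE_LOG_ODDS = math.log(0.339 / (1 - 0.339))
TRUE_WIN_EFFECT = 0.03  # made-up effect on the log-odds scale

rows, outcomes = [], []
for _ in range(20000):
    matchup = LEAGUE_LOG_ODDS + random.gauss(0, 0.25)  # logged expected OR
    winning = random.randint(0, 1)                     # 1 = batting team ahead
    p = sigmoid(matchup + TRUE_WIN_EFFECT * winning)
    rows.append([1.0, matchup, float(winning)])        # intercept, matchup, dummy
    outcomes.append(1 if random.random() < p else 0)

intercept, b_matchup, b_winning = fit_logit(rows, outcomes)
print(round(b_matchup, 2), round(b_winning, 3))
```

In the simulation the matchup coefficient should land near 1, since the outcomes were generated that way. The dummy's coefficient is noisy at this sample size, which is one reason to want tens of thousands of cases.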

The regression equation that’s produced will give you a predicted log-of-the-odds ratio given all the input factors. I did just that. (To make sure it wasn’t an artifact of the 2008 season, I re-ran the analyses for 2007 and 2006 and got the same basic results.)

**OK, you can open your eyes now.**

What’s the value of the batter’s team winning vs. the batter’s team losing? Let’s say a league average batter faces a league average pitcher (an OBP of .339, including ROE). The generated equation says that if the batter’s team is winning, the expected OBP for that situation is .341. If the batter’s team is losing, it’s .334. That’s a 7-point swing, based entirely on what’s on the scoreboard. Seven points is not huge, but it’s not exactly trivial either.
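As a back-of-the-envelope check, the two OBP figures above can be converted back to the log-odds scale to see roughly what size of coefficient on the winning dummy they imply. This is a back-calculation from the rounded numbers quoted here, not the regression output itself.

```python
import math

def logit(p):
    """Natural log of the odds, ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

OBP_WINNING = 0.341  # league-average matchup, batting team ahead
OBP_LOSING = 0.334   # league-average matchup, batting team behind

# With the matchup held fixed, the gap on the log-odds scale is the
# implied coefficient on the winning/losing dummy.
implied_winning_coef = logit(OBP_WINNING) - logit(OBP_LOSING)
print(round(implied_winning_coef, 3))
```

The gap works out to about 0.03 on the log-odds scale, which is small next to the spread between good and bad hitters.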

So, a player’s “true” talent can vary based on whether or not he’s winning. This is very interesting when watching (and analyzing) baseball on a short-term level, and even has some more macro-level applications. Suppose that a player is traded from a team which isn’t very good (and as such, is losing a lot) to a team that is good (and is winning a lot). Or the other way around. Should we not adjust our estimates of his true talent to compensate?

Whether the batting team is winning is just one variable for consideration. I can think of a dozen more. I may not be able to read minds, but it’s not hard to figure out that a player is probably frustrated when in a slump (or when his teammates are in a slump?), angry when a bad call is made by an umpire, or a little more lethargic when it’s cold. All of these variables might be found with a little bit of engineering derring-do. But the point is that it’s time that we started looking a little harder at how the situation affects the men playing the game. Context matters.

Nutlaw said...

Interesting approach, PC, but I’m not convinced about cause and effect here. Does winning make one hit better or do other factors that make one hit better that day tend to put their team in the lead in the first place?

Pizza Cutter said...

I controlled for the strength of the batter and pitcher on this skill overall. That would be the big confound that I would worry about. Could it be a third variable that’s producing an illusory result? Possible. But it’s not likely a talent issue, and more of a game contextual issue. I find that thought unto itself fascinating.

Tom M. Tango said...

Only look if bases are empty (you’ll still get over 50% of your PA).

donchoi said...

How much of this is due to the way a batter approaches his PA differently when his team is ahead vs how a pitcher approaches a hitter when they are behind? I’m not sure we can untangle the two.

Greg Rybarczyk said...

You might want to think about how to account for the home/away factor. A hitter batting in the bottom of the 1st inning can never be ahead, only behind or tied. In the second inning, home batters have had one inning to score, while the opposition has had two. And so on. Later in the game it doesn’t matter, but early on, you’ve got a definite factor here…

Have you tried splitting it out by WP? That way, when a hitter bats in the bottom of the 1st after the road team fails to score, the WP is actually slightly tilted towards the home hitter. This might work better than looking at the score…

Joe said...

“His performance regressed (got worse), due to the fact that deep down, he was playing over his head the year before, and the next year, he went back to doing what he usually does.”

Nobody wants to believe this. I spent the month of May telling fellow Red Sox fans that, given regular playing time, Nick Green was going to finish the year with an OPS below .700.

Colin Wyers said...

DSG was the one that wrote about Ichiro, not me. Almost everything else attributed to me is what I said (I think).

What I did say that I think is relevant:

“Obviously a baseball player’s innate ability isn’t constant: He can be nursing a minor injury or learn better plate discipline. A lot of things can happen to change a player’s true talent level. Of course, the same can be said of taking a test, the typical use case of true score theory. A student can be well-rested one day, tired another day, for instance. When we refer to something as ‘true’ we simply mean that it is repeatable under *the same conditions*.”

Emphasis added. If you change the conditions you change the underlying true talent. The trick in this case is to learn when the player’s skill level has changed.

As far as the example here – I think there has to be something that we aren’t detecting, and I don’t think it’s a player being happy with a lead. My hunch is that if you controlled for the home/road advantage you’d see different results.

birtelcom said...

.339 overall, but .341 if the batting team is ahead and .334 if it’s trailing? Maybe I’m missing something obvious but shouldn’t the winning/losing numbers essentially average out to .339? There are presumably about 5% more PAs where the batting team is trailing than when it is leading (because home teams in the lead don’t bat in the last half of a game’s final inning) but does that fully explain the fact that the variation from the overall .339 figure is .002 on the team-leading side but .005 on the team-trailing side?

Pizza Cutter said...

Colin, speaking of tired people, it was probably 2:00 am when I was crediting you for what David said. Sorry. And David, if you’re reading this, not sure how I mixed that up… you’re much better looking than Colin after all.

When I get home tonight, I’ll look some of these other issues up.

Pizza Cutter said...

@birtelcom: There are a bunch of PA’s with the game tied which are not part of the analysis. .339 is league average performance overall, whether the game is tied or one team or the other is winning. The “tied” PA’s were then politely dismissed from the study. I didn’t run the numbers on the predicted OBP in a tie game, but my guess is that the resulting number would weight things out to .339.

John Beamer said...

Hey Pizza – interesting study. Any chance you could link to your logit regression analysis (or stick it on google spreadsheets or something). Would love to take a look

Pizza Cutter said...

Answers to a few questions that were asked.

With the bases empty (46K cases)

Batting team losing: .328

Batting team winning: .332

Innings 2-8 (to control for the top of the first and bottom of the ninth being times when the home team can not be ahead)

Batting team losing: .336

Batting team winning: .341

Just the home team batting

Batting team losing: .340

Batting team winning: .350

Just the visiting team batting

Batting team losing: .328

Batting team winning: .332 (sic)

Interesting moderator effect for the home/away splits, but it’s clear that the winning vs. losing effect never fully goes away.

Pizza Cutter said...

John, what part of the regression are you looking for? The equation? The data file? I’ll be happy to share/post.

Pizza Cutter said...

Removing all IBB from the data set:

batting team losing: .331

batting team winning: .332

Peter seems to have poked a hole in my balloon. Good catch.

KJOK said...

Perhaps I’m missing something, but couldn’t it just as easily be the PITCHER that is feeling bad/good (or maybe PERFORMING bad/good that day) since he’s winning/losing, instead of the batter?

Guy said...

You might look separately at starting pitcher vs. relief pitcher, to see how much this is that starter having a good/bad day (as KJOK suggests).

Another factor is hitter platoon advantage (yes, no). The winning team might enjoy an edge there, on average.

John Beamer said...

Pizza – just the equation and the associated statistics if possible – would love to take a look that’s all

Thanks

John

Peter Jensen said...

Pizza – The entire difference in OBP is due to the extra intentional walks that are issued to a team that is in the lead.