The Hardball Times

Of runs and wins

by Dave Studeman
December 14, 2012

The Orioles caused a bit of consternation on the Interwebs this season. It's been an article of faith from the Beginning of Sabermetrics (approximately 1980, or 1 A.J. (After James)) that teams tend to win relative to how well they outscore their opponents. If they score a lot more runs than they allow, they win a lot of games. If they don't, they don't.

The Orioles, however, won 93 games and lost only 69 in 2012, despite scoring only seven more runs than they allowed. This wasn't a record for outperforming a team's run differential, but it was close. As the season progressed, saberists kept insisting it couldn't continue. Yet it did, right up until the end of the season. Anti-saberists seemed to enjoy a certain amount of schadenfreude when the O's made us sabermetric types look silly. O's fans, of course, were delighted regardless.

So what are we to think? Is this Article of Sabermetric Faith wrong? Are we fooling ourselves when we pay too much attention to runs scored and allowed, and not enough to basic wins and losses?

To answer these questions, let's go back to the basics. What's more, let's go back to the data.

I decided to compare two consecutive months of a team's win/loss record to each other, within a specific season. By comparing in-season months, I mostly avoided the hassle of personnel turnover and that sort of thing. To make sure I had enough data, I collected all teams and months from 1970 through 2012, a total of 5,747 team/month/year combinations.

Next, I grouped the teams into 21 buckets, based on how they performed in the first (baseline) month. (A winning percentage of .000 to .050 was bucket 0, .050 to .100 was bucket 1, and so on. Bucket 19 included winning percentages between .950 and 1.000; bucket 20 was a winning percentage of exactly 1.000.) Some of these months consisted of only one or two games, so I weighted each team/month comparison by the lesser number of games in the two months (if there were 25 games in the baseline month but only one in the next month, the stats were prorated as if there were only one game in both months). That way, I didn't have to exclude any months by an arbitrary cutoff, and I also didn't have to worry about those pesky strike years.
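The bucketing and weighting can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the function names are made up, and it assumes that a percentage sitting exactly on a boundary (say, .250) goes into the higher bucket, which the description above doesn't spell out.

```python
import math

def bucket(win_pct, width=0.05):
    """Return the bucket number (0 through 20) for a winning percentage."""
    # Round before flooring so a value such as .150 is not pushed into the
    # bucket below by floating-point error. Boundary values go to the higher
    # bucket (an assumption, see the note above).
    return math.floor(round(win_pct / width, 9))

def pair_weight(baseline_games, next_games):
    """Weight a pair of months by the lesser number of games played."""
    return min(baseline_games, next_games)

def weighted_next_win_pct(pairs):
    """Weighted average of next-month winning percentage.

    pairs: iterable of (baseline_games, next_games, next_win_pct) tuples.
    """
    pairs = list(pairs)
    total_weight = sum(pair_weight(b, n) for b, n, _ in pairs)
    weighted_sum = sum(pair_weight(b, n) * pct for b, n, pct in pairs)
    return weighted_sum / total_weight

# Hypothetical pairs: a 25-game month followed by a one-game month counts
# as one game, so it barely moves the group average.
print(bucket(0.224))
print(weighted_next_win_pct([(25, 1, 1.0), (26, 27, 0.5)]))
```

Weighting by the smaller game count is what keeps a one-game "month" from counting as much as a full one.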

For the following analysis, I included only groups with a baseline winning percentage (the percentage in the first month) higher than .200 and lower than .800—12 groups representing 6,546 team/month/year combinations in all.

You may not have followed all that. It's okay. You'll get it when we dive in.

Regression toward the mean


Before we get to the runs scored and allowed thing, we have to do something else. We have to regress toward the mean. Everything regresses to the mean, as someone once said. The Mariners win 93 games after winning 116. Norm Cash hits .243 after hitting .361. I write a lousy column after writing a great one. Sometimes.

The basic rule is this: Before you compare two things to each other, make sure you regress them toward the mean first.

In the table below, you'll see that all 12 groups regressed toward .500 in the second month of comparison. Teams in our lowest group, for instance, had a composite .224 winning percentage in the baseline month. In the next month, their winning percentage jumped up to .422. At the other end of the standings, teams that averaged .762 in the baseline month had a .545 winning percentage in the next month. There remained a difference between the groups, but it wasn't nearly as extreme.

That is as stark an example of regression toward the mean as you're likely to see today. Here are the data for all 12 groups.

From   To     Number   Avg. Win%   Next Win%
.200   .250       49   .224        .422
.250   .300      191   .277        .461
.300   .350      363   .329        .446
.350   .400      610   .376        .468
.400   .450     1041   .426        .481
.450   .500      835   .472        .496
.500   .550     1202   .519        .506
.550   .600      931   .571        .516
.600   .650      675   .621        .531
.650   .700      374   .670        .543
.700   .750      195   .719        .557
.750   .800       80   .762        .545


Teams in every category—above-average teams and below-average teams—moved closer to .500 in the second month. Much closer. In fact, these findings lead us to a useful rule of thumb:

If the only thing you know about a team is its winning percentage in a single month and you want to predict how it will perform in the next month, add these two things:

- 25 percent of its winning percentage in the first month, and
- .375 (which is 75 percent of a .500 record).

In made-up technical English, a team will regress 75 percent toward average in its second month.
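For anyone who wants to try the rule of thumb, here is a minimal sketch in Python. The function name is invented for illustration; the baseline and next-month figures are three rows copied from the table above.

```python
LEAGUE_AVERAGE = 0.500

def project_next_month(win_pct, weight=0.25):
    """Regress a one-month winning percentage 75 percent toward .500."""
    return weight * win_pct + (1 - weight) * LEAGUE_AVERAGE

# (baseline win%, actual next-month win%) from the table above.
observed = [(0.224, 0.422), (0.519, 0.506), (0.762, 0.545)]

for baseline, actual_next in observed:
    projected = project_next_month(baseline)
    print(f"{baseline:.3f} -> projected {projected:.3f}, actual {actual_next:.3f}")
```

The rule is a rough fit rather than an exact one, so expect the projections to land near the observed figures, not on them.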

Now let's talk about that runs scored and allowed thing.

Pythagorean variance


Here's what I did next. I took the number of runs scored and allowed per game for each team in its baseline month and used those data to calculate its pythagorean record. The pythagorean record, which is an estimate of a team's record based on its runs scored and allowed, was developed by Bill James around 1 A.J. The basic formula is RS^2/(RS^2+RA^2). RS means Runs Scored and RA means Runs Allowed. That little hat means "raised to the power of," so here each number is squared. You may remember a similar formula from your geometry class.

I varied the "squared" part of the formula for each grouping, based on the run environment of each team. It adds a little more precision to the mess.
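As a quick sketch, the formula looks like this in Python. The exponent is left as a parameter because the exact values used for each run environment aren't given here; the default of 2 is the basic version of the formula, and the run totals in the example are hypothetical.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage as RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical month: 120 runs scored, 100 allowed.
print(round(pythagorean_pct(120, 100), 3))

# A slightly smaller exponent (a made-up value, purely for illustration)
# pulls the estimate a little closer to .500.
print(round(pythagorean_pct(120, 100, exponent=1.8), 3))
```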

So how do we factor the pythagorean record into our regression? Well, first I created a second group (a subgroup, if you will) of teams, based on their pythagorean record in the baseline month. For instance, if a team's pythagorean record was better than its actual won/loss record by two games (this would be an "unlucky" team in standard sabermetric parlance), it was placed in group 2. If its won/loss record was two games better than its pythagorean record (a "lucky" team), it was placed in group -2 (negative two). The number of the group represents the pythagorean difference from reality in games won.
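Here is one way the subgroup number might be computed, sketched in Python. The function name, the rounding to the nearest whole game, and the example records are all assumptions made for illustration; how fractional games were handled isn't stated above.

```python
def pythagorean_group(wins, losses, runs_scored, runs_allowed, exponent=2.0):
    """Return pythagorean wins minus actual wins, rounded to whole games.

    Positive means the runs record was better than the actual record
    (an "unlucky" team); negative means a "lucky" team.
    """
    games = wins + losses
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    pythagorean_wins = games * rs / (rs + ra)
    # Rounding to the nearest game is an assumption, not a stated method.
    return round(pythagorean_wins - wins)

# Hypothetical "unlucky" team: 14-13 while outscoring opponents 120 to 100.
print(pythagorean_group(14, 13, 120, 100))

# Hypothetical "lucky" team: 13-14 while being outscored 100 to 120.
print(pythagorean_group(13, 14, 100, 120))
```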

Once again, looking at the data output may help you understand what I did. In the table below, I regressed each group's winning percentage toward the mean (using the 75 percent rule) to predict how it would perform in the second month. Then I broke them into pythagorean subgroups to see how each type of "lucky" team performed relative to its regressed projection.

Bottom line: The higher a group's pythagorean difference, the more they outperformed their projected record in the second month. The results are dramatic.

Pyth Diff   Projection Diff
   -7           -.125
   -6           -.064
   -5           -.057
   -4           -.078
   -3           -.024
   -2           -.008
   -1           -.003
    0            .012
    1            .003
    2            .018
    3            .015
    4            .020
    5            .046
    6            .235


Picking on one example, teams whose pythagorean "runs record" was three games better than their actual record beat their regressed projection in the second month by .015, or 15 points of winning percentage. Their runs scored and allowed in the baseline month made an impact on the outcome of the second month.

What I'm saying is that baseball analysts are still right. Runs scored and allowed still matter. In fact, if you know nothing about a team except how it performed in one month, add these two things:

- 30 percent of its pythagorean record in the first month, and
- .350 (which is 70 percent of a .500 record).

If you know these two things, knowing the team's actual won/loss record won't help you one bit. You'll do a better job of predicting a team's future if you ignore its actual won/loss record and just use its "runs record."
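In code, the runs-based rule of thumb might look like the Python sketch below. The function name and the run totals are hypothetical, and the exponent defaults to the basic value of 2 rather than the environment-adjusted values used in the study.

```python
def project_from_runs(runs_scored, runs_allowed, exponent=2.0, weight=0.30):
    """Project next-month winning percentage from one month of runs.

    Takes 30 percent of the pythagorean record plus 70 percent of .500.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    pythagorean = rs / (rs + ra)
    return weight * pythagorean + (1 - weight) * 0.500

# Hypothetical month: 120 runs scored, 100 allowed. The team's actual
# won/loss record never enters the calculation.
print(round(project_from_runs(120, 100), 3))
```

Note that two teams with the same runs scored and allowed get the same projection no matter how different their won/loss records were, which is the whole point.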

When James first wrote about these things, he developed a number of sabermetric forces. One was called the Plexiglass Principle, which today we call regression toward the mean. The other was called the Johnson Effect, which is what he called it when teams that had extreme pythagorean variances in one year tended to relapse in the next year.*

*This shows how far we've fallen as sabermetric writers. No one coins terms better than James does.

The problem these days is that people tend to throw the two forces together. When people say that teams tend to "fall back" toward their pythagorean record, they're really combining the two ideas. Pythagorean variance has become sort of a lazy man's regression term. Today, I've tried to separate the two more distinctly for you.

Still, there are more questions. There are always more questions, aren't there? What if we have multiple months of a team's record? Is there a point at which its actual won/loss record is more important than its runs record? Is it at two months? Three? Four? Are things different these days than how they used to be? Has bullpen usage changed things at all? Do season-to-season effects still hold?

I'll be back.

Bonus Table


Wow. I'm impressed you're still here. As a bonus, I'm going to break out both types of groups in the following table. The top row lists the Winning Percentage groups by bucket number (bucket 4 is .200 to .250, and so on up to bucket 15, which is .750 to .800), and the left column lists the Pythagorean Difference subgroups. Each cell is the difference between a group's actual record in the second month and its second-month projection based on simple regression to the mean.

Observe and enjoy.

Pyth Diff 4 5 6 7 8 9 10 11 12 13 14 15 Total
-7 -.125 -.125
-6 -.104 -.034 -.053 -.064
-5 -.046 -.003 -.062 -.036 -.081 -.015 .005 -.220 -.057
-4 -.316 -.162 -.008 -.024 -.068 -.054 .005 -.015 -.058 -.078
-3 .042 -.088 -.028 -.049 -.036 -.026 -.014 -.014 .007 -.030 -.024
-2 .056 -.063 .005 -.022 -.028 -.008 -.015 -.009 -.010 -.013 .019 -.008
-1 -.017 -.010 -.002 -.011 -.016 -.001 .000 -.004 .008 .010 .013 -.001 -.003
0 -.023 .006 -.013 -.002 .007 .008 .009 .023 .013 .013 .036 .072 .012
1 -.025 .017 -.003 .011 .007 .017 .017 .013 .023 -.028 -.016 .003
2 .030 .020 -.016 -.016 .031 .031 .002 .009 .006 .084 .018
3 -.009 .025 -.014 .051 .008 .069 .034 -.005 -.028 .015
4 -.051 .087 -.010 .068 .005 .020
5 .056 .036 .046
6 .235 .235
Total -.006 .055 -.010 -.026 -.029 .004 -.007 -.012 -.015 .006 -.016 -.039 -.009


Feel free to ask questions, point out faulty logic and generally make fun of my math in the comments below.

References and Resources
Here's a very mathematical examination of why the Pythagorean Formula works.

All data courtesy of the spectacular folks at Retrosheet.

Dave was called a "national treasure" by Rob Neyer. Seriously. Comments about this article can be sent to him through the miracle of e-mail.
