There’s been a lot of discussion lately about one-run games and the Pythagorean formula. I know, I know. Boring. But so much has been written about the subject, and so many ideas have been kicked around, that I thought it would be good to have one article that lists some basic facts about both. And in compiling this list, I learned a few new things too.
Won/loss records are primarily driven by a team’s average number of runs scored and runs allowed.
This is old news to most of you, but it bears repeating. The basic elements of the Pythagorean formula are runs scored and runs allowed, and it “explains” over 90% of the variance in won/loss records over the last five years. For you math types, I got an R squared of .91 when I regressed the basic Pythagorean formula against the last five years of team data. Others have gotten closer to .94, depending on their sample and the exact formula they use.
The big question everyone wants answered, however, is how to explain the remaining 9% of the variance, which can be much more than 9% for some specific teams.
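For anyone who wants to try this at home, here is a minimal sketch of the basic formula and of an R squared calculation. The function names and the 800/700 run totals are made up for illustration; they are not from my data set, and this is not the exact regression I ran.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def r_squared(predicted, actual):
    """Squared Pearson correlation between predicted and actual values."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov ** 2 / (var_p * var_a)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
expected = pythagorean_win_pct(800, 700)
print(round(expected, 3))      # 0.566
print(round(expected * 162))   # 92 expected wins
```

To get a league-wide R squared, you would pass one predicted and one actual winning percentage per team season to `r_squared`.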
Run distribution patterns explain only some of the unknown variance.
I wrote an article two days ago about “run distribution patterns,” showing that sometimes you can learn a bit more about a team’s offense or defense by looking beyond just the average. After I wrote the article, I went back to the data and combined each team’s specific distribution patterns (in runs scored and runs allowed) to see how much of the unexplained variance they could account for.
The answer was 1.5 percentage points. Combining the two raised the R squared of my projection from .91 to .925. Put another way, isolated run distribution patterns explained a little less than 20% of the variance not covered by the Pythagorean formula (1.5 of the remaining 9 points). That’s good, but it’s not enough.
The rest of the variance is explained by the outcomes of close games.
In a broad mathematical sense, there are two things that affect the unexplained variance:
- Specific run distribution patterns, as discussed above, and
- the timing of those patterns in specific games.
Here’s an example of the second point. Let’s say your team scores four runs the “average” number of times and allows five runs the “average” number of times. But let’s further say that all the games in which they score four runs happen to be the same games in which they give up five runs. They’ll lose every one of those games by one run, of course, despite following the average distribution.
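The timing effect is easy to see with a toy example. This sketch uses a made-up six-game stretch (not real data) in which the runs scored and runs allowed are identical in both cases; only the pairing of the games changes.

```python
def record(scored, allowed):
    """Return (wins, losses) from game-by-game runs scored and allowed."""
    wins = sum(1 for s, a in zip(scored, allowed) if s > a)
    losses = sum(1 for s, a in zip(scored, allowed) if s < a)
    return wins, losses


# Hypothetical six-game stretch: 30 runs scored and 24 allowed either way.
scored = [4, 4, 4, 6, 6, 6]
allowed_bad_timing = [5, 5, 5, 3, 3, 3]   # the four-run games meet the five-run games
allowed_good_timing = [3, 3, 3, 5, 5, 5]  # same runs allowed, reshuffled

print(record(scored, allowed_bad_timing))   # (3, 3)
print(record(scored, allowed_good_timing))  # (6, 0)
```

Same distributions, very different records, and in the second case every game was decided by one run.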
You can capture this second point by analyzing the outcomes of close games, because when run patterns match up in odd ways they usually produce close games. In fact, when I add a team’s record in one-, two- and three-run games to my formula, I explain nearly all of the team-specific variance from the Pythagorean formula, for an R squared of .99.
Here’s how much of the 9% of unexplained Pythagorean variance each factor accounts for:
- Record in one-run games: 4.0%
- Record in two-run games: 1.5%
- Overall runs/game distribution: 1.5%
- Record in three-run games: 1.0%
- Other stuff: 1.0%
Now you know why everyone talks about one-run games. But it’s also important to remember that one-run games account for less than half of the unexplained Pythagorean variance. In our team stats, for instance, we track teams’ records in both one-run and two-run games, in the category called “Close.” We think it adds a bit more insight to the stats.
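If you prefer to see each factor as a share of the unexplained 9%, the arithmetic is simple. This sketch just divides the figures from the breakdown above by their total; the dictionary keys are my own labels.

```python
# Percentage points of total variance explained by each factor (from the list above).
unexplained = {
    "one-run games": 4.0,
    "two-run games": 1.5,
    "runs/game distribution": 1.5,
    "three-run games": 1.0,
    "other stuff": 1.0,
}

total = sum(unexplained.values())  # 9.0
shares = {factor: points / total for factor, points in unexplained.items()}

for factor, share in shares.items():
    print(f"{factor}: {share:.0%}")
# one-run games: 44%
# two-run games: 17%
# runs/game distribution: 17%
# three-run games: 11%
# other stuff: 11%

# One-run and two-run games together (the "Close" category).
close_share = shares["one-run games"] + shares["two-run games"]
print(f"close games: {close_share:.0%}")  # close games: 61%
```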
The 2003 Detroit Tigers, one of the worst teams in history, won over 50% of their one-run games.
A lot of people think that the outcome of one-run games is pretty much random. As an example, the 2003 Detroit Tigers, one of the worst teams of all time, actually won over 50% of their one-run games. Here’s a breakdown of the Tigers’ record, by margin of the game:
Margin  Win%
1       .514
2       .189
3       .208
4       .294
5       .286
6       .000
7       .250
8       .000
>9      .125
As they say, one of these things is not like the other. For the record, Tom Ruane did a much more sophisticated study of this in 1997 and reached the same result.
It’s this kind of analysis that leads some people to make the “strong form” of the argument: that the outcomes of one-run games are completely random. But that’s not really true.
Good teams win more one-run games.
Here’s a graph of the winning percentage of all teams in each of the past five years, according to the margin of victory in each game. I’ve combined seasonal teams into five different groups based on their overall record.
As you can see, a team’s true talent emerges as the margin of a game increases. One-run games do tend to bring all teams closer to .500, but the best teams still win one-run games more often than other teams.
Bill James published an article three years ago in which he reviewed Tom Ruane’s article, and added the useful insight that a team’s record in one-run games can be projected by the ratio of its runs scored to runs allowed, each raised to the power of .865. In other words, he used the Pythagorean formula, but used .865 instead of 2 as the exponent.
So, in essence, the Pythagorean formula actually captures the notion that good teams generally win more one-run games. But it obviously won’t capture unexpected swings in one-run game outcomes. And as we’ve said, wild swings do occur.
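To see what the .865 exponent does, here is a small sketch that applies the same Pythagorean form with both exponents. The 800/700 team is hypothetical, and the function name is mine, not James’s.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent):
    """Pythagorean winning percentage with a configurable exponent."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical good team: 800 runs scored, 700 runs allowed.
overall = pythagorean_win_pct(800, 700, exponent=2.0)
one_run = pythagorean_win_pct(800, 700, exponent=0.865)

print(round(overall, 3))  # 0.566
print(round(one_run, 3))  # 0.529
```

The smaller exponent pulls the projection toward .500 without reaching it, which is the pattern described above: good teams still win more than half of their one-run games, just by a smaller margin.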
You might say that some teams seem to have a particular talent for winning slightly more one-run games.
This is what James covers in the article linked above. In a nutshell, James found that there is some evidence that some teams display an ability to perform better or worse in one-run games independent of their overall talent level. The teams that show this ability have two fundamental traits: they play small ball (sacrifice hits, stolen bases, fewer home runs, etc.) and have good pitching. However, you have to apply this finding VERY carefully. For example, here are the teams that most outperformed their expected record in one-run games from 2000 to 2004:
Team  Actual  Proj.  Diff
LAN   .586    .515   .070
CIN   .534    .474   .060
MIN   .559    .503   .056
You can certainly argue that the Dodgers and Twins had many of the attributes Bill listed, but the Reds? Sometimes, things just happen.
By the way, here’s a list of the three relatively worst teams:
Team  Actual  Proj.  Diff
KCA   .395    .471   -.076
BOS   .474    .532   -.058
HOU   .470    .521   -.051
Yes, the 2003 Tigers won proportionately more of their one-run games than last year’s World Champions.
Bullpens may have an impact, too.
Seeing the Dodgers at the top of that list, and the Twins third, supports the point many others have suggested: that teams with strong bullpens may win more one-run games. As an example, here’s a list of which teams have the best one-run records over the last three years, compared to their performance in saves and holds.
The logic is inescapable. If your team is leading by one run entering the ninth inning, and your bullpen (think pre-injury Eric Gagne) can shut down the opposition, you will win that game. It could be that the White Sox’s bullpen has been more responsible for its record in one-run games this year than its offensive strategy.
As I say, the logic is inescapable. But I’m not totally convinced. By definition, teams that win more one-run games will have more saves and holds. And there are many teams with great bullpens that have had relatively poor one-run records.
Perhaps the only analysis that can truly answer this question is a complete Win Probability analysis of all one-run games. Let me know if you’ve got some free time.
A single run scored in an inning may be more valuable than one of the runs in a two-run or three-run inning.
There may be a natural conclusion to all of this: that a single run scored in an inning is worth slightly more than a single run scored as part of a two-run or three-run inning. As I said earlier this week, the first two to five runs scored in a game are worth more than the runs that pad your total after about seven. Therefore, single-inning runs may be more valuable because they tend to occur in games with lower scores.
So if you want to make that argument, I won’t disagree.
But if you play for one run, you will only score one run.
Two runs are still worth more than one run. And, as Earl Weaver famously said, if you play for one run, you will only score one run. If you sacrifice a runner to second base, you are still giving up an out. I don’t think any of the “things” in this article suggest that there should be a much greater use of one-run strategies.
But I’m not totally against sacrifice bunts either, for many of the reasons outlined in this article. And maybe our list of one-run things can help you better identify situations in which one-run strategies are appropriate.
There will always be a lot of fog.
In Underestimating the Fog (link is to a PDF document), Bill James listed eight “strong form” conclusions that may, in fact, not be supported by the analysis. One of them was:
Winning or losing close games is luck. Teams which win more one-run games than they should one year have little tendency to do so the next year.
As we’ve already seen, James believes that “winning or losing close games is probably not all luck.” I mention this because the “fog debate” continues, and you can read it in this article. It’s an excellent discussion for those of you who enjoy that kind of thing. Thanks to BTF for posting it.
By the way, if you made it through this entire article without yawning, then you’re ready for the ultimate test.
References & Resources
You can improve the Pythagorean formula marginally by taking a different approach to the exponent in the formula. Here is Baseball Prospectus’s approach, here’s another good article by the folks at Prospectus, and here is an excellent article by US Patriot that uses his formula for an exponent: ((RS/G + RA/G) ^ 0.28). Finally, here’s a chart by Tangotiger comparing all the different approaches.
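As a rough sketch of the variable-exponent idea, the snippet below reads the formula above as total runs per game (scored plus allowed) raised to the 0.28 power, and plugs the result into the usual Pythagorean form. That reading, the function names, and the sample totals are my assumptions; see the linked article for the author’s own treatment.

```python
def variable_exponent(runs_scored, runs_allowed, games, power=0.28):
    """Exponent based on total runs per game (assumed reading of the formula)."""
    runs_per_game = (runs_scored + runs_allowed) / games
    return runs_per_game ** power


def win_pct_variable_exponent(runs_scored, runs_allowed, games, power=0.28):
    """Pythagorean winning percentage using the run-environment exponent."""
    x = variable_exponent(runs_scored, runs_allowed, games, power)
    rs = runs_scored ** x
    ra = runs_allowed ** x
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed in 162 games.
print(variable_exponent(800, 700, 162))
print(win_pct_variable_exponent(800, 700, 162))
```

The exponent rises in high-scoring environments and falls in low-scoring ones, so a given run ratio is worth slightly different winning percentages depending on the run context.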
An article in this version of Baseball By the Numbers includes a home/road adjustment for the Pythagorean formula.
Tangotiger has written a program that calculates the expected distribution of runs scored and allowed per game, and then calculates an expected winning percentage. You can download it from his site (at the bottom of the page).
All stats were courtesy of Retrosheet, the best thing on the Internet. Retrosheet’s David Smith has also written two articles about this subject (PDF files):
Do Good Teams Really Win More Close Games?
Patterns of Scoring and Relation to Winning