Thanks to the extremely hard work of the contributors at this site, The Hardball Times keeps on getting better. No doubt by now you have flicked through our latest offering: The 2007 Preseason Annual. There is no excuse not to own a copy; at least I have yet to hear a good one (suggestions welcome by email). Unlike the THT Annual, the preseason book is published as a PDF, so it is literally a couple of clicks away.
Not only do you get access to some great essays on forecasting systems and prospects, you also get the skinny on every team. Wait, there’s more. You're also given detailed three-year projections for every major league player as well as a cadre of minor leaguers.
These projections are useful for a number of reasons. First, as many of you will know, they are a great tool on which to base fantasy league decisions. Our projections come into their own if you are in a keeper league and need to know whether to invest in, say, Zito or Kazmir. Second, they allow us to present other interesting analyses. One such analysis is to compute projected standings for each division. That is what I want to talk about today.
If you want to skip all the gory technical details then I suggest that you press page down a couple of times and go to the next section. For the geeky among you let me give you a rundown of how I turn the individual player projections into division standings.
The projections give Wins Above Replacement (WAR) for all batters and pitchers. The obvious approach is to add up the WAR for each team, factor in a suitable replacement level and, bingo, we have our standings.
Easy, tiger. Unfortunately this is inaccurate for a number of reasons; I'll list five. First, the playing time analysis hasn’t been created using depth charts. In English, that means that if you add up the plate appearances (or innings pitched) for all players on a team, the total will be 20-30% higher than a team's actual season-end total.
Second, minor leaguers are included. While some may break through, the majority are unlikely to do so and should (mostly) be excluded.
Third, the projections don’t necessarily take into account the correct rosters. For instance, we still have Clemens pitching for Houston and Liriano notching up 134 innings in a Minnesotan jersey! Neither is likely to happen in 2007.
Fourth, in some instances the classification of position is incorrect. The most obvious example is Papelbon, who is projected to rack up 85 innings as a closer but in reality could go 150+ as a starter.
And fifth, the win distribution is tighter than that predicted by the binomial (i.e., more teams are clustered closer to the 81-win mark than we’d expect). I am told by David Gassko that this is usual when doing an exercise like this and in any case the effect is small.
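For a sense of scale, here is a minimal sketch of the spread that luck alone produces, assuming every game is an independent coin flip with a fixed win probability. The function name is mine, and the article doesn't spell out exactly which binomial benchmark was used, so treat this as illustrative only.

```python
import math


def binomial_win_sd(games=162, win_prob=0.5):
    """Standard deviation of season wins if each game is an independent
    trial with the same win probability (a simplifying assumption)."""
    return math.sqrt(games * win_prob * (1 - win_prob))


# A true .500 team over 162 games: sqrt(162 * 0.5 * 0.5) = sqrt(40.5)
print(round(binomial_win_sd(), 2))  # 6.36
```

In other words, even a league of identical .500 teams would be expected to scatter by roughly six wins either side of 81.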
I won’t bore you with the nuts and bolts of what I did except to say that I adjusted for all of the factors above and more. Actually I lied, I will bore you a bit. For pitchers I identified the guaranteed starters and relievers and attributed all their projected innings to the team. I then assigned the remaining innings (mostly) randomly among the motley crew of other relievers. I assumed 1,440 IP per team.
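The pitching step can be sketched roughly as follows. This is only an approximation of the idea: the function name, the pitcher labels and the shuffle-then-fill rule are my own assumptions standing in for the "(mostly) random" assignment, and only the 1,440 IP team total comes from the text.

```python
import random

TEAM_INNINGS = 1440  # per-team total assumed in the article


def allocate_innings(guaranteed, others, team_innings=TEAM_INNINGS, seed=0):
    """Give guaranteed pitchers all their projected innings, then hand the
    leftover innings to the remaining relievers in a random order.

    guaranteed, others: dicts mapping pitcher name -> projected IP.
    Returns a dict mapping pitcher name -> allocated IP.
    """
    allocation = dict(guaranteed)
    remaining = team_innings - sum(guaranteed.values())

    # Assumption: a seeded shuffle stands in for the "(mostly) random" pick.
    pool = list(others.items())
    random.Random(seed).shuffle(pool)

    for name, projected_ip in pool:
        if remaining <= 0:
            break
        innings = min(projected_ip, remaining)
        allocation[name] = innings
        remaining -= innings
    return allocation


# Hypothetical staff: five starters and two late-inning men are locked in.
locked_in = {"SP1": 210, "SP2": 195, "SP3": 180, "SP4": 170, "SP5": 150,
             "Closer": 70, "Setup": 75}
the_rest = {"RP1": 120, "RP2": 110, "RP3": 100, "RP4": 90, "RP5": 80}
staff = allocate_innings(locked_in, the_rest)
print(sum(staff.values()))  # 1440
```

If the leftover relievers cannot cover the shortfall, the total simply comes up short of 1,440; a fuller version would need a rule for that case.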
For position players it was a little easier as there was less competition for places. In most instances there were two players competing for a spot; in those cases I allocated the regular starter all of his projected plate appearances and then filled the remainder with the most appropriate bench player up to 685 PA (the 2006 MLB average). For a utility player I would spread his plate appearances around by position.
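Here is a small sketch of that fill-the-spot rule for a single position. The 685 PA cap is from the text; the function name, the example players and the idea of prorating WAR in line with playing time are my assumptions.

```python
SLOT_PA = 685  # plate appearances per spot, per the article


def fill_position(starter, bench, slot_pa=SLOT_PA):
    """Give the starter his full projection, then top up with the bench player.

    starter, bench: tuples of (name, projected_pa, projected_war).
    Returns a list of (name, allocated_pa, prorated_war).
    """
    result = []
    remaining = slot_pa
    for name, pa, war in (starter, bench):
        used = min(pa, remaining)
        # Assumption: WAR scales linearly with playing time.
        prorated_war = war * used / pa if pa else 0.0
        result.append((name, used, prorated_war))
        remaining -= used
    return result


# Hypothetical platoon: the regular takes 560 PA, the bench man the other 125.
print(fill_position(("Regular", 560, 3.5), ("Bench", 250, 0.5)))
```

A utility player would need the same treatment repeated across each position he covers, which this sketch does not attempt.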
Okay, that gives us the number of batting and pitching WAR. Factor in a replacement level of 50 wins (this is the number that makes the average MLB team win 81 games); adjust the distribution a little so it agrees with the expected binomial distribution (it is debatable whether this is necessary but I unilaterally decided it was); and voila ... we have our results.
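In code, the conversion might look something like the sketch below. The 50-win replacement level and the 81-win average are from the text; the article does not say how the distribution was adjusted, so the linear stretch around 81 wins, the function names and the `target_sd` parameter are all assumptions of mine.

```python
import statistics

REPLACEMENT_WINS = 50     # replacement level from the article
LEAGUE_AVERAGE_WINS = 81  # average team in a 162-game season


def projected_wins(batting_war, pitching_war,
                   replacement_wins=REPLACEMENT_WINS):
    """Team wins = replacement-level wins plus total WAR."""
    return replacement_wins + batting_war + pitching_war


def stretch_spread(wins, target_sd, center=LEAGUE_AVERAGE_WINS):
    """Rescale win totals around the center so their (population)
    standard deviation equals target_sd. Assumed method, not the author's."""
    current_sd = statistics.pstdev(wins)
    if current_sd == 0:
        return list(wins)
    factor = target_sd / current_sd
    return [center + (w - center) * factor for w in wins]


# An average team carries 31 WAR: 50 + 20 + 11 = 81 wins.
print(projected_wins(20, 11))  # 81
```

Stretching around 81 keeps an average team average while pushing the good teams up and the bad teams down by the same proportion.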
I also calculate the percentage chance of each team winning its division. This computation deserves (and will get) an article in its own right but is easy enough to do with a bit of recursive probability.
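Until that article arrives, here is one possible way to get division odds from projected win totals. It is not the recursive method mentioned above, just an exact calculation under simplifying assumptions of my own: each team's season is an independent binomial draw around its projected wins, ties for first are split evenly, and all names are hypothetical.

```python
from math import comb

GAMES = 162


def win_pmf(expected_wins, games=GAMES):
    """P(team wins exactly w games) for w = 0..games, binomial assumption."""
    p = expected_wins / games
    return [comb(games, w) * p**w * (1 - p)**(games - w)
            for w in range(games + 1)]


def division_odds(expected_wins, games=GAMES):
    """Chance that each team finishes first, splitting ties evenly."""
    pmfs = [win_pmf(w, games) for w in expected_wins]

    # below[j][w] = P(team j wins fewer than w games)
    below = []
    for pmf in pmfs:
        running, cdf = 0.0, []
        for prob in pmf:
            cdf.append(running)
            running += prob
        below.append(cdf)

    odds = []
    for i in range(len(pmfs)):
        total = 0.0
        for w in range(games + 1):
            # ties[k] = P(no rival beats w and exactly k rivals match it)
            ties = [1.0]
            for j in range(len(pmfs)):
                if j == i:
                    continue
                nxt = [0.0] * (len(ties) + 1)
                for k, prob in enumerate(ties):
                    nxt[k] += prob * below[j][w]
                    nxt[k + 1] += prob * pmfs[j][w]
                ties = nxt
            share = sum(prob / (k + 1) for k, prob in enumerate(ties))
            total += pmfs[i][w] * share
        odds.append(total)
    return odds


# Example: five hypothetical win totals for one division.
print([round(x, 2) for x in division_odds([85, 82, 81, 80, 70])])
```

Because teams in a division play each other and projections carry their own error, the independence assumption is a rough one, so these figures need not match the percentages quoted in this article.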
Here are the standings for the Senior Circuit. Rather than proffer reams of commentary I’ll focus on the surprises and try to explain them:
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| New York | 85 | 77 | 36% |
| Atlanta | 82 | 80 | 24% |
| Philadelphia | 81 | 81 | 20% |
| Florida | 80 | 82 | 17% |
| Washington | 70 | 92 | 3% |
Most think that the Mets already have the East sewn up, but our analysis suggests that is a facile thought. The Phillies and Braves should also contend, while the Marlins once again jockey on their perennial dark horse. Natspos fans, on the other hand, have much less to look forward to—2008 perhaps.
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| St Louis | 85 | 77 | 37% |
| Chicago | 84 | 78 | 32% |
| Milwaukee | 78 | 84 | 13% |
| Houston | 75 | 87 | 8% |
| Cincinnati | 73 | 89 | 5% |
| Pittsburgh | 72 | 90 | 4% |
Our data predict a strong bounce for the Cubs, which is not surprising after an offseason splurge that made Wall Street bonuses seem inconsequential. The Cardinals will be in the mix too with an offense anchored by Pujols who is projected to contribute nearly 7 WAR! It should be close. Both teams are expected to make the postseason albeit one via the wild card.
Many pundits expect the Brew Crew to contend but it looks like Selig's old team may need to wait until 2008 for their chance—they have only a 13% chance of making it as division champs.
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| Arizona | 86 | 76 | 34% |
| San Francisco | 83 | 79 | 22% |
| Los Angeles | 82 | 80 | 19% |
| San Diego | 80 | 82 | 14% |
| Colorado | 78 | 84 | 10% |
Two years ago the NL West was the laughingstock of baseball as the Padres won the division despite having a losing record with only a few games to go; last year we saw the opposite with all teams over .500 at one stage. Our projections suggest more of the same is in order for 2007 (only one team will win fewer than 80 games) but, perhaps surprisingly, the Diamondbacks are expected to finish atop the pile.
Why? Well, it is largely down to the starting duo of Webb and the Big Unit clocking 10 pitching WAR between them. The Giants should come second anchored by a productive Barry Bonds. Many experts have picked the Dodgers to run away with the division but they lack the pitching quality of the Diamondbacks and the hitting depth of the Giants.
Let’s rinse and repeat with the Junior Circuit.
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| New York | 95 | 67 | 53% |
| Boston | 91 | 71 | 32% |
| Baltimore | 82 | 80 | 8% |
| Toronto | 81 | 81 | 6% |
| Tampa Bay | 68 | 94 | 0% |
This one promises to be quite a ding-dong between the Yankees and Sox. Like most people, we expect the Yankees to have the upper hand, but Boston should make the playoffs too. Although it may be hard to believe, our data suggest that Baltimore will finish ahead of the Blue Jays. That's because our numbers show that Tejada and Roberts will anchor the Orioles offense to 20 WAR, taking some of the heat away from an abjectly poor pitching staff. John Brattain will not be a happy chappy.
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| Minnesota | 89 | 73 | 43% |
| Cleveland | 83 | 79 | 19% |
| Chicago | 83 | 79 | 19% |
| Detroit | 83 | 79 | 19% |
| Kansas City | 67 | 95 | 1% |
Many say the AL Central is the toughest division in baseball. Our stats disagree and have the Twins romping away with the division. The challenging triumvirate are all expected to finish with just 83 wins. While this might not be too surprising for the Tigers and White Sox, many peg the Indians as World Series contenders. We think the Tribe will fall short because they won't get a whole heap of production from their pitching staff. After Sabathia, Sowers, Lee and Westbrook there isn’t too much to get excited about on the mound.
| Team | Win | Loss | % Chance of winning |
|---|---|---|---|
| Los Angeles | 86 | 76 | 36% |
| Oakland | 85 | 77 | 32% |
| Seattle | 82 | 80 | 21% |
| Texas | 78 | 84 | 11% |
The AL West will once again be a tight division and all teams are in with a chance. Even though the Athletics haven't properly replaced Frank Thomas and Barry Zito, they still have an excellent chance as the Angels (in particular) have failed to capitalize. Of more interest is our belief that the Mariners will break the .500 barrier and finish third. This is largely due to a generous offense, with Adrian Beltre, Richie Sexson and Ichiro Suzuki accounting for 10 of 20 (park-adjusted) offensive WAR.
I’ll keep this one short, but when one spends an unhealthy amount of time hunkered in front of Excel all manner of crazy things pop out. Here are a few:
- The American League really does seem to have an advantage. In total we project the average AL team to win one more game than the average NL team. (In other words, an AL team is 10 runs better than an NL team.)
- Contrary to popular belief the AL East (not the Central) is the strongest division—the Wild Card is currently Boston’s to lose.
- The NL Central is the weakest division, underpinned by four teams with a losing record. Contrast that with the AL East, where only one team has a losing record.
Over the course of the season we’ll continue to follow how the odds of each team winning its division change. Also, in the next month or two we'll tackle a few related topics. One is our approach to calculating the odds of winning; two is how our projected standings compare to prediction markets; and three is the accuracy of these projections.
I'm keen to hear your thoughts and reactions to our projections. Do you think we are spot on or should we check ourselves into the nearest asylum? Drop me an email and let me know.
References & Resources
It goes without saying that none of this would have been possible were it not for the many hours of hard work that David and Chris have put into developing the projections. I'd also like to thank David for the time he spent commenting on the methodology that I used to pull together the projected standings. Take a holiday, guys; you deserve it.