Feast or Famine
by Sal Baxamusa
November 13, 2006
Do you ever get the feeling that your favorite team has a feast-or-famine offense? Some nights they can't score at all and other nights they tee off for eight runs. It can be frustrating to watch the boys lose 3-0 and then win 8-3, thinking that a more consistent offense would have netted a pair of 4-3 victories. Were there any offenses in 2006 that had a "feast-or-famine" type of output? Are there any readers who cannot see through my shameless attempt to have fun with run distributions?
Here are a few offenses that feasted:
Frequency of scoring > 6 runs
CHA   34.0%
NYN   34.0%
NYA   32.7%
MIN   32.1%
And a few that, uh, famined (is this even a real word?):
Frequency of scoring < 3 runs
CHN   32.7%
PIT   32.7%
SDN   30.9%
TBA   30.3%
This isn't all that interesting. The Yankees, White Sox, and Mets all had prolific offenses, and the Cubs, Pirates, Padres, and Devil Rays all had very weak offenses. Their appearances on these lists are not surprising and certainly not informative. Minnesota is an interesting entry on the first list; despite middle-of-the-pack run production, they did excel at scoring seven or more runs in a game.
Now let's see if we can find some feast-or-famine offenses. To do this, we'll normalize the frequencies to MLB average and look for teams that are at or above league average for both scoring fewer than three runs and scoring more than six runs. Only three teams meet these criteria:
      <3 runs        >6 runs
HOU   27.8% (1.10)   27.2% (1.01)
CIN   26.5% (1.05)   27.3% (1.01)
MIN   25.3% (1.00)   32.1% (1.19)
The numbers in parentheses are the frequencies normalized to the MLB average frequency.
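The normalization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the three-team toy league are made up, and the "MLB average" is taken to be the frequency pooled over every game in the league, which may not be exactly how the figures above were computed.

```python
def scoring_frequencies(runs_by_game, low=3, high=6):
    """Return (share of games with < low runs, share with > high runs)."""
    games = len(runs_by_game)
    famine = sum(1 for r in runs_by_game if r < low) / games
    feast = sum(1 for r in runs_by_game if r > high) / games
    return famine, feast


def feast_or_famine_teams(league):
    """Return teams at or above league average on BOTH famine and feast.

    `league` maps a team code to its list of runs scored per game.
    Assumption: league average is the frequency pooled over all games.
    """
    all_runs = [r for runs in league.values() for r in runs]
    avg_famine, avg_feast = scoring_frequencies(all_runs)
    result = {}
    for team, runs in league.items():
        famine, feast = scoring_frequencies(runs)
        ratios = (famine / avg_famine, feast / avg_feast)
        if ratios[0] >= 1.0 and ratios[1] >= 1.0:
            result[team] = ratios
    return result


# Hypothetical toy data, not real 2006 game logs.
toy_league = {
    "AAA": [0, 1, 8, 9, 4, 2, 7, 10],
    "BBB": [4, 5, 4, 3, 5, 4, 6, 5],
    "CCC": [3, 4, 7, 2, 5, 6, 3, 4],
}
print(feast_or_famine_teams(toy_league))
```

In the toy league only the made-up team AAA clears both bars, because it piles up both low-scoring and high-scoring games while the other two cluster in the middle.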
The evidence that any team last year had a feast or famine type of offense is not exactly compelling. Still, Houston's offense was well below average, so isn't it impressive that they had a nearly average frequency of scoring seven or more runs?
To find out, let's try a different tack. Instead of comparing a team like Houston to the MLB average, perhaps we should compare them to what one would expect from a team of a similar offensive prowess. One way to do this is to model the runs scored distribution with a three-parameter Weibull distribution:
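For reference, the textbook density of a three-parameter Weibull distribution is written out below. The symbol names α (scale), β (location), and γ (shape) are an assumption here, following common usage:

```latex
f(x;\alpha,\beta,\gamma) =
\begin{cases}
\dfrac{\gamma}{\alpha}\left(\dfrac{x-\beta}{\alpha}\right)^{\gamma-1}
\exp\!\left[-\left(\dfrac{x-\beta}{\alpha}\right)^{\gamma}\right], & x \ge \beta \\[2ex]
0, & x < \beta
\end{cases}
```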
The Weibull model is based on a paper by Dr. Steven Miller and is described in this paper from his website. Fitting the curve to the data can be a little tricky, but I have described a mathematically non-rigorous shortcut previously at Beyond the Boxscore. A quick explanation is that the curve is fit to a team's average runs per game, so the model will look different for a team that averages 4.5 runs per game than for one that averages 4.8 runs per game. (Most interestingly, the Weibull distribution can be used to derive the famed Pythagorean projection using runs scored and runs allowed; a good way to interpret the model is that if all the data points lined up exactly on the curve then the team would have the maximum likelihood of matching its Pythagorean projection.)
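One way to turn an average runs-per-game figure into a predicted run distribution is sketched below. It is not necessarily the shortcut referred to above. The assumptions are mine: the location is fixed at β = -0.5 so that each integer run total sits in the middle of a unit-wide bin, the shape γ = 1.8 is just an illustrative value, and the scale α is chosen so that the Weibull mean, αΓ(1 + 1/γ) + β, equals the team's average. All function names are hypothetical.

```python
import math


def weibull_cdf(x, alpha, beta, gamma):
    """CDF of a three-parameter Weibull (scale alpha, location beta, shape gamma)."""
    if x <= beta:
        return 0.0
    return 1.0 - math.exp(-(((x - beta) / alpha) ** gamma))


def scale_from_mean(mean_runs, beta=-0.5, gamma=1.8):
    """Pick alpha so the Weibull mean equals the team's runs per game."""
    return (mean_runs - beta) / math.gamma(1.0 + 1.0 / gamma)


def run_probabilities(mean_runs, max_runs=30, beta=-0.5, gamma=1.8):
    """Model probability of scoring exactly k runs, for k = 0..max_runs.

    Each k is assigned the Weibull mass on the bin [k - 0.5, k + 0.5).
    """
    alpha = scale_from_mean(mean_runs, beta, gamma)
    probs = []
    for k in range(max_runs + 1):
        lo = weibull_cdf(k - 0.5, alpha, beta, gamma)
        hi = weibull_cdf(k + 0.5, alpha, beta, gamma)
        probs.append(hi - lo)
    return probs


# Two hypothetical offenses: the curve shifts as the average changes.
for mean in (4.5, 4.8):
    probs = run_probabilities(mean)
    print(mean, "P(<3) =", round(sum(probs[:3]), 3), "P(>6) =", round(sum(probs[7:]), 3))
```

Fixing β and γ and solving only for α keeps the fit to a single line of algebra, at the cost of ignoring whatever a full least-squares or maximum-likelihood fit would say about the shape.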
Using this equation to model runs scored per game, we can see whether a team overperformed or underperformed at a particular level of run scoring. For example, check out the runs scored distribution and the Weibull model for the Cleveland Indians:
The open circles represent data points from 2006 and the solid line represents the model prediction. Notice how the model overpredicts the frequency of scoring 7-10 runs but underpredicts the frequency of scoring four and six runs. To see if a team is feasting, we can compare the data points at high values of runs scored to the model prediction. If the model underpredicts there, you can consider that team to be feasting more than you might otherwise have expected for an offense of that overall production. You can do the same thing for low values of runs scored to see who famishes more than you might expect.
Teams that feasted based on Weibull model:
Most exceeded projection for >6 runs
HOU   7.1
MIN   7.0
NYN   6.1
PIT   5.6
(The values given here are games, so, for example, the Twins scored more than six runs about seven more times than you would expect from the Weibull model.)
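The comparison behind these "games above projection" numbers can be sketched as follows. The function name, the toy game log, and the toy model probabilities are all hypothetical; `model_probs[k]` is assumed to hold the model's probability of scoring exactly k runs, from whatever fitted curve is being used.

```python
def excess_games(runs_by_game, model_probs, low=3, high=6):
    """Observed minus expected game counts at the two ends of the distribution.

    Returns (famine_excess, feast_excess), both measured in games.
    model_probs should extend far enough that the tail beyond it is negligible.
    """
    games = len(runs_by_game)
    observed_famine = sum(1 for r in runs_by_game if r < low)
    observed_feast = sum(1 for r in runs_by_game if r > high)
    expected_famine = games * sum(model_probs[:low])
    expected_feast = games * sum(model_probs[high + 1:])
    return observed_famine - expected_famine, observed_feast - expected_feast


# Hypothetical 10-game log and made-up model probabilities for 0..9 runs.
toy_runs = [0, 1, 2, 2, 8, 9, 7, 4, 5, 3]
toy_model = [0.10, 0.10, 0.10, 0.15, 0.15, 0.10, 0.10, 0.10, 0.05, 0.05]

famine_excess, feast_excess = excess_games(toy_runs, toy_model)
print(famine_excess, feast_excess, famine_excess + feast_excess)
```

A positive value means the team landed in that range more often than the model expected, and adding the two values gives a single combined feast-or-famine figure.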
Teams that famined (that can't possibly be a word, can it?) based on Weibull model:
Most exceeded projection for <3 runs
CHN   7.8
COL   7.4
SDN   6.1
PIT   5.5
Notice that Houston appears at the top of the feasting list but doesn't appear on the famine list. Going by our first methodology, frequency normalized to league average, the situation was reversed. Even more interesting, however, is that the Pirates appear on both lists. Take a look at their runs scored distribution:
It looks like the Pirates were indeed an offense in 2006 that scored 3-6 runs less often than predicted and scored toward the high and low ends more often than predicted. Other offenses that might be categorized as feast or famine in 2006:
      >6 runs   <3 runs   combined
CHN   3.5       7.8       11.3
PIT   5.6       5.5       11.1
SDN   3.6       6.1       9.8
MIN   7.0       2.6       9.6
I should probably point out that I'm not doing any tests for statistical significance, and at this point I would classify these types of numbers as toys. While I haven't checked, I'd be hard pressed to believe that the phenomenon being explored here is anything other than random variation. For example, I don't think that you could construct a roster that would be prone to being shut down any more or less than you might otherwise guess based on overall hitting ability. I also can't imagine a manager whose strategy is so far from conventional as to skew his team's run scoring distribution.
But whether caused by random variation or not, Cubs, Pirates, Padres, and Twins fans who complained about a "lack of offensive consistency" probably weren't imagining things.
And since I know you've been waiting with bated breath, you can download 2006 run distribution plots for all the teams as a zip file by clicking here.
Sal Baxamusa is a graduate student in chemical engineering. He can be reached here.