Consistency is Key (Part 2)

by Sal Baxamusa
December 17, 2007

Last time, I busted out the Weibull equation to look at how the number of runs scored in a game is distributed. This allowed us to check out which offenses were consistent at scoring runs and which were not, while controlling for overall team quality. I promised that we would look at how teams distribute their runs allowed, and we’ll do that.

But first, a disclaimer

I must admit that I changed the model, yet again (you can skip the next three paragraphs if you want—it’s boring mathy stuff). I had a number of people comment that it doesn’t make sense to compute the gamma parameter individually for each team, and to compute a different gamma for the runs scored distribution and runs allowed distribution. The argument basically boils down to whether the run environment is best described at a team level or at a league level. I can see the merits of both sides. On the one hand, you have the unbalanced schedule—the Dodgers played a lot of games against the poor offenses of the Padres, Diamondbacks, and Giants—and on the other hand, you have issues of sample size: why use the environment of a single team when we have data from the whole league?

It is quite easy to figure out which model makes more sense if we keep in mind that the parameter that describes run environment, gamma, is the exponent in the Pythagorean equation. We can compute gamma both ways—on a team-level and on a league-level—and see what kind of Pythagorean records we come up with. I’ll spare you the details, but if we compute gamma on a team-level (as we did last time), we end up with some wacky conclusions (like predicting last year’s Red Sox to win 120 games). It’s clear, at least to me, that the model I presented last week is plain old screwy. So, the new model is one where gamma is computed on the league-level, one for the AL and one for the NL.
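To make the role of gamma concrete, here is a minimal sketch of the Pythagorean equation with gamma as the exponent. The season totals and the gamma values are hypothetical, chosen only to show how sensitive the projected win total is to the exponent; they are not the fitted values from this model.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, gamma):
    """Expected winning percentage, with gamma as the Pythagorean exponent."""
    rs = runs_scored ** gamma
    ra = runs_allowed ** gamma
    return rs / (rs + ra)


# Hypothetical season totals and exponents, for illustration only.
runs_scored, runs_allowed = 850, 700
for gamma in (1.8, 2.0, 3.0):
    wins = 162 * pythagorean_win_pct(runs_scored, runs_allowed, gamma)
    print(f"gamma={gamma}: {wins:.1f} projected wins")
```

For a team that outscores its opponents, a larger exponent pushes the projection higher, which is how a poorly estimated team-level gamma can produce an implausible win total.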

What is presented here is the result of a model that simultaneously computes the 60 alpha parameters (one for runs scored and another for runs allowed, for all 30 teams) and the two gamma parameters (one for the NL and one for the AL) that result in the best fit of the model to one year’s worth of data. That’s a pretty tall order, but luckily for us it is a task that a computer does not mind doing!

Changing the model pretty significantly changes the results of last week’s article, too. Fortunately, the lists last week were basically “junk” lists, in the sense that they were simply a record of what happened and not an attempt to divine information about team strengths and weaknesses. So I have no problem throwing away the previous work and re-presenting it. You can view the updated lists over at THT Notes.

How teams allow runs

So with that out of the way, let’s take a look at how teams allow runs; in particular, we’ll check out the teams that didn’t quite follow their predicted runs allowed distribution.

First up, the teams that allowed between 0 and 2 runs more frequently than expected:

Allowed 0-2 runs more frequently than expected
SEA    +7.7
CHN    +7.2
NYN    +6.9
...
CLE    -7.3
WAS    -10.4
FLO    -15.0

The Marlins’ run prevention corps was quite bad last year, ranking last in the NL (by sizeable margins) in both DER and ERA. On top of that, they gave up zero, one, or two runs in a game far less often than expected, even for a team this poor at preventing runs. On the flip side, the Cubs had a very good pitching staff and team defense, and they gave up zero, one, or two runs in a game far more often than expected. That’s a good thing for the Cubs: giving up a small number of runs naturally results in more wins, so doing so more often than expected should result in more wins than expected.
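Plus/minus figures like those in the list above compare an actual count of games with the count a fitted Weibull curve predicts. The sketch below shows one way that comparison could be computed. It assumes a three-parameter Weibull shifted by beta = -0.5, so that k runs corresponds to the bin from k - 0.5 to k + 0.5; that discretization is an assumption here, and the alpha, gamma, and game counts are hypothetical rather than fitted values.

```python
import math


def weibull_cdf(x, alpha, gamma, beta=-0.5):
    """CDF of a Weibull with scale alpha, shape gamma, and shift beta."""
    if x <= beta:
        return 0.0
    return 1.0 - math.exp(-(((x - beta) / alpha) ** gamma))


def prob_runs_between(lo, hi, alpha, gamma, beta=-0.5):
    """P(lo <= runs <= hi), treating k runs as the bin [k - 0.5, k + 0.5)."""
    upper = weibull_cdf(hi + 0.5, alpha, gamma, beta)
    lower = weibull_cdf(lo - 0.5, alpha, gamma, beta)
    return upper - lower


def excess_games(actual_games, lo, hi, alpha, gamma, n_games=162):
    """Actual games in the run range minus the number the model expects."""
    expected = n_games * prob_runs_between(lo, hi, alpha, gamma)
    return actual_games - expected


# Hypothetical parameters and counts, for illustration only.
alpha, gamma = 5.4, 1.8
print(excess_games(actual_games=40, lo=0, hi=2, alpha=alpha, gamma=gamma))
print(excess_games(actual_games=25, lo=8, hi=math.inf, alpha=alpha, gamma=gamma))
print(162 * prob_runs_between(0, 0, alpha, gamma))  # expected shutouts
```

Passing `math.inf` as the upper bound covers open-ended ranges such as eight or more runs, and a range of 0 to 0 gives the expected number of shutouts.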

Of note, but not on the list, are the San Diego Padres. Last year, they shut out their opponents an incredible 20 times! A team with San Diego’s overall run prevention chops could have expected to shut out its opponents about eight times. Part of that difference was Jake Peavy, who allowed zero runs in seven starts, and part of that was an incredible bullpen that, for example, also allowed zero runs in six of those seven games.

Allowed 8+ runs more frequently than expected
NYN    +9.0
CHN    +6.1
COL    +5.5
...
DET    -6.7
TEX    -7.5
FLO    -9.0

The Mets let runs cross in large quantities in nine more games than we would have guessed based on their overall pitching and defense chops. Was it a case of Willie Randolph letting his pitchers get abused in games that got out of reach? Without going through the game logs, I am not sure. But it is an interesting thought – can bizarre run distributions on the runs allowed side of the ledger be attributed to managerial strategy?

Finally, an overall ranking of team run prevention, from most consistent to least consistent, using the same definition as last time: the number of games in which teams allowed fewer than three or more than seven runs, relative to what is expected using the Weibull model. In the list below, a positive number means fewer such games than expected. In other words, teams at the top of this list gave up between three and seven runs quite often, and teams at the bottom either shut down their opponents or got blown out a lot.
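As a rough sketch of how such a consistency number could be computed from a game log, the function below counts the "extreme" games (two or fewer runs allowed, or eight or more) and compares that count with the model's expectation. The sign is chosen so that larger values mean more consistent, which appears to match the list that follows (a team with fewer extreme games than expected gets a positive score); that reading is an inference, and the game log and probabilities are hypothetical.

```python
def consistency_score(runs_allowed_by_game, p_low, p_high, low_max=2, high_min=8):
    """Expected extreme games minus actual extreme games.

    p_low and p_high are the model's probabilities of allowing at most
    low_max runs and at least high_min runs in a single game.
    """
    n_games = len(runs_allowed_by_game)
    actual = sum(
        1 for runs in runs_allowed_by_game
        if runs <= low_max or runs >= high_min
    )
    expected = n_games * (p_low + p_high)
    return expected - actual


# Hypothetical ten-game log and model probabilities, for illustration only.
game_log = [0, 1, 3, 4, 5, 8, 9, 2, 6, 7]
print(consistency_score(game_log, p_low=0.2, p_high=0.1))
```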

Run prevention, ranked by most consistent to least consistent
FLO    23.5
WAS    15.2
TEX    13.5
KCA    10.3
ATL     6.3
DET     5.4
CIN     4.9
SFN     3.9
CHA     3.5
OAK     2.4
MIL     2.0
CLE     1.6
LAN     1.1
TOR     0.9
BAL     0.8
PIT    -0.3
PHI    -0.3
ARI    -0.7
HOU    -0.9
NYA    -2.1
COL    -3.6
TBA    -3.7
SDN    -4.1
BOS    -4.4
SLN    -4.5
MIN    -6.3
SEA    -6.8
ANA    -7.9
CHN   -14.1
NYN   -16.5

Bigger issues

I have said in the past that when a team outperforms its Pythagorean projection, there are two possible culprits:

1. The team had a strange distribution of runs allowed and runs scored.
2. The team won or lost a disproportionate number of blowouts or close games.

With this model, we can understand the importance of the first issue, run distribution. What is the relationship between actual (not theoretical) run distributions and wins?

Based on a team’s actual distribution of runs scored and runs allowed, we can make a guess as to how many games we think they would win. The simplest and most naive way to do it is to pair all possible combinations of runs scored and runs allowed, weighted by their frequency, and assign a win when the number of runs scored exceeds the number allowed. This assumes that there is no relationship between the number of runs a team scores and the number it allows in any given game. That is probably false, but it allows us to somewhat separate run distribution from the second issue, margin of victory.
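The pairing procedure just described can be sketched in a few lines. The function below crosses every observed runs-scored total with every observed runs-allowed total, weights each pair by how often its two values occurred, and credits a win when runs scored exceeds runs allowed. The text does not say how pairs with equal runs are handled; splitting them evenly (`tie_credit=0.5`) is an assumption made here, and the game logs are hypothetical.

```python
from collections import Counter


def paired_win_pct(runs_scored_by_game, runs_allowed_by_game, tie_credit=0.5):
    """Win percentage implied by independently pairing the two distributions."""
    scored = Counter(runs_scored_by_game)
    allowed = Counter(runs_allowed_by_game)
    n_pairs = len(runs_scored_by_game) * len(runs_allowed_by_game)

    wins = 0.0
    for rs, rs_count in scored.items():
        for ra, ra_count in allowed.items():
            weight = rs_count * ra_count  # how many pairings have this score
            if rs > ra:
                wins += weight
            elif rs == ra:
                wins += tie_credit * weight  # assumption: ties split evenly
    return wins / n_pairs


# Hypothetical game logs, for illustration only.
scored_log = [5, 3, 7, 2, 4, 6]
allowed_log = [4, 2, 8, 3, 1, 5]
print(162 * paired_win_pct(scored_log, allowed_log))
```

Multiplying the result by 162 gives a win total that can be compared with both the actual record and the Pythagorean projection.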

Of course, this doesn’t predict exactly how many games a team would win, but neither does the Pythagorean theorem. It turns out, however, that using the run distribution to predict the number of wins is quite good and explains something like 85% of a team’s Pythagorean differential. That is, the amount that a team over or underperforms its Pythagorean record is explained in large part by the team’s actual (not theoretical) run distribution.

So does that mean that margin of victory only accounts for 15% of a team’s Pythagorean differential? I doubt that it is that simple. Run distribution and margin of victory are probably intertwined, so if we ran a similar analysis on margin of victory, we might find that it also accounts for a large portion of the Pythagorean differential.

So, I’m sorry to say that I don’t have a definitive, quantitative answer, but suffice it to say that the way a team distributes its runs scored and runs allowed is fundamental to why teams over- or underperform their Pythagorean record. Given that, can teams control their run distributions by intelligent roster construction? I am fairly certain that the answer is no. The correlation between “consistency,” as defined here, and the typical metrics (HR/PA, BB/PA, AVG, ISO, OBP, SLG, etc.) is weak to non-existent. (There may be something to how individual talent is distributed on a team, such as a single pitcher or hitter who is much better than his teammates, but I haven’t checked that yet.)

It seems that you can’t build a team to have a distorted run distribution, and therefore you can’t build one that will beat its Pythagorean projection on the strength of a bizarre run distribution. We can’t even really explain why teams have distorted run distributions. We hate hearing it, since it doesn’t jibe with our notions of sporting competition, but this should be considered evidence that deviations from the Pythagorean projection are largely (mostly, even) just (gulp!) luck.

References & Resources
For the updated run distribution plots and reports, click here.
