Familiarity favors the offense. This axiom has been sabermetrically supported and widely accepted over the last several years. Batters do better each time they see a pitcher during a game, beyond what we would expect to see due to a tiring arm. This rule now influences how a lot of people view the strategic maneuverings of the middle and end-games, and has inspired some radical thought on the usage of pitchers.

This principle, if sound, should also operate on a higher level. If you play against a team and its pitchers often enough, your offensive output per game ought to begin rising. The effect of seeing all the tricks that a pitcher has to offer should accumulate over a season or longer more surely than it does across a single game.

So does it? Do teams put up bigger numbers when playing teams that they see a lot? Given the scheduling patterns of the major leagues, the question can be posed a different way: Do teams put up bigger numbers when playing divisional rivals, and smaller numbers against interleague foes?

### In the divisions

I initially took as my period of study the years 1998 through 2012, the era of the unbalanced leagues. I wanted to avoid the complications of the Milwaukee Brewers moving from the American to the National League. My cutoff, though, would give me a chance to see how long it took Milwaukee to become “familiar” to the rest of the NL Central, and have its runs per game come into line with its new neighbors.

This reckoned without one important factor, however: the balanced schedule in place from 1998 to 2000. Today, and for more than a decade previous, teams play 18 or 19 games a year against teams in their own division. From 1998 to 2000, with MLB feeling its way through a new situation, the numbers were much lower. NL teams played between 11 and 13 games against division rivals, and generally nine against other NL squads. (There were even a few instances where a team played more often against an extra-divisional team than against some of its divisional foes.)

This isn’t really enough of a gap to say that teams are more familiar with intra-division opponents, so I ended up using 2001 through 2012 for most of the study. I did look at the Brewers’ numbers in those three years, though, and they seemed to fit into their division pretty fast. Milwaukee’s divisional games in 1998 had 9.86 runs scored per game, against 9.38 for all their games. (The division tended toward pitcher-friendly parks, but not nearly enough to account for the gap.) The next two years before the unbalanced schedule came were, taken together, about even. If there was any acclimation period, it didn’t show.

In the 2001 to 2012 set, I had to be mindful of a further confounding factor: the designated hitter. The American and National Leagues play in different run environments primarily due to the presence of the DH in the AL and its absence in the NL. I thus had to examine the leagues separately. Also, I had to filter out interleague games, because playing a number of road games in another league’s park each year pushes AL run numbers down, and NL numbers up, from where they would be in league games.

This accounted for, I could proceed. I tabulated games played and runs scored and allowed for every team in every year: overall, in division, and interleague. I then compared the three groups of games (division, non-division league, and interleague) against each other year by year. The yearly results for the American League, intra-division games measured against other league games, look like this.

Even without the big spike in 2010, the numbers back up the hypothesis. For the full 12 years, intra-divisional AL games scored 0.145 runs more than games out of the division but still in the league. Exclude 2010, and it still comes to a 0.07 run-per-game uptick—but I am inclined to let 2010 count. (If the outlier had been a downward spike, I wouldn’t spare myself, and you wouldn’t either.)
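The comparison behind those figures can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the spreadsheet actually used for the study: the function names, the record layout, and the sample games are all hypothetical, and the sample numbers are made up.

```python
from collections import defaultdict

def runs_per_game(games, exclude_years=()):
    """Return {category: runs per game} from (year, category, total_runs) records."""
    totals = defaultdict(lambda: [0, 0])  # category -> [games, runs]
    for year, category, total_runs in games:
        if year in exclude_years:
            continue  # lets us drop an outlier season such as 2010
        totals[category][0] += 1
        totals[category][1] += total_runs
    return {cat: runs / n for cat, (n, runs) in totals.items()}

def division_gap(games, exclude_years=()):
    """Runs per game in division games minus other in-league games."""
    rates = runs_per_game(games, exclude_years)
    return rates["division"] - rates["league"]

# Made-up records for illustration only; these are not real results.
sample_games = [
    (2009, "division", 9), (2009, "league", 9),
    (2010, "division", 12), (2010, "league", 8),
    (2009, "interleague", 7),
]

full_gap = division_gap(sample_games)
gap_without_2010 = division_gap(sample_games, exclude_years={2010})
```

Interleague games are carried in the records but left out of the gap, which mirrors the filtering described earlier.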

Our confidence in the pro-offensive effect of opponent familiarity thus bolstered, we can now look to the National League for further confirmation. Because the league is larger, each team ends up playing fewer games against any given non-division league foe, which makes the gap in games played between those opponents and division rivals wider. If anything, we should expect an even larger run-scoring gap.

Instead, we get that. The 12-year average has NL teams scoring 0.027 runs **fewer** in divisional games than against the rest of the league. Familiarity is now retarding offense instead of boosting it.

Had this all come from the low 2001 figures, I could have rationalized it away: It could plausibly take a year for the increased frequency of intra-divisional play to take full effect. For several years, it looked like that’s what had happened, but that apparent effect wore off, or never really existed in the first place. The 11 other years do combine for a scoring rate slightly higher than for non-divisional games (about 0.01 runs), but I just got through saying I wouldn’t spare myself a downward spike, and so I will not.

Is there some difference between the leagues that would explain the discrepancy? I am obligatorily drawn toward the designated hitter, or lack thereof. One expects NL teams to pull their starting pitchers earlier, to put in pinch-hitters when the pitcher is due to bat. This does happen, but less than one might expect. For the 12 years covered, AL starts were longer than NL starts only by about a fifteenth of an inning per game, a fifth of an out. That’s not anywhere near enough compounded familiarity with pitchers to explain the AL’s lead.

Longer stints by relief pitchers don’t add much, either. I counted the frequency of relief appearances lasting at least 10 batters, where hitters start seeing the relievers a second time and thus gain the edge we’ve been talking about. Over the last 12 years it’s happened about 31 times a year per team in the AL, and 22 2/3 times a year per team in the NL. That may add a bit to the AL’s advantages in pitcher familiarity, but again not nearly enough to explain the wide gap.

Aside from those discredited possibilities, I am stumped as to what could cause the spread in the numbers. I could declare “variance” and leave it at that, but this is a lot of games across which that variance is holding up: 5,807 divisional games in the AL, 7,371 in the NL.

For the moment, I will leave that conundrum and focus my attention on the other question I posed myself.

### Out of their leagues

If we expect frequently renewed rivalries to produce higher scores, it makes sense that games between teams that seldom face each other ought to go in the other direction. There’s a category that fits that description nicely: interleague games. Over the period being studied, interleague opponents faced each other three times a year, or sometimes six in rivalry match-ups such as Yankees-Mets and White Sox-Cubs. That’s below the numbers for in-league opponents, in or out of the division. So do these games score fewer runs?

I could not do this by league, because of the effects of the designated hitter, for reasons laid out above. I had to take an average of runs per game in the separate leagues and work from there. Mind you, I don’t mean the major league average of runs per game. With unbalanced sizes, the NL played more games than the AL and thus skewed the average its way. (For a perfect example: the 2000 AL scored 5.30 runs a game, and the 2000 NL scored 5.00, but the major league average was 5.14.) I took AL runs per game and NL runs per game and averaged them, to get the proper run environment for a game featuring one AL and one NL team.
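To see why the two averages differ, here is a minimal sketch using the 2000 figures quoted above. It assumes the major league average is effectively weighted by league size (14 AL teams and 16 NL teams that year), with team counts standing in for games played; the function names are hypothetical.

```python
def interleague_baseline(al_rpg, nl_rpg):
    """Plain mean of the two league rates: one AL team, one NL team."""
    return (al_rpg + nl_rpg) / 2

def size_weighted_average(al_rpg, nl_rpg, al_teams=14, nl_teams=16):
    """Average weighted by league size, an assumed proxy for games played."""
    total_teams = al_teams + nl_teams
    return (al_rpg * al_teams + nl_rpg * nl_teams) / total_teams

baseline_2000 = interleague_baseline(5.30, 5.00)   # about 5.15
mlb_2000 = size_weighted_average(5.30, 5.00)       # about 5.14
```

The plain mean lands a hundredth of a run above the size-weighted figure, because the larger NL pulls the major league number toward its own lower rate.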

The run totals for interleague games above or below that baseline are in the chart below. Relatively stable numbers of interleague games allowed me to include the years 1998 to 2000. Runs per game are counted on the chart for individual teams, not for the two together.

For the 15 years surveyed, covering 3,246 interleague games, there is an overall run-suppressive effect, if a slight and inconstant one. The games average 0.02 runs per team per game lower than the adjusted major league average. (0.0197, if you like your decimal places.) That’s 0.04 total runs per game, to put it on the same scale as the previous calculations. This falls roughly between the AL and NL results, though closer to the lower NL figure.

### Conclusion

Familiarity favors the offense, at least if your sample is large enough. I asked whether the offensive advantage of seeing a pitcher more often in a game carried over on the scale of a full season or series of seasons. It did for AL intra-divisionals and in the counter-example of interleague games, but was mildly reversed for NL intra-divisionals. The bigger margin in the AL games means that the intra-divisionals average out to a net offensive gain of roughly 0.05 runs per game.
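One way to arrive at that combined figure is to weight each league's divisional gap by its number of divisional games. The sketch below assumes that is how the net number should be formed, which the figures above are consistent with; the function name is hypothetical, and the inputs are the gaps and game counts reported earlier.

```python
def games_weighted_gap(results):
    """Combine (gap_in_runs_per_game, games) pairs into one weighted gap."""
    total_games = sum(games for _, games in results)
    return sum(gap * games for gap, games in results) / total_games

divisional_results = [
    (0.145, 5807),    # AL: divisional games scored more
    (-0.027, 7371),   # NL: divisional games scored slightly less
]

net_gap = games_weighted_gap(divisional_results)  # a little under 0.05
```

The NL's larger game count drags the net down more than a simple average of the two gaps would.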

This is scarcely a dramatic difference, but it does exist. Naturally, familiarity with a division opponent’s pitching staff is an evanescent thing. Pitchers will leave their teams for various reasons, or occasionally alter their repertoires enough that they effectively become different pitchers and batters have to figure them out all over again. This may be why the effect is as small as I have measured it, less than one percent of the average runs scored in a game (taking the AL and NL together).

So the next time you see the Red Sox and Yankees, or the Reds and Cubs, piling up runs in one of their games, the 19 games they play against each other every year may have a little to do with it. But only a little.

**References & Resources**

The thesis on batters gaining the advantage with each appearance against a pitcher was laid out most notably in *The Book* by Tom Tango, Mitchel Lichtman, and Andrew Dolphin. Baseball-Reference supplied the statistics. Lots and lots of statistics.

Rabbit Maranville said...

I’m sure there have been previous studies that attempt to find the difference in runs scored per game dependent on the weather/month. I have not seen any so I cannot reference them for this hypothesis, but there definitely seems to be higher offensive output in the summer.

Could this, if true, confound the results of the interleague findings, given that during the sample they were all played in the warmer months? One would expect a higher differential than .04 runs per game for interleague games, given the even greater lack of familiarity in those games. Perhaps this is the reason why that number is so low?

Shane Tourtellotte said...

Rabbit, you make a very good point. I find myself staring into the abyss of correcting for the months of interleague games, and that abyss is staring right back into me.

Or I could just let this, the first year of schedule-long interleague play, finish up and give us some numbers and start working from there.