My most recent installment on alternate baseball history mentioned in passing some umpiring peculiarities from the decade of the 1900s. Not only were some games assigned two umpires while others got just one, but there were apparent patterns to which teams got the fuller complement of umpires. Good teams drew two umpires more often than bad ones, and New York City was ahead of the other cities in baseball.
I had no time then to delve into this subject: I was using it to demonstrate that a specific game would have had two umpires had the National League not fired one mere days before. Yet I was intrigued by what I learned from this snapshot of the era, and decided to look deeper. Not only was my curiosity engaged, but I also wanted to confirm—or if necessary disprove—the observations I made from a single season’s games in one league.
That’s my subject today, a look at a transitional phase in baseball history. I’ll preface it with some background on the advances and declines of two-umpire crews, to show just how long that seemingly modest step took to establish itself.
Two umps forward, one ump back
From our 21st-century perspective, it’s tough to imagine how major-league games could ever have functioned with just one umpire. Even two seems like negligent understaffing, though those of us in minor-league towns have seen how that is made to work. In the early decades of baseball, of course, games did function with one umpire, if not always smoothly. Having more seemed an extravagance to leagues tottering on the brink of insolvency, a category that for its first decade included the National League itself.
The first two-umpire games in the major leagues came in the NL in 1888, just a sprinkling. The number rose in 1889, and the American Association, the other major league of the day, followed suit with some of its own. In neither case did the number of two-ump games reach 10 percent. The innovation may have come from competition between the leagues, and it exploded the next year for that reason.
The Players’ League made its debut in 1890, in a direct challenge to the reserve clause system then dominant in baseball. The PL had several choices to make in distinguishing itself from the NL and AA, including whether to charge 25 or 50 cents for admission, whether to serve alcohol at games, and whether to play on Sundays. (The answers: 50, no, and no, all in line with the NL.) The league’s strongest sign of professional standards, though, was assigning two umpires to each game. The other leagues didn’t even try to match this move, fighting the three-cornered war on other battlegrounds.
The PL collapsed after that lone season, and the other two leagues did not adopt its extravagance toward umpires. The National League still had fewer than 10 percent of its games double-umped in 1891; the staggering American Association had almost none. The AA died after 1891 from wounds inflicted during the Players’ League war, its healthiest teams snapped up into an enlarged NL. Primacy ensured, the National League let its standards sag: in 1892 it had no games with two umpires.
One possible reason that 19th-century leagues avoided two-umpire crews was the instability of umpiring staffs. A lot of umpires had short tenures, either failing to meet the standards of competence or getting sick of the abuse they took from players and fans alike and quitting. It may have seemed folly to assign two umpires to a game months in advance when you were quite unsure those men would still be in the league by that time.
Of course, two-umpire crews allowed a margin in case you did lose one of those umpires: there’d still be someone left to call the game. There were numerous times in the 19th century, and into the 20th, when no league umpire was on hand for a contest. The common solution was for the teams to choose umpires from their own ranks. Trustworthy players not expected to participate in the game, generally one per side, would handle the play-calling.
The one-and-done era in the NL would not last. Teams were getting wise to how they could exploit the holes in a solo umpire’s coverage of the field. Cutting corners around the bases, hiding extra baseballs in tall outfield grass, and other colorful tactics grew common, the Baltimore Orioles being the acknowledged masters of such chicanery. They also employed legitimate tactics, which along with having good players carried them to three straight pennants in the mid-1890s.
The powers of the NL seemed to hit a breaking point, not just with the flouting of the rules but with the more threatening elements of the “rowdyism” of the time. In 1897, the number of games with two umpires was around 11 percent; in 1898, it leaped past 90 percent. One might like to associate this crackdown with the downfall of the rowdy Orioles, but in truth the Boston Beaneaters had knocked them from their perch atop the league in a wild 1897 race.
This state of affairs lasted two seasons. The 1899 season saw the poisoned fruits of syndicate ball ripen into the bitter experience of the 20-134 Cleveland Spiders. The league contracted from 12 teams to eight, and the belt-tightening hit the umpires also. In 1900, just as eight years before, the National League had no games with two assigned umpires.
Yet again, the pendulum would swing from the spur of competition. The American League made its jump to major-league status the next year, and put two league umpires on almost 20 percent of its games that first season. The NL followed suit a touch reluctantly, and soon both leagues were on the path, with meanders, toward putting two umpires in every game.
It is part of that road that I will be examining.
I chose as my range of study the years between 1904 and 1909. Beyond 1909, two-umpire crews called more than 80 percent of games in both leagues, leaving less room for differences in how they were deployed to manifest themselves. (Two umpires per game became mandatory in 1912.) Before 1904, an equal and opposite effect was in place, as single umpires were assigned to over 80 percent of games in both leagues.
I examined the National and American Leagues separately. Though they had signed the National Agreement in 1903, which gave them relief from player raids and made the World Series possible, it was more a truce than a true peace, and nowhere near a union of the leagues. They were still highly autonomous entities, run by Ban Johnson and Harry Pulliam without needing to consult each other much, definitely not on such a minor matter as umpire hirings. Their independence shows clearly, as we shall see.
For my purposes, only games where two league umpires officiated count. There were certainly contests scheduled to have two, but which one of the umpires missed due to illness, injury, missed train connections, getting fired, or other causes. I don’t have access to league umpiring schedules from that era, if they even exist any longer. I can make guesses about which games would have had two umpires (and I did in one very specific instance leading up to Merkle’s Boner), but that’s rotten statistics, and I won’t do it here.
Excluded from counting as two-umpire crews are substitute arbiters. As noted above, these were generally inactive players, one per team, although other arrangements were possible.
In September 1904, the Browns and Tigers found themselves facing three straight days of double-headers with no league umpire on hand. (They were sixth and seventh in the standings, thus not a high priority.) For the first four games, the hometown Tigers provided both umpires for each game, which the teams split. For the first game on day three, St. Louis finally got to share duty with Tiger Bill Coughlin, who had umped all the previous games. Before the lunacy could be completed, though, AL umpire Charles King arrived to handle the nightcap solo.
Sometimes these games went forward with just a single player-umpire—and it could even be a third party. On Aug. 6, 1905, the Chicago Cubs were hosting Boston in a series finale. The league umpire for the previous day’s double-header, Jim Johnstone, departed between games for reasons I have not yet unearthed. The teams volunteered the next day’s pitchers, Carl Lundgren and Irv Young, to umpire the second game.
Johnstone was still unavailable on Aug. 6 (though he’d call a twin-bill in Pittsburgh the next day), so the teams needed to improvise again. By luck, the New York Giants had arrived in town early for their series starting on the seventh. Left fielder Sam Mertes was in attendance (possibly with teammates), and wound up buying a lot more with his ticket than he anticipated. He was the single umpire for that day’s game!
What really gives that choice its bite is that the Giants had forfeited the previous day’s game in Pittsburgh over a blazing argument with the umpires! With the game 5-5 in the bottom of the ninth, Pirate Claude Ritchey was called safe on a close play at third. Details of the following dispute vary widely, but what’s agreed is that George Bausewine (in his only year umpiring in the bigs) forfeited the game to Pittsburgh, and the Giants fled for their lives from an incensed mob of fans. And this is the team from which the Cubs and Beaneaters selected their umpire.
The practice had almost run its course. After six instances in the NL and seven in the AL for 1907, the leagues managed enough accord to clamp down on this unprofessional stopgap. The lone occurrence in 1908 was in a meaningless October American League game, and it would be the last the AL would ever see. There was a leaker as late as 1912 in the NL, with Brooklyn catcher Ed Phelps and Pittsburgh outfielder Ham Hyatt officiating. The last major-league instance ever was in the 1914 Federal League. Pitchers Bob Groom of the St. Louis Terriers and Bert Maxwell of the Brooklyn Tip-Tops were the last active players to umpire a major league game.
And with that lovely anecdotal digression complete, on with the data.
One for you, two for me
I compared how often teams had two league umpires for their games with the records they posted in those seasons. There were significant correlations in both the National and American Leagues, but the numbers were substantially higher in the NL. The charts that follow count a team’s two-umpire games above or below the league average for that year.
The correlation coefficient for the NL was 0.686, showing a fairly strong relationship. The coefficient of determination R-squared would be about 47 percent, meaning team records would explain just under half of the variation in umpire assignments.
The trend line for the AL is clearly flatter here, even taking into account the somewhat different vertical scale. The correlation coefficient is 0.415, indicating just a moderate relationship. R-squared comes in at 17 percent.
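The R-squared figures quoted above come straight from squaring the correlation coefficients. A minimal Python check, using only the two R values reported here (the function name is mine, for illustration):

```python
# Correlation coefficients as reported in the text (1904-1909 seasons).
reported_r = {"NL": 0.686, "AL": 0.415}


def r_squared_percent(r: float) -> float:
    """Coefficient of determination, as a percentage, for a simple correlation r."""
    return 100 * r * r


for league, r in reported_r.items():
    print(f"{league}: R = {r:.3f}, R-squared = {r_squared_percent(r):.1f}%")
```

That works out to roughly 47 percent for the NL and 17 percent for the AL, matching the rounded figures above.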
Of course, one cannot say what a team’s record is going to be in advance of the games being played. Umpire assignments late in a season might be made on the basis of who was in the pennant race, but early on it would be a matter of speculation. The easiest basis for that speculation would be the previous year’s records. I therefore also compared umpire assignments to teams’ records in the year before.
For the National League, a somewhat stronger correlation emerged, with R at 0.729 and R-squared now above 53 percent. NL executives apparently leaned more on previous performance. A simple regression of the numbers found the best correlation at 67 percent previous year’s records and 33 percent current year’s records, nudging R up to 0.744.
The AL, though, was a different story again.
Correlation with previous year’s record was weaker, R at a mere 0.334 and R-squared at 11 percent. Regression found the best balance at 64 percent current season’s records and 36 percent previous year’s, but the R-squared even for that fell shy of 20 percent.
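For readers curious how such a blend of current and previous records might be found, here is one rough sketch in Python. It is not necessarily the method used for this article. It simply tries every weight from 0 to 100 percent on the previous year's record and keeps the one whose blended record correlates best with two-umpire games. The function names and the sample figures are invented purely for illustration; they are not the real 1904-1909 data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_blend(prev_pct, cur_pct, two_ump_margin, steps=100):
    """Return (weight_on_previous_year, correlation) for the best blend.

    Tries weights 0.00, 0.01, ..., 1.00 on the previous year's winning
    percentage, with the remainder placed on the current year's.
    """
    best_w, best_r = 0.0, float("-inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * p + (1 - w) * c for p, c in zip(prev_pct, cur_pct)]
        r = pearson(blended, two_ump_margin)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


# Made-up team-seasons (NOT real data): winning percentages and
# two-umpire games above or below the league average.
prev_pct = [0.60, 0.45, 0.52, 0.38, 0.65, 0.41]
cur_pct = [0.55, 0.50, 0.40, 0.47, 0.62, 0.36]
margin = [22, -6, 1, -20, 35, -24]

weight, r = best_blend(prev_pct, cur_pct, margin)
print(f"best weight on previous year: {weight:.2f}, R = {r:.3f}")
```

A weight near 1 would mean the schedule-makers leaned on last year's standings, and a weight near 0 would mean they followed the current race. A two-variable regression would get at the same question; the weight scan is just easier to read.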
The two leagues were acting like two separate entities. AL president Ban Johnson, or whoever he delegated to draw up umpires’ work schedules, paid only slight attention to how good the teams were in sending one umpire or two to their games. NL president Harry Pulliam and his assistants took it very much into consideration, especially the past year’s numbers. The different emphases may not have been personal quirks, though, but considered responses to the mobility of teams in the standings.
The National League was in something close to a stasis early in the 20th century. From 1901 to 1913, only three teams—New York, Chicago and Pittsburgh—won the pennant. From 1903-1913, not once did any of those teams finish worse than fourth; during the same period, Boston, Brooklyn and St. Louis never finished better than fifth. For the whole period I examined, there were three teams, arguably as many as five, that just didn’t matter to the pennant race except as schedule filler.
The American League was far more fluid. The 1904 pennant winners, the Boston Americans, were last by 1906. Their close challengers in 1904, the New York Highlanders, went from second to sixth, back to second, to fifth, and finally to last in 1908, despite leading the standings at the start of June. The Detroit Tigers would win three straight pennants starting in 1907, but had come in sixth just the previous year. Anyone wagering on the AL’s order of finish based upon last year’s standings would be broke in short order.
Different circumstances probably led to different criteria for crafting umpires’ schedules. This was true in more than one area, as another factor setting the leagues apart was how they reacted to the magnetic force of the nation’s largest city.
Two tales of a city
Even a quick eyeball check of the umpiring numbers for the National League shows something funny happening with the New York Giants. Since they were a perennial first-division team in the NL, we’d expect them to get more than their share of two-umpire games. But the margin is so wide, so consistently, that one must conclude that more is going on.
Season             1904   1905   1906   1907   1908   1909
Avg. 2-Ump Gms.    50     28.75  67     69.25  61.5   110.5
NY's 2-Ump Gms.    98     74     104    94     113    145
NY's Finish        1      1      2      4      t2     3
The numbers even buck the correlation with previous-year finishes: the biggest margin above average occurs in the year after the Giants’ worst finish in that stretch.
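The margins can be checked directly from the table. A short Python sketch, using only the figures given above (the variable names are mine, and I take the "Avg." row to be the league average per team):

```python
seasons = [1904, 1905, 1906, 1907, 1908, 1909]
league_avg = [50, 28.75, 67, 69.25, 61.5, 110.5]  # "Avg. 2-Ump Gms." row
giants = [98, 74, 104, 94, 113, 145]              # "NY's 2-Ump Gms." row

# Giants' two-umpire games above the league average, by season.
margins = {yr: ny - avg for yr, avg, ny in zip(seasons, league_avg, giants)}

for yr, m in margins.items():
    print(yr, m)

# Season with the widest gap between the Giants and the league average.
biggest = max(margins, key=margins.get)
print("largest margin:", biggest)
```

The widest gap, 51.5 games, falls in 1908, the season after the fourth-place finish of 1907.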
My hypothesis was that, New York being the biggest city in America as well as its press capital, the NL treated the Giants as its showcase team, giving them extra shares of umpires to make their games more “professional.” There was another factor involved, though: the nearby presence of the Brooklyn Superbas.
Even though Brooklyn had been absorbed into New York City in 1898, its team did not share the marquee value of the Giants. Despite that, the Superbas generally ran ahead of their expected number of two-umpire games, though to a lesser degree than the Giants.
There seems to have been a practical synergy going on here. Shipping umpires across the eastern half of the country chasing games cost money for train tickets, and could lead to slow connections that would risk a game having zero umpires rather than one or two. Sending them from Manhattan to Brooklyn, on the other hand, cost five cents for a subway token. A league economizing so much on its umpires that it didn’t hire two for each game would appreciate that extra bit of savings.
I ran a regression on the umpire numbers for both teams. My conclusion is that the Giants got two-umpire crews an added 24 times per season over the other teams in the league, and the Superbas an added 13 times. Take away the marquee and cheap-travel effects, and this means the Giants would have played 16 fewer games with two umps and the Superbas nine fewer, while each of the other six teams in the NL got about four more games with two sets of eyes.
Again, the American League did things differently. It had its own New York team, the Highlanders, who would eventually become the Yankees. Ban Johnson had shepherded the team over from Baltimore in 1903, and, ahem, persuaded other AL teams to stock it with legitimate ballplayers. If he would do that to ensure that his league’s showcase franchise played respectable ball, surely he would lean that way in assigning umpiring crews.
But he surely did not. The Highlanders got no benefit from being in New York. The regression actually shows them losing a two-umpire game per season, but that’s so small a deviation for the sample size, it’s effectively zero. Once again, Johnson and the AL proved even-handed in where their umpires went.
The question of fairness
Did these differences carry an ethical charge? Was Ban Johnson being fair and Harry Pulliam unfair? Today, one of the objections raised against expanded replay in baseball is that some games have more camera coverage than others, generally ones in big markets with high-payroll teams. Thus, it would be unfair to improve umpiring more for those games than for others.
That’s a surprising parallel, one that sneaked up on me. I’m not lobbying to take over Jack Marshall’s job as the THT Baseball Annual’s writer on ethical matters by any means. Still, I can share the thoughts I developed on the matter before the connection to 21st-century arguments blossomed before my eyes.
One important point is that we don’t object to having more umpires for games that are more important: indeed, we expect it. The World Series went to a four-umpire crew in 1910, two years before two umpires became mandatory in the regular season. That rose to six for the 1947 Series, five years before four umpires became the regular-season standard. In 1949, the final two regular-season games between the Red Sox and Yankees that decided the pennant had six umpires*. There was no outcry that this was unequal treatment.
* I learned this by listening to an archived recording of the radio broadcast of the pennant-deciding game. Rough quality from an ancient recording notwithstanding, it is such a treat. And stay with it for the singing near the end. I won’t give more of a hint than that.
Pulliam to a large extent, and Johnson to a slight one, were trying to follow this principle. The games likelier to affect the pennant race were likelier to have two umpires. The stumbling block is that this was usually based on assumptions rather than facts, even though the assumptions were generally good ones in the National League. Using two umpires for the Cubs-Giants game on Oct. 8, 1908, that decided the pennant was perfectly justified. Using two umpires for every single Cubs-Giants game, all the way back to May, while the majority of games made do with one, was imperfectly justified.
As for the New York City leanings, I render another split decision there. In an era of shoestring logistics, team schedules were drawn up to create a reasonable minimum of travel for the various clubs. Assigning umpires for shorter travel falls in the same practical category, so the Giants-Superbas shuttle gets a pass. One could argue a different kind of practicality for treating the Giants better outside the Brooklyn connection, but while it made for good marketing, it made for a poor appearance of fairness—though one wonders how many people noticed at the time.
The true underlying problem was not how two-umpire and one-umpire games got split up, but that there had to be any such triage in the first place. The National League wobbled back and forth for two decades between one and the other, mostly toward the low end. Having six umpires covering four games in 1908 was virtually the peak of extravagance for the league. The AL stayed fairly close to its rival league’s numbers, drifting upward but not making a decisive move toward full two-umpire staffing.
The situation would not last much longer. Calls to have two umpires for every game reached a crescendo in 1908, partly due to the Merkle Game demonstrating how important that second set of eyes could be at a critical moment. Umpires per game leaped in 1909, as the chart up-screen shows, and kept inching higher until two-man crews became the official benchmark in 1912.
There would be similar phases in baseball history, in the transition from two umpires to three centered in the 1920s, and from three to four beginning around 1940. (I may take a future look at these, if any interesting patterns pop up.) If MLB ever goes from four umpires to six in the regular season, one suspects it will happen en bloc rather than piecemeal. One could thus say that the matter of umpire crew triage is a historical curiosity, with no connection to the game being played today.
Except for that pesky matter of instant replay. Does how the two leagues balanced umpire assignments in the first decade of the last century hold lessons for how we might deploy video assistance for umpires in the second decade of this century? I would not have thought so a few days ago, but today, I just might.
References & Resources
I’ve said it before, and I’ll say it again. Retrosheet: single-handedly justifying the Internet since 1996.
Other help came from Baseball-Reference, online archives of The New York Times, and Cait Murphy’s Crazy ’08.