Starting Pitcher Leverage (Part 1)

I’ve always loved hearing old-time stories about baseball, and how they did things differently back in the day. Local lore here in Chicago proclaims that Al Lopez used staff ace Billy Pierce as often as he could against the best opposing teams (the Yankees and Indians) to give the White Sox a little additional chance. Baseball’s memory also tells us that Casey Stengel did the same with Whitey Ford, ensuring numerous clashes between the rival lefties. Heck, I’ve read stories about how the Cubs used Mordecai Brown that same way against John McGraw’s Giants over a half century before.

If there’s any truth to these stories, teams often leveraged their pitchers against opposing teams to give themselves an extra little edge. This alternate method of handling starting pitchers would have warped their stat lines. If the Sox really used Billy Pierce that much more against the best offenses, his numbers would underestimate how good a pitcher he was.

A few years ago, in a series of posts over at the Baseball Think Factory, a researcher named Dick Thompson took this a step further when discussing information he’d uncovered for a book he was then finishing up. Rather than settle for misty stories about how pitchers might’ve been used back in the day, he checked through the game logs of Hall of Fame also-ran Wes Ferrell and concluded that history has underrated the former AL stud. Thompson went so far as to claim that if you account for the quality of opposing offenses, in his prime Wes Ferrell was as good as Lefty Grove. Steep praise indeed.

These old stories, and more importantly Thompson’s research for his book, inspired me to research the subject. Aside from the digging on Ferrell, people have either ignored leveraging when judging bygone starting pitchers, or relied entirely on the hazy mist of folklore to adjust a pitcher’s worth. No one has ever undertaken a systematic study of multiple pitchers to see how teams leveraged their starters, and who the best- and worst-leveraged starters of all time were. Well, no one until now.

The Boring Part: The Mathematical Crud

To understand the mathematical gobbledy-gunk behind this, let me use Mordecai Brown as an example. Looking at his starts over at Retrosheet, it turns out those old stories about him were true. The Cubs did use him more often against the best teams. In his big years as an ace, 1906-1911, the top three teams were almost always the Cubs, Giants, and Pirates in various orders. (In 1907 the Phillies just barely edged out the Giants, but otherwise it was perfect.)

In his 182 starts in that span, Brown began 42 games against the Giants and 33 against Pittsburgh. That’s almost 50% more than he would have had he faced all teams evenly. Looking over the course of his career, he should’ve had 160 of his 332 starts against teams with a .500 record or better. He had 194, over 20% more. That’s nothing. Against teams with a .600 or better record, his teams started him more than 30% more often than they would’ve had he been used evenly.
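The “used evenly” baseline behind those percentages is plain arithmetic. Here’s a minimal sketch of it, assuming an eight-team league (seven opponents) and that an evenly used pitcher gets an equal share of his starts against each opponent. The function name is mine, purely for illustration.

```python
def expected_even_starts(total_starts, n_opponents, n_targets):
    """Starts an evenly used pitcher would make against n_targets of n_opponents teams."""
    return total_starts * n_targets / n_opponents

# Brown, 1906-1911: 182 starts, 42 against the Giants and 33 against Pittsburgh.
actual = 42 + 33
# ASSUMPTION: seven opponents, each drawing an equal share of his starts.
expected = expected_even_starts(182, n_opponents=7, n_targets=2)
extra_vs_top_two = actual / expected - 1

# Career: 194 starts against .500-or-better teams versus 160 expected (both from the text).
extra_vs_winners = 194 / 160 - 1

print(f"Expected vs. Giants/Pirates: {expected:.0f}, actual: {actual}")
print(f"Extra vs. Giants/Pirates: {extra_vs_top_two:.0%}")
print(f"Extra vs. .500+ teams: {extra_vs_winners:.0%}")
```

Under that assumption the even-usage figure is 52 starts against the two rivals, so 75 actual starts is roughly 44% more, and 194 against 160 is about 21% more.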

There’s a more precise way to quantify this. Let’s take his 1909 season. Here’s how manager Frank Chance used him:

Team        Win%      GS
Pirates     .724       8
Giants      .601       9
Reds        .503       4
Phil        .484       3
Dodgers     .359       4
Cards       .355       2
Braves      .294       4

Now that’s leveraging. To quantify it, first multiply his starts against the Pirates by their team winning percentage. Now do that for all the teams, add up the products, and divide by his total starts. It works out to an Average Opponent Winning Percentage (AOWP) of .529.
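For anyone who’d rather see that as code, here is a small Python sketch of the calculation just described. The numbers come straight from the 1909 table; the dictionary and function names are my own labels, not anything Retrosheet provides.

```python
# Brown's 1909 starts by opponent: (opponent winning percentage, games started),
# copied from the table in the text.
BROWN_1909 = {
    "Pirates": (0.724, 8),
    "Giants": (0.601, 9),
    "Reds": (0.503, 4),
    "Phil": (0.484, 3),
    "Dodgers": (0.359, 4),
    "Cards": (0.355, 2),
    "Braves": (0.294, 4),
}


def aowp(starts_by_opponent):
    """Start-weighted average of opponent winning percentages."""
    total_starts = sum(gs for _, gs in starts_by_opponent.values())
    weighted_sum = sum(pct * gs for pct, gs in starts_by_opponent.values())
    return weighted_sum / total_starts


print(f"{aowp(BROWN_1909):.3f}")  # prints 0.529
```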

That’s nice, but it doesn’t tell us how he was leveraged. A pitcher on the infamous 1899 Spiders could get an AOWP of .529 while not being leveraged at all. To figure out his leveraging you need to figure his Team’s Opponent Winning Percentage (TOWP), and divide his AOWP by the TOWP. Retrosheet makes this easy because it has the specific number of games any team played against all comers, including the 1909 Cubs.

That year the Cubs had a TOWP of .475. Divide the AOWP by the TOWP, multiply by 100, and round to the nearest integer to make it pretty. Call it AOWP+, and it’s how you quantify starting pitcher leveraging. Doesn’t exactly roll off the tongue, but like OPS+ or ERA+, it’s centered at 100. A higher number indicates his team used him more often against better teams, and lower means he faced worse teams. Mordecai Brown had a mark of 111 in 1909, which is damn impressive.
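A rough sketch of the last two steps, in Python. The TOWP function weights each opponent’s winning percentage by the number of games the team played against it. I don’t list the Cubs’ actual game counts in this article, so the example below ASSUMES a perfectly balanced schedule of 22 games against each opponent; that lands close to, but not exactly on, the .475 figure from the real game logs. The function and variable names are illustrative only.

```python
# Opponent winning percentages for the 1909 Cubs, from the table in the text.
OPP_PCT = {
    "Pirates": 0.724, "Giants": 0.601, "Reds": 0.503, "Phil": 0.484,
    "Dodgers": 0.359, "Cards": 0.355, "Braves": 0.294,
}


def towp(games_by_opponent, pct_by_opponent):
    """Games-weighted average of opponent winning percentages for a whole team."""
    total_games = sum(games_by_opponent.values())
    weighted_sum = sum(pct_by_opponent[team] * g for team, g in games_by_opponent.items())
    return weighted_sum / total_games


def aowp_plus(pitcher_aowp, team_towp):
    """Pitcher's AOWP relative to his team's TOWP, scaled so that 100 means even usage."""
    return round(100 * pitcher_aowp / team_towp)


# ASSUMPTION: a balanced schedule, 22 games against each of the seven opponents.
balanced_games = {team: 22 for team in OPP_PCT}
approx_towp = towp(balanced_games, OPP_PCT)
print(f"Balanced-schedule TOWP: {approx_towp:.3f}")

# Using the figures quoted in the text: AOWP of .529 and TOWP of .475.
print(f"Brown's 1909 AOWP+: {aowp_plus(0.529, 0.475)}")
```

With real game counts from the logs in place of the balanced-schedule guess, the same two functions would reproduce the full calculation.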

Grunt Work

So … who do you figure AOWP+ for then? I know managers don’t leverage starting pitchers now, but I don’t know when they stopped doing it. For that matter, I don’t know when it began. Sure would be nice to know these things so I could figure the AOWP+ for as many pitchers during the leveraging era as possible without wasting my time looking at a bunch of guys from years without leveraging.

Fortunately I have a theory. If anyone’s going to get leveraged, it’s going to be ace pitchers like Brown, Pierce, or Ford. Thus, to determine when leveraging happened, I checked the AOWP+ for all starting pitchers in Cooperstown, or in the Top 100 pitchers in the New Bill James Historical Baseball Abstract, or who made 400 starts or had 200 wins, or 200 win shares.

This first run indicated there’s a certain signal to noise ratio within AOWP+. An AOWP+ of 102 or 103 could mean that a team moderately leveraged that pitcher or that random happenstance over the course of 30 starts gave the pitcher a slightly higher AOWP than his team. However, around 104 and certainly by 105, the signal to noise ratio drastically improved. Looking at the results of this first run, starting pitcher leveraging existed from 1886—when both the NL and American Association expanded their schedules—until 1964 when Mickey Lolich scored a 106, a number no one else has ever reached since then. In the mid-1960s, AOWP+ scores noticeably declined and never recovered.

For my main analysis, to ensure I calculated everyone worth examining, I expanded the period to 1876-1969, and examined an absurd number of pitchers in those years. I tabulated AOWP+ for every pitcher with at least 85 Win Shares as a starter or 150 starts in that period. Oh, OK, I missed one or two. My apologies to those legions of Charlie Sweeney enthusiasts out there. I also missed ex-Met Al Jackson, but his splits indicate I didn’t miss anything interesting.

Aside from the 150/85 boys, I analyzed several dozen other pitchers who looked intriguing. Really, how can you do this study and skip Hugh “Losing Pitcher” Mulcahy? I picked up phenoms like Herb Score, best pitchers on bad teams like Mulcahy, and anyone else who caught my fancy.

Between the two runs I had analyzed 659 pitchers who started over 182,000 games, including two-thirds of all games from 1876-1969. Heck, I included over half the starts in every season from 1879-1970 except 1884. Toss out the Union Association and it’s every year in that span.

Turns out, you need to toss out the Union Association as well as the National Association and the 1884 American Association. Those leagues were too disorganized for AOWP+ to tell us anything. Either there was no set schedule, too many teams collapsed, or the league overextended itself with expansion, so the numbers for those leagues make no sense. For example, in 1875 Candy Cummings started almost every game for Hartford in the National Association for the first half of the year, lost his job, and thus ended up with one of the worst single-season AOWP+s ever. All the worst teams stopped playing midway through the year, so his successor never faced them.

Finally, I should note that for three reasons I always used games started, never innings pitched, even when Retrosheet’s splits made the latter available. First, it’s more consistent this way. Assuming that better teams have better offenses, switching to innings would put more recent pitchers at a disadvantage. Second, many pitchers pick up some innings in relief. Game situation determines the leveraging in relief, not the opponent.

Third, this is an examination of how teams used their pitchers, not how pitchers took advantage of their opportunity. Games Started tells you the former, and Innings Pitched tells you how pitchers responded in those starts.

Though I haven’t checked AOWP+ for everyone, I’ve done it for such a large number of guys in the right period that I believe I know who the best- and worst-leveraged pitchers in baseball history really were. That’s the fun part.

The Fun Part: Results

No reason to dawdle. Here’s the best-leveraged pitchers ever, minimum 150 starts:

Name                    AOWP+    GS
1. Ken Heintzelman      105.94   183
2. Reb Russell          104.71   148
3. Mordecai Brown       104.45   332
4. Rube Walberg         104.42   306
5. Fritz Ostermueller   104.05   246
6. Johnny Klippstein    103.95   161
7. Clarence Mitchell    103.76   277
8. Don Mossi            103.72   165
9. Max Lanier           103.66   204
10. Billy Hoeft         103.63   200
11. Johnny Schmitz      103.49   235
12. Thornton Lee        103.35   163
13. Lloyd Brown         103.04   181
14. Ray Collins         102.92   151
15. Carl Hubbell        102.90   433
16. Billy Pierce        102.89   433
17t. Lefty Gomez        102.86   320
17t. Johnny Niggeling   102.86   161
19. Preacher Roe        102.70   261
20. Gerry Staley        102.66   186

OK, fine, Reb Russell shouldn’t really qualify with his measly 148 GS. (shrugs) Close enough… Three Hall of Famers, but not as good a list as I would’ve guessed. In part, that’s because it’s harder to score very high when you have 400 or more starts.

Still, there’s something else going on. Look at the top of the list, for instance. Who in God’s name is Ken Heintzelman? He’s a forgotten mediocrity from the 1940s. Not only is he the best-leveraged pitcher ever, but he’s far better than anyone else. The distance between first and second is about the same as the difference from second to eleventh. The heck?
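A quick check of that gap, using only the career AOWP+ figures for ranks 1, 2, and 11 from the leaderboard (the variable names are just labels for this sketch):

```python
# Career AOWP+ for ranks 1, 2, and 11, copied from the leaderboard in the text.
heintzelman, russell, schmitz = 105.94, 104.71, 103.49

gap_first_to_second = round(heintzelman - russell, 2)
gap_second_to_eleventh = round(russell - schmitz, 2)

print(gap_first_to_second, gap_second_to_eleventh)  # prints 1.23 1.22
```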

Looking closer, Heintzelman was a swingman for the Pirates in the 1940s. After a trade, he performed that same role for the Phillies into the early 1950s, save for 1949 when he was a full-time starter. Let’s look closer at how his teams used him to figure this out.

Teams        Pitt    Phil    Total
Bos           16      17       33
Brk           21      12       33
Chc           10       9       19
Cin            2      18       20
NYG           14      12       26
Phil           0       0        0
Pit            0       5        5
StL           23      24       47
Total         86      97      183

Well, there’s something you don’t see every day. No starts against the Phillies ever. When Heintzelman broke into baseball, the Phils were in the midst of a stretch in which they finished dead last in runs scored and runs allowed four times in five years. They were the polar opposite of the McCarthy Yanks. Never facing them certainly boosted his AOWP+.

Also in his Pirate days, he almost never faced Cincinnati. Two of his first three starts came against them, and then none of his 83 remaining starts with Pittsburgh came against McKechnie’s squad. Weird. More remarkably, over one-fourth of his starts came against St. Louis. All they did was have a winning record every year Heintzelman had a start.
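To put numbers on how lopsided that is, here’s a small sketch comparing his actual shares with an even split. The start counts come from the table; the even-usage baseline ASSUMES seven opponents drawing equal shares, which is my simplification.

```python
def share(starts_vs_opponent, total_starts):
    """Fraction of a pitcher's starts that came against one opponent."""
    return starts_vs_opponent / total_starts


# ASSUMPTION: seven opponents, each getting an equal share of his starts.
even_share = 1 / 7

stl_share = share(47, 183)          # career starts vs. St. Louis / career starts
cin_share_with_pit = share(2, 86)   # starts vs. Cincinnati while a Pirate / Pirate starts

print(f"Even share:                {even_share:.1%}")
print(f"vs. St. Louis (career):    {stl_share:.1%}")
print(f"vs. Cincinnati (with Pit): {cin_share_with_pit:.1%}")
```

That works out to roughly 25.7% of his starts against the Cards and 2.3% of his Pittsburgh starts against the Reds, against an even share of about 14.3%.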

So… what the hell’s going on? Simple – the key lies with the Cards. Their two best hitters were Enos Slaughter and Stan Musial. Both left-handers. Heintzelman was a southpaw. Platoon leveraging, folks – that’s the secret.

McKechnie’s Reds had an overwhelmingly right-handed lineup, so he never faced them, but he kept facing the Cards in what might be that storied franchise’s golden era. You’d expect him to face the Brooklyn Dodgers a lot less. The rarity of lefties facing them was extensive enough to delay Duke Snider’s Cooperstown induction. However (as I’ll discuss in a later article), pulling lefties against the Boys of Summer didn’t really occur until the mid-1950s. In Heintzelman’s only full season as a starter, 1949, half of his 32 starts came against either the pennant-winning Dodgers or the runner-up Cards. In fact, the Phillies beat Brooklyn 7 times in Heintzelman’s 8 starts that year.

So platoon leveraging is the key to Heintzelman’s standing. How many other guys on that list were also lefties? Reb Russell in second place was. Brown wasn’t, but Walberg and Ostermueller both were. In fact, Ostermueller also loaded up on the Cards in the 1940s. Look, I’ll save everyone a lot of time and list the right-handed pitchers in the Top 20: Mordecai Brown, Johnny Klippstein, Johnny Niggeling, Gerry Staley, and ... oh, that’s it. Four. Out of twenty! Great googley-moogley, fourteen of the top sixteen were lefties. I’ll need to look at platoon leveraging more extensively in a later article.

Maybe it’s the Cub fan in me, but sometimes I’m even more interested in seeing the bottom of the list. Here are the 20 worst-leveraged pitchers of all time (with their throwing hand noted), minimum 150 GS:

Name                    AOWP+    GS    L/R
1. Kid McGill           94.94    148   L 
2. Paul Minner          95.04    169   L
3. Bill Swift           95.34    165   R
4. Vinegar Bend Mizell  95.81    230   L
5. Joe Benz             96.67    163   R
6. Johnny Allen         96.90    241   R
7. Fred Goldsmith       97.08    187   R
8. Hank Wyse            97.16    159   R
9. Firpo Marberry       97.19    186   R
10. Charlie Ferguson    97.22    170   R
11. Ray Kolp            97.25    172   R
12. Russ Meyer          97.32    219   R
13. Kaiser Wilhelm      97.36    157   R
14. Ray Collins         97.51    219   R
15. Hank O'Day          97.57    192   R
16. Jesse Haines        97.59    388   R
17. Joe Bowman          97.61    184   R
18t. Elam Vangilder     97.63    187   R
18t. Jack Warhop        97.63    150   R
20t. Joe Nuxhall        97.65    153   L
20t. Ted Lewis          97.65    287   R

Again, there’s a guy at 148 who technically doesn’t belong, but let’s not get too concerned with the details. I should point out that Leon Cadore (of 26-inning-game fame) had an AOWP+ of 97.32 in 147 starts, though. Also, the listed Bill Swift is the old Pirate, not the not-so-ancient Mariner.

Only four lefties. Three of them pitched in the 1950s NL. That’s the Brooklyn Dodger effect that Heintzelman and Ostermueller just missed. Kid McGill’s main claim to fame came in 1890 when the Cleveland Infants of the Players League made the not-quite-16-year-old the youngest person ever given a full-time job as a starter. Six of McGill’s twenty starts that year came against the last-place Buffalo squad. The next year, in the tattered remains of the AA, he had eight starts against its worst team. Please realize last-place teams way-back-when lost over two-thirds of their games, and sometimes far more than two-thirds.

The names on this list aren’t as impressive as those on the top 20. That’s natural. Johnny Allen was a pretty good pitcher, but he had a reputation as one of the sorest losers in all baseball. No wonder his managers made sure he was put in games where he could win. The real big name is Jesse Haines, the only Hall of Famer. He’s routinely cited as one of the worst selections to Cooperstown. It turns out we’ve actually considerably overrated him when decrying his selection. In reality, he wasn’t nearly as good as we always thought he wasn’t. OK, that’s a little harsh. In reality, it was the end of his career that screwed him up.

In the last five years of his career he only started 43 games, but 18 of those came against teams with a winning percentage on the wrong side of .400. Meanwhile, he had only nine against teams with winning records, and four of those came in one year. His AOWP+ from 1933-37 was a chortle-inducing 85.87. His raw AOWP was .420, only slightly better than the Cubs were last year. Imagine a pitcher starting 43 straight games against the ’06 Cubs and how that would inflate his numbers.

Jesse Haines doesn’t have to imagine – he lived it. The manager who used him like this? Frankie Frisch, who later put him in Cooperstown. (shakes head). Even prior to 1933 Haines only had a career AOWP+ of 99.05. Aside from a few years in the early/mid 1920s, he never had good leveraging scores.
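Those two pieces of Haines’s career fit together neatly. The sketch below ASSUMES a career AOWP+ behaves like a start-weighted average of the figures for its sub-periods, which is my simplification for illustration rather than a description of how the career numbers were tabulated. It also backs out the TOWP implied by his late-career AOWP and AOWP+.

```python
def start_weighted_average(parts):
    """Combine (AOWP+, games started) pairs into one start-weighted figure."""
    total_starts = sum(gs for _, gs in parts)
    return sum(value * gs for value, gs in parts) / total_starts


career_starts = 388
late_starts = 43                           # 1933-37
early_starts = career_starts - late_starts

# ASSUMPTION: career AOWP+ is approximately the start-weighted mean of the two periods.
combined = start_weighted_average([(99.05, early_starts), (85.87, late_starts)])
print(f"Combined AOWP+: {combined:.2f}")   # prints 97.59, matching the leaderboard

# AOWP+ = 100 * AOWP / TOWP, so the implied late-career TOWP is AOWP / (AOWP+ / 100).
implied_towp = 0.420 / (85.87 / 100)
print(f"Implied 1933-37 TOWP: {implied_towp:.3f}")
```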

Well, that’s all the highlights on the career leader boards. As for single-season marks – that’ll be in the next installment of what could easily end up being a 10+ article series here at The Hardball Times.

References & Resources
All data 1871-1956 compiled from the game logs at Retrosheet.

I have never read Dick Thompson’s book, and only know of his findings based on comments he’s made over at The Baseball Think Factory, but since it inspired this study, it deserves a mention: Thompson, Dick. The Ferrell Brothers of Baseball. New York: McFarland and Company, 2005. I should note, and will discuss at greater length later on, however, that according to AOWP+, Thompson was wildly off the mark in his statements on Wes Ferrell. Accounting for how his teams leveraged him actually lowers his value.

As a kid I read in one of Bruce Nash and Allan Zullo’s Hall of Shame books that Johnny Allen had one of the worst tempers ever. No idea which book it came from, though.

