More inverted records

A while back, I reintroduced an old Bill James invention that translates a hitter’s line into a pitching line. It’s frivolous fun that has few, if any, practical applications, but it helps pass the time between baseball seasons.

In response to the original article, readers offered many suggestions for further work. One suggestion was to focus on “average” players. That’s sort of the premise here, although we will wander around a bit because… well, why not.

Preliminaries

To create our sample, I first identified all batting title qualifiers in 2010 (n = 151). After running the translations, I then identified all pitchers who worked at least 106 innings (n = 139) and added them to the pool, giving us a total of 290 players. I chose 106 innings as the threshold because that represents the lowest total among translated batting lines (Chase Utley’s 2010 comes out to 106.1 IP) and because it yields a pool of pitchers close in size to the pool of hitters.

I then found the average translated pitching line of the 151 batters:

                  IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Sample average 140.0 149 83 75 18 54 103 4.81 1.16 3.48 6.65

As fate would have it, two hitters and a pitcher matched this ERA in 2010. Here is the average line along with the lines of those players:

                  IP   H  R ER HR BB  SO  ERA HR/9 BB/9  SO/9
Sample average 140.0 149 83 75 18 54 103 4.81 1.16 3.48  6.65
Jason Hammel   177.2 201 97 95 18 47 141 4.81 0.91 2.38  7.14
Adam LaRoche   142.1 146 84 76 25 48 172 4.81 1.58 3.04 10.88
Raul Ibanez    144.0 154 86 77 16 68 108 4.81 1.00 4.25  6.75

Note that the sample’s 4.81 ERA is higher than the 2010 average MLB ERA (4.08), which makes sense since we are drawing only from the hitters who qualified for the batting title (and who presumably are better than those who didn’t).
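For anyone checking the tables by hand, the rate columns follow from the counting columns. Here is a minimal sketch, assuming IP uses the usual box-score notation in which .1 and .2 mean one third and two thirds of an inning; the function names `innings` and `rates` are made up for illustration:

```python
def innings(ip_text):
    """Convert box-score IP notation ('177.2' = 177 and 2/3 innings) to a float."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def rates(ip_text, er, hr, bb, so):
    """Return (ERA, HR/9, BB/9, SO/9), each rounded to two decimals."""
    ip = innings(ip_text)
    return tuple(round(9 * count / ip, 2) for count in (er, hr, bb, so))


# Jason Hammel's 2010 line from the table above
hammel = rates("177.2", er=95, hr=18, bb=47, so=141)
print(hammel)
```

The same arithmetic applies to a translated hitter’s line, since the translation produces the same counting columns.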

For grins, I also divided our sample of 151 hitters into 10 groups according to OPS+ (plus Cesar Izturis, who was so bad he gets his own group). Group A consisted of players in the top 10% (1-15), Group B the next 10% (16-30), and so on. Here are the translated lines for the “leader” of each group:

Grp Rnk OPS+ Player               IP   H   R  ER HR BB  SO  ERA HR/9 BB/9  SO/9
A     1 179  Miguel Cabrera    132.0 180 141 127 38 89  95 8.66 2.59 6.07  6.48
B    16 141  Adrian Beltre     144.1 189 110  99 28 40  82 6.17 1.75 2.49  5.11
C    31 130  Nick Swisher      141.0 163 103  93 29 58 139 5.94 1.85 3.70  8.87
D    46 122  Vladimir Guerrero 148.1 178  95  86 29 35  60 5.22 1.76 2.12  3.64
E    61 113  Stephen Drew      141.0 157  94  85 15 62 108 5.43 0.96 3.96  6.89
F    76 106  Adam LaRoche      142.1 146  84  76 25 48 172 4.81 1.58 3.04 10.88
G    91 102  Carlos Pena       132.0  95  72  65 28 87 158 4.43 1.91 5.93 10.77
H   106  95  Ben Zobrist       148.0 129  77  69 10 92 107 4.20 0.61 5.59  6.51
I   121  90  Derek Jeter       171.2 179  83  75 10 63 106 3.93 0.52 3.30  5.56
J   136  83  A.J. Pierzynski   125.0 128  50  45  9 15  39 3.24 0.65 1.08  2.81
Izt 151  50  Cesar Izturis     129.2 109  35  32  1 25  53 2.22 0.07 1.74  3.68
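The grouping can be sketched in a few lines. This assumes 15 hitters per group, which is what the Rnk column implies (leaders at 1, 16, 31, and so on); `group_for` is a hypothetical helper, not anything from the original spreadsheet:

```python
GROUP_SIZE = 15        # assumption: 150 hitters split into 10 equal groups
LETTERS = "ABCDEFGHIJ"


def group_for(rank):
    """Map an OPS+ rank (1 = best) to its group label."""
    if rank > GROUP_SIZE * len(LETTERS):
        return "Izt"   # rank 151 (Cesar Izturis) gets his own group
    return LETTERS[(rank - 1) // GROUP_SIZE]


# The "leader" of each group is its best-ranked member.
leaders = [rank for rank in range(1, 152) if (rank - 1) % GROUP_SIZE == 0]
print([(rank, group_for(rank)) for rank in leaders])
```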

First off, mad props to the Orioles for letting Izturis qualify for the batting title. The gap between him and the second-worst OPS+ among qualifiers (Alcides Escobar, 67) is greater than that between no. 59 (Mike Napoli, 113) and no. 104 (Starlin Castro, 97). Not every team would have the guts to stick a bat that bad out there every day, so way to go.

Cabrera’s line doesn’t compare with that of any pitcher in our sample, for the obvious reason that nobody (not even the Orioles) would let a guy work 106 innings while pitching like that. Izturis, on the other hand, matches up with the game’s elite pitchers:

                   IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Cesar Izturis   129.2 109 35 32  1 25  53 2.22 0.07 1.74 3.68
Felix Hernandez 249.2 194 80 63 17 70 232 2.27 0.61 2.52 8.36
Josh Johnson    183.2 155 51 47  7 48 186 2.30 0.34 2.35 9.11
Clay Buchholz   173.2 142 55 45  9 67 120 2.33 0.47 3.47 6.22

Close matches

The next thing I did was try to find roughly equivalent lines among hitters and pitchers. I approached this a number of different ways.

My first attempt involved matching ERAs. There were many perfect matches, but here are the extremes. First the low end:

                    IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Chone Figgins    167.2 156 67 60  1 74 114 3.22 0.05 3.97 6.12
Alberto Callaspo 148.1 149 59 53 10 31  42 3.22 0.61 1.88 2.55
Chris Carpenter  235.0 214 99 84 21 63 179 3.22 0.80 2.41 6.86

And then the high end:

                IP   H   R  ER HR BB  SO  ERA HR/9 BB/9 SO/9
Nick Swisher 141.0 163 103  93 29 58 139 5.94 1.85 3.70 8.87
Scott Kazmir 150.0 158 103  99 25 79  93 5.94 1.50 4.74 5.58
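Finding perfect ERA matches amounts to grouping players by ERA and keeping the groups that contain at least one hitter and one pitcher. A rough sketch, using a handful of lines from the tables above rather than the full 290-player pool (the `players` list and its layout are assumptions for illustration):

```python
from collections import defaultdict

# (name, role, ERA) for a few players from the tables above
players = [
    ("Chone Figgins", "hitter", 3.22),
    ("Alberto Callaspo", "hitter", 3.22),
    ("Chris Carpenter", "pitcher", 3.22),
    ("Nick Swisher", "hitter", 5.94),
    ("Scott Kazmir", "pitcher", 5.94),
    ("Derek Jeter", "hitter", 3.93),
]

by_era = defaultdict(lambda: {"hitter": [], "pitcher": []})
for name, role, era in players:
    # Key on whole hundredths to sidestep floating-point equality issues
    by_era[round(era * 100)][role].append(name)

# Keep only ERAs shared by at least one hitter and one pitcher
matches = {
    key / 100: group
    for key, group in sorted(by_era.items())
    if group["hitter"] and group["pitcher"]
}
for era, group in matches.items():
    print(era, group["hitter"], group["pitcher"])
```

Jeter drops out here only because no pitcher in this tiny sample shares his 3.93.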

Next I tried home runs. The matches ranged from several hitters and Brett Anderson at 6 to Joey Votto and Rodrigo Lopez at 37. Here is my favorite:

                   IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Michael Cuddyer 159.0 165 80 72 14 58  93 4.08 0.79 3.28 5.26
Gavin Floyd     187.1 199 92 85 14 58 151 4.08 0.67 2.79 7.25

This one is cool because Cuddyer and Floyd match not only in homers but also in ERA and walks. Cuddyer and Floyd also are your Mr. Average for 2010, both checking in with an ERA that matches MLB average.

Others of note are these three pairs, each of which had the same number of homers in nearly the same number of innings:

                      IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Jeff Francoeur     121.1 113 53 48 13 30  81 3.56 0.96 2.23 6.01
Hisanori Takahashi 122.0 116 51 49 13 43 114 3.61 0.96 3.17 8.41

Brandon Phillips   162.1 172 86 77 18 46  83 4.27 1.00 2.55 4.60
Rick Porcello      162.2 188 96 89 18 38  84 4.92 1.00 2.10 4.65

Mike Young         168.0 186 93 84 21 50 115 4.50 1.13 2.68 6.16
Chris Narveson     167.2 172 96 93 21 59 137 4.99 1.13 3.17 7.35

I also looked at perfect BB/9 matches. Low end:

                   IP   H  R ER HR BB  SO  ERA HR/9 BB/9 SO/9
Roy Halladay    250.2 231 74 68 24 30 219 2.44 0.86 1.08 7.86
A.J. Pierzynski 125.0 128 50 45  9 15  39 3.24 0.65 1.08 2.81

And high end:

                IP   H   R ER HR BB SO  ERA HR/9 BB/9 SO/9
Luke Scott   112.0 127  91 82 27 59 98 6.59 2.17 4.74 7.88
Scott Kazmir 150.0 158 103 99 25 79 93 5.94 1.50 4.74 5.58

How about SO/9? Sure, why not; again, from low to high:

                   IP   H   R  ER HR BB  SO  ERA HR/9 BB/9 SO/9
Carl Pavano     221.0 227  95  92 24 37 117 3.75 0.98 1.51 4.76
Shane Victorino 149.1 152  88  79 18 53  79 4.76 1.08 3.19 4.76

Bud Norris      153.2 151  94  84 18 77 158 4.92 1.05 4.51 9.25
Dan Uggla       145.0 169 115 104 33 78 149 6.46 2.05 4.84 9.25

A different approach

I eventually came up with a poor man’s variant on similarity scores. Lacking the resources to do exactly what I wanted, I lined up the 151 batters and 139 pitchers side by side, in ascending order of ERA. This gave 139 matched pairs, from Izturis (2.22 ERA) and Hernandez (2.27) to Robinson Cano (6.63 ERA) and Ryan Rowland-Smith (6.75); the 12 batters with the highest translated ERAs had no pitcher left to pair with.
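The side-by-side lineup is just two sorted lists zipped together. A small sketch with a few players from this article (the lists are illustrative stand-ins for the full pools):

```python
from operator import itemgetter

# (name, ERA): translated hitters and actual pitchers from the tables in this article
hitters = [("Robinson Cano", 6.63), ("Cesar Izturis", 2.22),
           ("Miguel Cabrera", 8.66), ("Nyjer Morgan", 3.24)]
pitchers = [("Ryan Rowland-Smith", 6.75), ("Felix Hernandez", 2.27),
            ("R.A. Dickey", 2.84)]

era = itemgetter(1)

# zip() stops at the shorter list, so the highest-ERA hitters go unmatched
pairs = list(zip(sorted(hitters, key=era), sorted(pitchers, key=era)))
for hitter, pitcher in pairs:
    print(hitter, pitcher)
```

Note that pairing by row position says nothing about how close the two ERAs are; it only guarantees that both lists are walked in the same order.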

Ideally, I would have evaluated each player against all other players, but instead I compared only the two players in each matched pair. I looked at the rate statistics (ERA, HR/9, BB/9, and SO/9) and calculated the difference between the two players in each. Then I summed all four differences in a couple of different ways (it’s a bit messy but it yields decent results; details can be found in References & Resources, for those interested in learning more and/or improving the method). I ran numbers for all 139 matched pairs and then sorted by lowest total differential. Here are the best matches:

                      IP   H   R  ER HR BB  SO  ERA HR/9 BB/9 SO/9
Shane Victorino    149.1 152  88  79 18 53  79 4.76 1.08 3.19 4.76
Jake Westbrook     202.2 203  99  95 20 68 128 4.22 0.89 3.02 5.68

Starlin Castro     118.0 139  62  56  3 29  71 4.27 0.23 2.21 5.42
Jeremy Guthrie     209.1 193  93  89 25 50 119 3.83 1.07 2.15 5.12

Jhonny Peralta     145.0 137  69  62 15 53 103 3.85 0.93 3.29 6.39
Jon Garland        200.0 176  86  77 20 87 136 3.46 0.90 3.92 6.12

Ryan Braun         151.1 188 113 102 25 56 105 6.07 1.49 3.33 6.24
A.J. Burnett       186.2 204 118 109 25 78 145 5.26 1.21 3.76 6.99

Gaby Sanchez       146.1 156  88  79 19 57 101 4.86 1.17 3.51 6.21
Wade LeBlanc       146.0 157  69  69 24 51 110 4.25 1.48 3.14 6.78

Robinson Cano      150.2 200 123 111 29 57  77 6.63 1.73 3.40 4.60
Ryan Rowland-Smith 109.1 141  94  82 25 44  49 6.75 2.06 3.62 4.03

Alex Gonzalez      157.0 149  74  67 23 31 118 3.84 1.32 1.78 6.76
Hiroki Kuroda      196.1 180  87  74 15 48 159 3.39 0.69 2.20 7.29

Alexis Rios        148.2 161  80  72 21 38  93 4.36 1.27 2.30 5.63
Derek Lowe         193.2 204  88  86 18 61 136 4.00 0.84 2.83 6.32

Nyjer Morgan       139.0 129  55  50  0 40  88 3.24 0.00 2.59 5.70
R.A. Dickey        174.1 165  62  55 13 42 104 2.84 0.67 2.17 5.37

Casey McGehee      154.1 174  93  84 23 50 102 4.90 1.34 2.92 5.95
Jeff Niemann       174.1 159  86  85 25 61 131 4.39 1.29 3.15 6.76

Franklin Gutierrez 150.2 139  66  59 12 50 137 3.52 0.72 2.99 8.18
Cole Hamels        208.2 185  74  71 26 61 211 3.06 1.12 2.63 9.10

Victor Martinez    122.0 149  82  74 20 40  52 5.46 1.48 2.95 3.84
John Lannan        143.1 175  82  74 14 49  71 4.65 0.88 3.08 4.46

Brennan Boesch     118.0 119  64  58 14 40  99 4.42 1.07 3.05 7.55
Ross Ohlendorf     108.1 106  54  49 12 44  79 4.07 1.00 3.66 6.56

Jeff Francoeur     121.1 113  53  48 13 30  81 3.56 0.96 2.23 6.01
Matt Cain          223.1 181  84  78 22 61 177 3.14 0.89 2.46 7.13

Again, it’s not perfect (and I am sure there are better ways to achieve the intended goal), but this gives us some idea of players with similarly shaped lines. Need a guy who serves up homers? Try Cano or Rowland-Smith. How about a control freak? You want Gonzalez or Kuroda. Maybe you’re more of a strikeout guy: Gutierrez or Hamels. And on it goes, limited only by your imagination. Have fun!

References & Resources
Thanks to readers of the original article for finding it interesting enough to inspire a sequel. One suggested idea that I didn’t cover here is that of head-to-head matchups. The samples would be too small to have meaning but it might be fun—in a Spock versus Evil Spock kind of way—to see, e.g., how Cuddyer fared against Floyd in 2010 (he went 4-for-11 with a double and two strikeouts, if you’re curious).

As for the messy bits mentioned above, I summed the differences in ERA, HR/9, BB/9, and SO/9 in two ways. The first sums the absolute values of the four differences, while the second takes the absolute value of their sum:

Method 1: |ERA dif| + |HR/9 dif| + |BB/9 dif| + |SO/9 dif|
Method 2: |(ERA dif + HR/9 dif + BB/9 dif + SO/9 dif)|

For example:

                      IP   H   R  ER HR BB  SO  ERA HR/9 BB/9 SO/9
Robinson Cano      150.2 200 123 111 29 57  77 6.63 1.73 3.40 4.60
Ryan Rowland-Smith 109.1 141  94  82 25 44  49 6.75 2.06 3.62 4.03

Method 1: |6.75 - 6.63| + |2.06 - 1.73| + |3.62 - 3.40| + |4.03 - 4.60| = 1.24
Method 2: |(6.75 - 6.63) + (2.06 - 1.73) + (3.62 - 3.40) + (4.03 - 4.60)| = 0.10
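For anyone who wants to tinker, the two calculations can be sketched as follows. The function and variable names are invented for illustration, and the results are rounded to two decimals to match the figures above:

```python
RATE_STATS = ("ERA", "HR/9", "BB/9", "SO/9")

cano = {"ERA": 6.63, "HR/9": 1.73, "BB/9": 3.40, "SO/9": 4.60}
rowland_smith = {"ERA": 6.75, "HR/9": 2.06, "BB/9": 3.62, "SO/9": 4.03}


def differences(pitcher, hitter):
    """Pitcher minus hitter for each rate statistic."""
    return [pitcher[stat] - hitter[stat] for stat in RATE_STATS]


def method_1(pitcher, hitter):
    """Sum of the absolute differences."""
    return sum(abs(d) for d in differences(pitcher, hitter))


def method_2(pitcher, hitter):
    """Absolute value of the summed (signed) differences."""
    return abs(sum(differences(pitcher, hitter)))


m1 = round(method_1(rowland_smith, cano), 2)
m2 = round(method_2(rowland_smith, cano), 2)
print(m1, m2)
```

Method 2 lets differences of opposite sign cancel, which is why it comes out so much lower than Method 1 for this pair: Rowland-Smith’s lower strikeout rate offsets his higher marks in the other three categories.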

There was a logic behind this at some point, but I forget what it was. What’s important is that using both together yields better results than using just one. If you have suggestions on how to improve this or if you have ideas for further study, please share them in the comments.


3 Comments
Matt
13 years ago

Izt group = hilarious. I don’t see why Carlos Lee and Chone Figgins get so much hate when we have an Izturis posting an OPS of 50.

Sabertooth
13 years ago

Interesting idea, but it matches good hitters with bad pitchers and good pitchers with bad hitters. 

What about a system designed to match the good with the good?  Start with the league average for the various reciprocal categories and then flip them.  For example, translate a desirably high strikeout rate for a pitcher into a desirably low strikeout rate for a hitter, based on their desirable deviations from league average.

Geoff Young
13 years ago

@Matt: Amazingly, Izturis’ 50 OPS+ is only the 17th worst among batting title qualifiers since 1961. Clint Barmes (47 in 2006) is the most recent to have a worse OPS+, while Matt Walbeck’s 37 in 1994 is the lowest since Art Scharein’s 34 in 1933. I guess Izturis still has work to do if he wants to be remembered with the immortals.

@Sabertooth: That would attempt to answer a different question but sounds interesting. Thanks for the suggestion.