Doing the math on outfield defense

Before tackling this article, you will want to read my previous piece on different fielding systems. Today, I’m going to present the annual rankings of each system discussed earlier, then lay out the formulas for calculating outfield Defensive Regression Analysis.

One big problem with new fielding metrics is that either the data or the methodology is proprietary. (An example of the latter is Clay Davenport’s Fielding Translations, posted on the Baseball Prospectus website.) Analysts Shane Jensen, David Pinto and Mitchel Lichtman can’t simply post the underlying STATS or BIS data they use. So it’s not possible for others literally to replicate every phase of this study. However, Mitchel’s, David’s and Shane’s results—which they’ve reviewed before publication of this article—were calculated independently and will be presented below in their original form.

If you’re tired of hearing about DRA (which is based solely on traditional, publicly available pitching and fielding statistics) without seeing any of the formulas, your patience will be rewarded. I’ll show you how you can calculate the DRA outfielder ratings so you can verify the match I’ve shown here.

Here are the SAFE and UZR ratings in the format I received them from Shane and Mitchel. Ratings are shown only for full-time seasons. The SAFE rating is estimated runs saved per 150 games; the UZR rating is actual estimated runs saved.

                                2003            2004            2005
Outfielder         Pos      SAFE    UZR     SAFE    UZR     SAFE    UZR
Adam Dunn          L                         -14      -6     -11      -6
Andruw Jones       C          15      -2       9      -4       9      -1
Bobby Abreu        R          -3       8      -9       2     -24      -1
Brian Giles        R                           5      24       0      20
Carlos Lee         L           3       7       4       5      -1     -19
Carl Crawford      L          26       8                      11      12
Carlos Beltran     C           9      31                      -2       2
Gary Sheffield     R          -3      -4     -10      -9     -32     -16
Ichiro Suzuki      R          17      15      10       0       2       3
Jermaine Dye       R                           5       7      -1       7
Jim Edmonds        C                          -8     -11      -1      -3
Johnny Damon       C           0      12      -9       0      -7     -10
Jose Cruz          R           7      13       2       2
J. Encarnacion     R          -4      14                     -14       9
Juan Pierre        C           0       9       6       4      -3       3
Luis Gonzalez      L           1      -6                       5      -4
Manny Ramirez      L                         -22     -11     -26     -47
Mark Kotsay        C                          -3       3      -7      -8
Marquis Grissom    C          -6      -6      -9      -8
Mike Cameron       C          26      29      -7      -8
Moises Alou        L         -10       1       0       4
Pat Burrell        L           0      -2                     -10      -4
Shawn Green        R          -2      -2                      -8     -13
Vernon Wells       C          -8       5      10      15       3      13

Here are the net plays calculated by David and Mitchel for S-PMR and S-UZR:

                             2003            2004            2005
                           net plays       net plays       net plays
Outfielder        Pos     PMR     UZR     PMR     UZR     PMR     UZR
Adam Dunn         L                       -28      -7      -9      -2
Andruw Jones      C        35      13       6      -8      25     -11
Bobby Abreu       R         7       8     -27       0      10      -4
Brian Giles       R                       -16      14      -4      27
Carlos Lee        L        -5      13      -4       3      17      -9
Carl Crawford     L        26      12                      60      17
Carlos Beltran    C        18      18                      -1      11
G. Sheffield      R        -3      -5     -29     -19     -10     -22
Ichiro Suzuki     R        31       7      14       0      32      11
Jermaine Dye      R                         1      16       1       2
Jim Edmonds       C                         0     -11      17       8
J. Damon          C        24      19     -33      -8     -10     -11
Jose Cruz         R        16      13       3      -8
J Encarnacion     R         5      21                       9      12
Juan Pierre       C        10       5     -15      -8       2       1
Luis Gonzalez     L         5       8                      27      11
M. Ramirez        L                       -31     -26     -24     -65
Mark Kotsay       C                        -3      -4       5     -37
M. Grissom        C        -5      -3     -34     -10
M. Cameron        C        60      37       1       7
Moises Alou       L         2       2     -15       3
Pat Burrell       L        14      -3                       3      -2
Shawn Green       R       -11     -16                      14       2
Vernon Wells      C         7      11     -12       3     -15       6

Here are the DRA runs saved numbers for each full-time season:

                          2003    2004    2005
Outfielder         Pos     DRA     DRA     DRA
Adam Dunn          L               -17     -29
Andruw Jones       C        28       7     -10
Bobby Abreu        R         4      -5     -10
Brian Giles        R                -1       6
Carlos Lee         L         9       2      10
Carl Crawford      L        18              11
Carlos Beltran     C         5              14
Gary Sheffield     R         4      -7      -4
Ichiro Suzuki      R         9       6      18
Jermaine Dye       R                -6      -4
Jim Edmonds        C                 5      21
Johnny Damon       C         2      14       9
Jose Cruz          R        12     -10
J. Encarnacion     R        -4             -16
Juan Pierre        C        -4     -14     -18
Luis Gonzalez      L        -2              -4
Manny Ramirez      L               -14     -30
Mark Kotsay        C                 9      -6
Marquis Grissom    C         2      -1
Mike Cameron       C        44      13
Moises Alou        L        -7      -9
Pat Burrell        L        -6             -19
Shawn Green        R        -2               0
Vernon Wells       C       -16       0     -16

Defensive Regression Analysis formulas for the outfield

The American League outfielder DRA formulas for 1993-2006 are relatively simple. (Coors has a horrific impact that complicates the National League formulas.) Let’s start with right field. DRA first calculates the runs saved by a team at a position. In right field, this is .59 runs saved/allowed per right field putout by the team above/below what the league-average team would record, given the same context as the team in question, or

					 .59*PO9, where

		PO9	= 	PO9bip   +  .185*GOEbip    –   .012*LBIPbip

It’s not as bad as it looks. The key is that all of the variables on the right are centered by reference to league average, given the “denominator” of relevant opportunities, indicated by the lower-case letters.

PO9bip is the number of right field putouts recorded by the team above or below the league average rate that year, given the number of BIP (BFP – BB – HBP – SO – HR) allowed by the team’s pitchers that year, or

		PO9bip  =  Team PO9  – Team BIP * (League PO9 / League BIP).	
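The centering step can be sketched in a few lines of Python. This is only an illustration: the function names and all of the team and league totals below are made up, not taken from any real season.

```python
def balls_in_play(bfp, bb, hbp, so, hr):
    """BIP = BFP - BB - HBP - SO - HR, as defined in the text."""
    return bfp - bb - hbp - so - hr


def above_league_rate(team_count, team_opps, league_count, league_opps):
    """Team count minus what a league-average team would record
    in the same number of opportunities."""
    return team_count - team_opps * (league_count / league_opps)


# Illustrative, made-up totals (not real team data).
team_bip = balls_in_play(bfp=6200, bb=500, hbp=50, so=1000, hr=150)
po9bip = above_league_rate(
    team_count=330,      # team right field putouts
    team_opps=team_bip,  # team BIP
    league_count=4200,   # league right field putouts
    league_opps=63000,   # league BIP
)
print(team_bip, round(po9bip, 1))
```

With these made-up totals, a league-average team would record about 300 right field putouts in 4,500 BIP, so the team comes out roughly 30 plays above average. The same helper works for any other variable that is centered on a league rate.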

The mysterious GOEbip is team groundouts (basically, infield assists minus double plays and caught stealing, with a minor adjustment for unassisted ground ball putouts at first base; write me if you want that detail) and infielder errors (collectively, “GOE”) above or below the league average rate, given team BIP, or

		GOEbip  =  tmGOE  – tmBIP * (lgGOE / lgBIP).

If groundouts and infield errors are above average given total BIP, then you expect fewer flyballs, and add credit to the outfielders; if groundouts and infield errors are below average given BIP, you can expect more flyballs, and add the negative number, which charges outfielders. So if a team has 100 more GOE than the league average team would have, given total BIP, you add 18.5 plays to the PO9bip total; if a team has 100 fewer GOE than the league average team would have, given total BIP, you add negative, i.e., subtract, 18.5 plays.

LBIPbip is BIP allowed by lefthanded pitchers above or below the league rate, given total BIP. You just plug LBIP where GOE or PO9 are in the above equations. If a team’s lefthanded pitchers allowed 100 more BIP, given total (right- and lefthanded pitcher) BIP, than the league-average team would have, you subtract 1.2 plays (.012 times 100), etc.
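Putting the three terms together, the American League right field team rating might be computed as in the sketch below. The coefficients come from the formula above; the function names and the sample inputs are hypothetical.

```python
def team_po9(po9bip, goebip, lbipbip):
    """Context-adjusted team right field plays above/below average (AL)."""
    return po9bip + 0.185 * goebip - 0.012 * lbipbip


def team_rf_runs(po9bip, goebip, lbipbip):
    """Team runs saved in right field at .59 runs per adjusted play."""
    return 0.59 * team_po9(po9bip, goebip, lbipbip)


# The 100-extra-GOE case from the text: 18.5 plays are credited to the outfield.
credit_from_goe = team_po9(po9bip=0, goebip=100, lbipbip=0)

# Made-up team: 30 putouts above the league rate and 100 extra GOE.
runs = team_rf_runs(po9bip=30, goebip=100, lbipbip=0)
print(round(credit_from_goe, 1), round(runs, 1))
```

Each input is assumed to be already centered on the league rate given BIP, so a league-average team gets zero on every term.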

In summary, the first factor in the formula takes into consideration the total number of BIP, the second the groundball/flyball tendency of the pitchers, the third the amount of lefthanded pitching. The ideas of taking into account total BIP and lefthanded pitchers’ BIP have been widely applied before; the idea of adjusting outfielder flyball opportunities based on infielder ground ball chances, given total BIP, is genuinely new.

Other non-zone systems use flyouts to estimate groundball/flyball tendencies of the team’s pitchers for purposes of adjusting outfielder ratings, but that sort of adjustment is self-referential. The DRA method avoids self-referentiality.

An individual rightfielder’s rating is the team rating, pro-rated for his share of innings, adjusted for the number of putouts he had above or below the team rate:

Player RF runs = .59*PO9*(plyrIP / tmIP) + .59*[plyrPO9 − plyrIP*(tmPO9/ tmIP)].
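As a rough sketch of how the individual formula splits into a team share and a personal share, here is one way to code it. The parameter names and the sample player are made up for illustration.

```python
def player_rf_runs(team_po9_adj, plyr_ip, tm_ip, plyr_po9, tm_po9,
                   runs_per_play=0.59):
    """Player RF runs: his innings share of the team rating, plus credit
    for putouts above/below the team's own rate in his innings.

    team_po9_adj is the context-adjusted team figure (PO9 in the text);
    tm_po9 is the raw team right field putout total.
    """
    team_part = runs_per_play * team_po9_adj * (plyr_ip / tm_ip)
    expected_putouts = plyr_ip * (tm_po9 / tm_ip)
    individual_part = runs_per_play * (plyr_po9 - expected_putouts)
    return team_part + individual_part


# Made-up example: a rightfielder with 1,200 of the team's 1,440 innings.
runs = player_rf_runs(team_po9_adj=20, plyr_ip=1200, tm_ip=1440,
                      plyr_po9=290, tm_po9=330)
print(round(runs, 1))
```

In this made-up case the player would be expected to record 275 putouts at the team rate, so his 290 earn him credit for 15 extra plays on top of his five-sixths share of the team rating. A player who logs every inning simply receives the full team rating, since his putouts equal the team's.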

At the other outfield positions two other factors come into play, the relative number of home runs allowed by the team’s pitchers, and the relative number of estimated flyouts caught by the team’s infielders (infield flyouts, initially estimated as infield putouts minus team assists (both infield and outfield)). Runs saved in left and center are

				.61*PO7   + .58*PO8, where

PO7	= 	PO7bip   +  .118*GOEbip    +   .006*LBIPbip  –  .256*HRbh 

PO8	= 	PO8bip   +  .270*GOEbip   –   .007*LBIPbip  –  .171*HRbh  +  .133*IFObip 

HRbh are home runs allowed by the team’s pitchers above or below the league average rate, given balls hit (AB – SO), or BH. IFObip are estimated infield flyouts (“IFO”) above or below league average given BIP. Sometimes outfielders such as Andruw Jones play shallow and take away discretionary pop-up chances by infielders. If that happens, IFObip are negative and the centerfielder’s rating is reduced. The formula for individual left- and centerfielder ratings works the same as the “Player RF Runs” formula above.
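The American League left and center field team formulas can be sketched the same way. The coefficients are the ones given above; the function names and the sample inputs are hypothetical, and every input is assumed to be already centered on the league rate.

```python
def al_po7(po7bip, goebip, lbipbip, hrbh):
    """Context-adjusted AL team left field plays above/below average."""
    return po7bip + 0.118 * goebip + 0.006 * lbipbip - 0.256 * hrbh


def al_po8(po8bip, goebip, lbipbip, hrbh, ifobip):
    """Context-adjusted AL team center field plays above/below average."""
    return (po8bip + 0.270 * goebip - 0.007 * lbipbip
            - 0.171 * hrbh + 0.133 * ifobip)


def al_lf_cf_runs(po7, po8):
    """Team runs saved in left plus center: .61*PO7 + .58*PO8."""
    return 0.61 * po7 + 0.58 * po8


# Made-up case: a shallow centerfielder takes 20 pop-ups from the infielders,
# so estimated infield flyouts are 20 below the league rate.
po8 = al_po8(po8bip=10, goebip=0, lbipbip=0, hrbh=0, ifobip=-20)
po7 = al_po7(po7bip=0, goebip=0, lbipbip=0, hrbh=10)
print(round(po8, 2), round(po7, 2), round(al_lf_cf_runs(po7, po8), 2))
```

In the made-up case, the 10 center field putouts above the league rate shrink to about 7.3 adjusted plays once the negative IFObip is applied.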

Runs saved in the National League 1993-2006 at left, center and right are:

				.65*PO7  + .58*PO8  + .49*PO9

The National League formulas for 1993-2006 are complicated by Coors Field, for which an adjustment is needed to make the overall system work, for outfielders as well as for infielders and pitchers. (None of the 24 players in the sample played at Coors, so the fact that a Coors park adjustment is made for DRA while none is made for S-UZR or S-PMR doesn’t matter.)

Basically, Coors suppresses strikeouts and increases walks and home runs, and all three factors flow through the fielding equations. Consider these Coors-driven adjustments as a nuisance that fortunately should only apply for 1993-2004 in the National League; Coors recently has played more normally.

First one has to adjust the relative SO, BB and HR allowed by the team’s pitchers for the Coors effect:

SO = SObfp + 1.8215*CGip , BB = BBbfp – .6961*CGip , HR = HRbh – .4322*CGip.

SObfp = tmSO − tmBFP * ( lgSO / lgBFP )

BBbfp = tm(BB+HBP) – tmBFP * ( lg(BB + HBP) / lgBFP )

HRbh = tmHR – tmBH * ( lgHR / lgBH )

CGip = tmCoors Games – lgCoors Games*( tmIP / lgIP ).

Intentional walks (IBB) are excluded from BB as well as from BFP (and from BH and BIP).
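Here is a minimal sketch of the Coors adjustment, using the coefficients above. The function names are hypothetical, and the sample numbers are made up; in particular, how Coors games are counted for the league total is an assumption of this example, not something specified here.

```python
def coors_games_above_rate(tm_coors_games, lg_coors_games, tm_ip, lg_ip):
    """CGip: team Coors games above/below the league rate, given innings."""
    return tm_coors_games - lg_coors_games * (tm_ip / lg_ip)


def coors_adjusted(sobfp, bbbfp, hrbh, cgip):
    """Return Coors-adjusted relative (SO, BB, HR) for a team."""
    so = sobfp + 1.8215 * cgip
    bb = bbbfp - 0.6961 * cgip
    hr = hrbh - 0.4322 * cgip
    return so, bb, hr


# Made-up example: a team with 81 Coors games in a 16-team league.
# ASSUMPTION: the league total counts both teams in each Coors game (162).
cgip = coors_games_above_rate(tm_coors_games=81, lg_coors_games=162,
                              tm_ip=1440, lg_ip=23040)
so, bb, hr = coors_adjusted(sobfp=-100, bbbfp=40, hrbh=30, cgip=cgip)
print(cgip, round(so, 1), round(bb, 1), round(hr, 1))
```

A team with exactly the league-rate share of Coors games has CGip of zero, and its SO, BB and HR pass through unchanged.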

So SO, BB and HR, written without the lower-case suffix, represent relative strikeouts, walks and home runs allowed, taking into consideration batters faced and Coors. These make their appearance in the National League outfielder formulas, as well as the Coors factor itself:

PO7 = PO7bip + .1423*GOEbip + .0138*LBIPbip + .1194*SO + .1033*BB + .2220*HR.

PO8 = PO8bip + .2510*GOEbip – .0100*LBIPbip + .1846*IFObip + .2904*CGip + .1879*SO + .2819*BB.

PO9 = PO9bip + .2580*GOEbip – .0114*LBIPbip + .2439*IFObip + .2823*CGip + .1488*SO + .2222*BB + .2220*HR.
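Since the three National League formulas differ only in which terms appear and with what weight, one way to code them is as a table of coefficients. The sketch below copies the coefficients from the formulas above; the dictionary layout, the function names and the sample inputs are assumptions of this example.

```python
# Coefficients as printed above; a term absent from a formula is simply omitted.
NL_COEFFS = {
    "PO7": {"GOEbip": 0.1423, "LBIPbip": 0.0138,
            "SO": 0.1194, "BB": 0.1033, "HR": 0.2220},
    "PO8": {"GOEbip": 0.2510, "LBIPbip": -0.0100, "IFObip": 0.1846,
            "CGip": 0.2904, "SO": 0.1879, "BB": 0.2819},
    "PO9": {"GOEbip": 0.2580, "LBIPbip": -0.0114, "IFObip": 0.2439,
            "CGip": 0.2823, "SO": 0.1488, "BB": 0.2222, "HR": 0.2220},
}
NL_RUNS_PER_PLAY = {"PO7": 0.65, "PO8": 0.58, "PO9": 0.49}


def nl_adjusted_plays(position, pobip, context):
    """Putouts above the league rate plus the context terms for a position.

    context maps variable names (e.g. "GOEbip", "SO") to team values;
    variables missing from context are treated as league average (zero).
    """
    coeffs = NL_COEFFS[position]
    return pobip + sum(c * context.get(name, 0.0) for name, c in coeffs.items())


def nl_team_runs(position, pobip, context):
    """Team runs saved at a position."""
    return NL_RUNS_PER_PLAY[position] * nl_adjusted_plays(position, pobip, context)


# Made-up example: center field, 10 putouts above the rate, 100 extra GOE.
plays = nl_adjusted_plays("PO8", 10, {"GOEbip": 100})
print(round(plays, 1), round(nl_team_runs("PO8", 10, {"GOEbip": 100}), 1))
```

SO, BB and HR in the context are meant to be the Coors-adjusted values, and the individual player rating then follows the same pro-rating by innings used in the American League.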

Individual player ratings are calculated in the manner shown above for the American League in right field.

That’s basically it. Intrepid readers wishing to replicate the DRA ratings here should feel free to e-mail me for tips on data collection and the intricacies of adjusting GOE and IFO for unassisted groundout putouts at first, though this detail probably has little effect on outfielder ratings.

