Circle the Wagons: Running the Bases Part II

“No game in the world is as tidy and dramatically neat as baseball, with cause and effect, crime and punishment, motive and result, so cleanly defined.”
—Paul Gallico

In last week’s article I introduced a framework for evaluating baserunning in an attempt to determine the magnitude of the impact both good and bad baserunners have on their teams. In that article I used play-by-play data from Retrosheet for the five-year period from 2000 to 2004 to analyze three situations that I felt would likely reveal baserunning skill. These were:

  • Runner on first, second not occupied and the batter singles
  • Runner on first, second not occupied and the batter doubles
  • Runner on second, third not occupied and the batter singles

In these situations I took into account both the number of outs and the fielder to which the ball was hit. In crunching the numbers I then introduced the concept of Expected Bases (the number of bases a runner was expected to gain given his opportunities), Incremental Bases (the difference between the Expected Bases and the actual number of bases gained), and Incremental Base Percentage (the ratio of total bases gained to expected bases). What I found was that there is about a 30-base span between the best and worst baserunners and that the IBP typically ranges from 1.15 to .85. A couple of minor corrections to the first article can be found on my blog.
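As a quick illustration of the two definitions above, here is a minimal Python sketch. The function names are my own for this example, and the inputs are Damian Jackson's five-year totals from the player table later in this article (184 bases gained against 160 expected). Computing Expected Bases itself requires the play-by-play data and is not attempted here.

```python
def incremental_bases(bases, expected_bases):
    """Bases gained beyond (positive) or short of (negative) expectation."""
    return bases - expected_bases


def incremental_base_pct(bases, expected_bases):
    """Ratio of bases actually gained to bases expected."""
    return bases / expected_bases


# Damian Jackson, 2000-2004: Bases = 184, EB = 160 (from the player table).
ib = incremental_bases(184, 160)
ibp = incremental_base_pct(184, 160)
print(f"IB = {ib}, IBP = {ibp:.2f}")
```

An IBP above 1.00 means the runner took more bases than an average runner would have in the same opportunities; below 1.00 means fewer.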

    Refinements

    As many readers immediately perceived, these raw numbers are not the end of the story. A number of readers e-mailed suggestions for refining the framework, which included factoring in singles stretched into doubles and doubles stretched into triples (the effect of third base coaches), trying to factor out the effects of the hit-and-run, and including the scenario where second is occupied with a runner on first when a batter singles. While these are all excellent ideas and would make the framework more accurate, they all have problems.

    Unfortunately, the play-by-play data from Retrosheet provides no ability to determine when a Juan Pierre stretches a hit rather than simply coasting into second or third. Nor does it allow for determining when a hit-and-run was on, something that would certainly affect the Cardinals’ statistics given Tony La Russa’s penchant for putting on the play.

    And while as a Cubs fan I’m well aware of the cost of a third-base coach, having endured “Waving” Wendell Kim in 2003 routinely sending runners to their doom, I can’t think of a good way to isolate the effect of third-base coaches. Information about who was coaching third in a given game is nonexistent, and coaches probably don’t move around enough to be able to generate comparisons of their effect on teams. In addition, an aggressive third-base coach will not only cause more runners to be thrown out but will also cause more bases to be gained, and the two effects may well cancel each other out. Finally, I did consider including the situation where second base was occupied but dropped the idea since I was concerned that the defense might make a play on the lead runner, allowing the runner on first to take third for “free.”

    But while I chose not to pursue these refinements, there are a couple that are both doable and have an impact, as James Click also discovered in his baserunning analysis published in the 2005 Baseball Prospectus. To introduce the first, take a look at the team leaders in IBP for each of the five years:

    Year Team         Opp   Bases      EB      IB     IBP      OA
    2000 COL          445     730     678      52    1.08       7
    2001 COL          379     628     582      46    1.08       8
    2002 COL          350     541     517      24    1.05      10
    2003 COL          413     661     629      32    1.05       7
    2004 COL          445     734     692      42    1.06      10
    

    In addition to the Rockies, Texas and St. Louis each appear in the top five three times, with the White Sox and Twins twice each, meaning that 15 of the 25 spots are occupied by the same five teams. On the other end of the spectrum, the Red Sox, Dodgers, Brewers, Phillies, Blue Jays, Giants and Astros all appear in the bottom five more than once. Enough said. What is going on is that parks play a role in determining how often runners advance and how many bases they gain when doing so. And that difference is largely based on the field to which the ball is hit and the surface of the park. For example, in looking at the scenario where there is a runner on first base and the batter singles, the following tables show the high and low totals for the five years when the ball is fielded by each of the three outfield positions.

    Single to Left      Opp   To3rd  Scores  OA      Pct
    High Fenway Park    369      72     2     5     .201
    Low Sky Dome        240      24     1     1     .104
    
    Single to Center    Opp   To3rd  Scores  OA      Pct
    High Coors Field    357     145     4     3     .417
    Low Jacobs Field    268      54     2     1     .209
    
    Single to Right     Opp   To3rd  Scores  OA      Pct
    High Wrigley Field  347     171     3     4     .501 
    Low Yankee Stadium  364     126     4     5     .357
    

    Here you can see that Fenway’s Green Monster has the effect of doubling the chances that a runner would move from first to third or score on a single to left, as opposed to the SkyDome (Rogers Centre), where the AstroTurf (replaced with FieldTurf this season) and smaller dimensions ostensibly hold runners in check. The spacious center-field area at Coors Field allows runners to advance or score at twice the rate of Jacobs Field, while the right-field well 353 feet from the plate at Wrigley allows runners to advance to third or score 50% of the time. In Yankee Stadium, at 318 feet in right field, runners advance to third or score just 36% of the time.
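The Pct column in the tables above is not defined explicitly, but the published figures are consistent with (To3rd + Scores) divided by Opp. The sketch below, with a helper name of my own choosing, recomputes the column under that assumption.

```python
def advance_rate(opp, to_third, scores):
    """Share of opportunities where the runner reached third or scored.

    Assumed formula: the article does not state it, but it reproduces
    the published Pct values.
    """
    return (to_third + scores) / opp


# (label, Opp, To3rd, Scores) copied from the three tables above.
rows = [
    ("Single to left, Fenway Park", 369, 72, 2),
    ("Single to left, SkyDome", 240, 24, 1),
    ("Single to center, Coors Field", 357, 145, 4),
    ("Single to center, Jacobs Field", 268, 54, 2),
    ("Single to right, Wrigley Field", 347, 171, 3),
    ("Single to right, Yankee Stadium", 364, 126, 4),
]

for label, opp, to_third, scores in rows:
    print(f"{label:<34}{advance_rate(opp, to_third, scores):.3f}")
```

Runners thrown out (OA) are left out of the numerator here, which is what the published percentages imply.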

    In order to see how these effects played out overall, I calculated a park factor (IBP/PF) for each park for every season. My approach to doing so was similar to how Batter and Pitcher Park Factors (BPF, PPF) are calculated. First I calculated the aggregate IBP in all games played at each park and then did the same for all road games for the team that played in that park. So for example, the Rockies and their opponents recorded an IBP of 1.03 in Coors Field in 2003 and an IBP of .99 on the road. I then divided the home IBP by the road IBP, in this case 1.03/.99 = 1.04, and took half the difference since the team only plays half of its games at home. For Coors Field in 2003 the IBP/PF was calculated at 1.02. For the five seasons the Coors Field factors were:

    2000  1.03
    2001  1.03
    2002  1.02
    2003  1.02
    2004  1.01
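The park factor arithmetic described above can be sketched in a few lines of Python. The function name is hypothetical, and I am reading "took half the difference" as halving the ratio's distance from 1.0, which reproduces the 2003 Coors Field figure.

```python
def baserunning_park_factor(home_ibp, road_ibp):
    """IBP/PF from aggregate home and road IBP (team plus opponents).

    The home/road ratio is pulled halfway back toward 1.0 because a
    team plays only half of its games in its home park.
    """
    ratio = home_ibp / road_ibp
    return 1 + (ratio - 1) / 2


# Coors Field, 2003: IBP of 1.03 at home and .99 on the road.
ratio = 1.03 / 0.99
pf = baserunning_park_factor(1.03, 0.99)
print(f"ratio = {ratio:.2f}, IBP/PF = {pf:.2f}")
```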
    

    These are pretty consistent and show that Coors generally provides a 2-3% advantage to runners in gaining incremental bases. One might wonder if the quality of the home team’s outfield defense might skew these numbers, but since the same defense plays both at home and on the road the effect, if any, should be cancelled out.

    Once the park factors were calculated I took a look at their variability. The lowest IBP/PF was recorded in San Diego (Qualcomm Stadium) in 2002 at .96 and the highest was in Texas in 2003 at 1.04, so generally you can see that these park factors don’t vary as much as BPF and PPF, which can range from .90 to 1.10 or higher in the case of Coors Field. And even though some parks such as Coors Field and The Ballpark at Arlington show consistency, there is also a good deal of variability in the season-to-season factors for a particular park. Running a simple regression (and excluding teams that moved parks such as the Padres, Reds, Brewers, Pirates and Phillies) showed that the correlation coefficients for the four pairs of consecutive seasons were positive three times but reached .3 only once, which is pretty weak. Clearly single-year park factors here should be taken with a grain of salt. As a result, I averaged the park factor for each park across the years it was in use, with the results below:

    Park                                 Team      IBP/PF
    Coors Field                          COL         1.02
    Ballpark at Arlington (Ameriquest)   TEX         1.02
    Bank One Ballpark                    ARI         1.02
    Royals Stadium                       KCA         1.01
    Stade Olympique                      MON         1.01
    Comerica Park                        DET         1.01
    Miller Park                          MIL         1.01
    Comiskey Park II                     CHA         1.01
    Enron Field                          HOU         1.01
    Wrigley Field                        CHN         1.01
    Shea Stadium                         NYN         1.01
    SBC Park                             SFN         1.01
    Network Associates Coliseum          OAK         1.01
    Tropicana Field                      TBA         1.00
    Citizens Bank Park                   PHI         1.00
    Turner Field                         ATL         1.00
    Fenway Park II                       BOS         1.00
    Hubert H Humphrey Metrodome          MIN         1.00
    Minute Maid Park                     HOU         1.00
    Skydome (Rogers Centre)              TOR         1.00
    Safeco Field                         SEA         1.00
    Stade Olympique,Hiram Bithorn        MON         1.00
    Dodger Stadium                       LAN         1.00
    PNC Park                             PIT         0.99
    Edison International Field           ANA         0.99
    Pro Player Stadium                   FLO         0.99
    Jacobs Field                         CLE         0.99
    PacBell Park                         SFN         0.99
    Oriole Park at Camden Yards          BAL         0.99
    Great American Ball Park             CIN         0.99
    U.S. Cellular Field                  CHA         0.99
    Yankee Stadium II                    NYA         0.99
    Oakland Coliseum                     OAK         0.99
    Petco Park                           SDN         0.99
    Busch Stadium II                     SLN         0.99
    Veterans Stadium                     PHI         0.99
    Qualcomm Stadium                     SDN         0.99
    Cinergy Field                        CIN         0.98
    County Stadium                       MIL         0.97
    Three Rivers Stadium                 PIT         0.97
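The averaging step can be checked against the Coors Field seasonal factors listed earlier. This is only a sketch: it averages the rounded two-decimal values printed in the article, whereas the table above was presumably built from unrounded factors.

```python
from statistics import mean

# Seasonal IBP/PF values for Coors Field as published earlier in the article.
coors_by_season = {2000: 1.03, 2001: 1.03, 2002: 1.02, 2003: 1.02, 2004: 1.01}

# Simple unweighted average across the seasons the park was in use.
coors_avg = mean(coors_by_season.values())
print(f"Coors Field multi-year IBP/PF = {coors_avg:.2f}")
```

A park used for fewer seasons in the period (Petco Park, for example) would simply have fewer entries in its dictionary.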
    

    This further shrinks the variability to where there is only a 5% swing between the best and worst parks. Because of the small spread, the argument can be made that characteristics that allow runners to advance more frequently in a ballpark are cancelled by other characteristics that don’t, resulting in most parks drifting back toward the middle. If that’s the case, what we really need to do is calculate park factors for each scenario or at least each field to which the ball is hit. That makes sense, but it will have to wait until another day.

    Using these park factors I then went back and recalculated the IBP for each player in each season and then re-ranked the players for the last five years. The top 15 (actually 17 with ties) are listed below, where Bases+ is the number of bases adjusted for park factor and IBP+ is the adjusted IBP.

    Name                   Opp     Bases   EB      IB      Bases+  IBP     IBP+
    Damian Jackson         110     184     160      24     184    1.15    1.15
    Jack Wilson            117     203     179      24     204    1.14    1.14
    Mike Cameron           168     286     252      34     287    1.14    1.14
    Raul Mondesi           120     201     179      22     203    1.12    1.13
    Miguel Cairo           105     173     153      20     173    1.13    1.13
    Chris Singleton        112     192     171      21     192    1.12    1.12
    Brian Jordan           114     196     175      21     196    1.12    1.12
    Juan Pierre            259     411     367      44     409    1.12    1.11
    Vernon Wells           111     186     167      19     186    1.11    1.11
    David Eckstein         216     352     319      33     355    1.10    1.11
    Barry Larkin           140     237     216      21     240    1.10    1.11
    Alfonso Soriano        136     216     196      20     217    1.10    1.11
    Jimmy Rollins          180     297     269      28     299    1.10    1.11
    Torii Hunter           170     279     252      27     279    1.11    1.11
    Cristian Guzman        209     349     315      34     349    1.11    1.11
    Ray Durham             203     334     301      33     333    1.11    1.11
    Luis Castillo          272     450     410      40     454    1.10    1.11
    

    As you can see, the IBP+ values are very similar to the unadjusted totals. (You’ll also note that Damian Jackson, who was inadvertently excluded from the list in my first article, now takes the top spot from Jack Wilson by percentage points.) Those players who were helped most by their park were:

    Name                   Opp     Bases   EB      IB      Bases+  IBP     IBP+
    Michael Young          148     238     223      15     232    1.07    1.04
    Hank Blalock           101     136     152     -16     132    0.90    0.87
    Larry Walker           173     288     263      25     282    1.10    1.07
    Todd Helton            230     384     361      23     376    1.06    1.04
    Jeff Cirillo           147     238     223      15     234    1.07    1.05
    

    As you might have expected, we have two Rangers and three Rockies, all of whom gained from four to eight bases because of their parks. In the case of Larry Walker it drops him from 24th to 39th on the list of 210 players with 100 or more opportunities over the given years. You can find a complete list of the 210 players on my blog. The players who were most hurt by their parks include:

    Name                   Opp     Bases   EB      IB      Bases+  IBP     IBP+
    Hideki Matsui          107     173     168       5     175    1.03    1.04
    Ryan Klesko            187     292     282      10     296    1.04    1.05
    Marquis Grissom        149     239     225      14     243    1.06    1.08
    Aaron Boone            110     169     161       8     172    1.05    1.07
    Brian Giles            199     321     302      19     327    1.06    1.08
    

    Once again, it’s not surprising that two Padres are on the list, given that both Qualcomm and PETCO had IBP/PFs under 1.0. Here the magnitude of the effect is only around four bases over the five seasons.

    When applied at the team level, the leaders and trailers for the past five seasons are:

    Year    Team         Opp   Bases      EB      IB     IBP      OA  IBP/PF    IBP+
    Leaders
    2000    MIL          329     525     510      15    1.03       7    0.97    1.06
    2001    SLN          329     511     490      21    1.04       9    0.98    1.07
    2002    MIN          366     559     551       8    1.01       8    0.97    1.05
    2003    OAK          401     641     617      24    1.04      11    0.98    1.06
    2004    SLN          447     711     677      34    1.05       6    0.99    1.06
    
    Trailers
    2000    CHN          346     502     531     -29    0.95      12    1.00    0.94
    2001    MIL          311     436     469     -33    0.93      13    1.03    0.90
    2002    ARI          340     489     504     -15    0.97      11    1.04    0.94
    2003    MIL          383     555     578     -23    0.96      13    1.03    0.93
    2004    BOS          484     713     756     -43    0.94      12    0.99    0.95
    

    It is interesting that the 2000 Brewers led the league while the 2001 edition was in the cellar. A quick look reveals that in 2000 Ron Belliard and Marquis Grissom accounted for 20 incremental bases all by themselves in 91 opportunities, with James Mouton adding over five more in just 10 opportunities. In 2001, with Grissom gone, his replacement Devon White added just over three incremental bases, while Belliard accounted for just two and Jose Hernandez and Jeromy Burnitz were dinged for a combined -14, driving the team to the bottom.
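The article does not spell out how IBP+ is derived, but the team table above is consistent with dividing the raw IBP (Bases / EB) by the park factor. The sketch below tests that assumption against all ten published rows; the function name is my own.

```python
def park_adjusted_ibp(bases, expected_bases, park_factor):
    """Assumed adjustment: raw IBP divided by the park factor."""
    return bases / expected_bases / park_factor


# (year, team, Bases, EB, IBP/PF, published IBP+) from the team table.
teams = [
    (2000, "MIL", 525, 510, 0.97, 1.06),
    (2001, "SLN", 511, 490, 0.98, 1.07),
    (2002, "MIN", 559, 551, 0.97, 1.05),
    (2003, "OAK", 641, 617, 0.98, 1.06),
    (2004, "SLN", 711, 677, 0.99, 1.06),
    (2000, "CHN", 502, 531, 1.00, 0.94),
    (2001, "MIL", 436, 469, 1.03, 0.90),
    (2002, "ARI", 489, 504, 1.04, 0.94),
    (2003, "MIL", 555, 578, 1.03, 0.93),
    (2004, "BOS", 713, 756, 0.99, 0.95),
]

matches = 0
max_gap = 0.0
for year, team, bases, eb, pf, published in teams:
    estimate = park_adjusted_ibp(bases, eb, pf)
    matches += round(estimate, 2) == published
    max_gap = max(max_gap, abs(estimate - published))
    print(f"{year} {team}  estimate {estimate:.3f}  published {published:.2f}")

print(f"{matches} of {len(teams)} rows match to two decimals")
```

Seven of the ten rows match exactly and the other three are off by .01, which is what you would expect if the published park factors were rounded before being printed.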

    On Deck

    The second refinement is to take my measures of Incremental Bases (IB) and Incremental Base Percentage (IBP) and convert these into runs gained or lost. That will be the subject of next week’s article.

