Egalitarianism and the RBI
by Dan Fox
October 03, 2005
"It is better to be prepared for an opportunity and not have one, than to have an opportunity and not be prepared."
- Whitney Young, Jr.
As I was watching the Cubs sputter towards their .500 finish last week, for reasons too many and too disturbing to discuss, WGN-TV's Len Kasper noted that STATS, Inc. keeps track of which players bat with the most runners on base during the season. He seemed almost surprised to note that those players with the most opportunities tended to amass lots of RBIs and therefore win the RBI crown. In other words, driving in runners is related more to opportunity than to skill.
The RBI is therefore among the most non-egalitarian of the counting statistics, and it is for this reason that those interested in performance analysis have traditionally eschewed it as a means of measuring a player's value.
Be that as it may, because of his tone, and because RBIs always play a big role in MVP voting and we're coming to that time of year, I thought it would be educational this week to review just how much more often some hitters get to hit in RBI situations, and to perform a simple test of whether RBIs are purely a result of opportunity.
Setting a Baseline
At first thought, where a hitter hits in the lineup should have a large impact on how many RBI opportunities, and therefore how many RBIs, a player gets. To see what's typical I broke the 2004 season down by lineup position and calculated the number of plate appearances at each position, along with the number of times there were runners on, runners in scoring position, and runners on each base, with each expressed as a percentage of plate appearances. The results can be summarized in the following table and graph.
POS     PA  Rrs/PA  RISP/PA  1st/PA  2nd/PA  3rd/PA  RBI/PA
1    20741   0.339    0.213   0.215   0.170   0.085   0.087
2    20091   0.433    0.240   0.290   0.186   0.093   0.102
3    19321   0.478    0.279   0.317   0.207   0.112   0.141
4    18653   0.504    0.304   0.341   0.230   0.121   0.151
5    18519   0.468    0.290   0.340   0.225   0.121   0.139
6    18273   0.463    0.272   0.336   0.214   0.115   0.126
7    17819   0.468    0.281   0.328   0.221   0.114   0.118
8    17299   0.463    0.278   0.326   0.216   0.116   0.112
9    16637   0.462    0.276   0.329   0.214   0.116   0.084
As you can see from the graph, almost all of the curves follow the same general slope. The leadoff hitter has the lowest percentage (both because he leads off games and because hitters lower in the order don't get on as often), with a quick uptick that continues through the cleanup hitter before trailing off more slowly through the rest of the order. That slower descent can be attributed to the fact that 3rd through 5th place hitters often have the highest OBP as a group and therefore are on base for the remainder of the order.
Interestingly, the curves that buck that trend are for runners on first or third, where the 9th place hitter actually comes to the plate slightly more often in that scenario than 6th through 8th place hitters. It's also interesting that the 7th hole bats more often with a runner in scoring position than does the 6th hitter, but less often with a runner on third than the 6th, 8th, and 9th place hitters.
Also, you'll note that the spread of the differences at the various lineup positions differs as well. The largest standard deviation is for the percentage of time a lineup position hits with one or more runners on base (.044), because the 1st and 2nd hitters are so far below the rest, followed by the standard deviation in the opportunities to hit with a runner on first (.038), runners in scoring position (.026), a runner on second (.018), and finally a runner on third (.012). You could expect the spreads to increase if we were looking only at the National League, where the impact of the pitcher hitting 9th would mean fewer opportunities for the top of the order.
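For readers who want to check the spreads, here is a minimal sketch that recomputes them from the rounded rates in the table. It assumes the population standard deviation taken across the nine lineup positions; the method behind the figures is not stated above, so treat that choice as an assumption.

```python
from statistics import pstdev

# Rates by lineup position (1st through 9th), copied from the 2004 table.
rates = {
    "runners on":    [0.339, 0.433, 0.478, 0.504, 0.468, 0.463, 0.468, 0.463, 0.462],
    "runner on 1st": [0.215, 0.290, 0.317, 0.341, 0.340, 0.336, 0.328, 0.326, 0.329],
    "RISP":          [0.213, 0.240, 0.279, 0.304, 0.290, 0.272, 0.281, 0.278, 0.276],
    "runner on 2nd": [0.170, 0.186, 0.207, 0.230, 0.225, 0.214, 0.221, 0.216, 0.214],
    "runner on 3rd": [0.085, 0.093, 0.112, 0.121, 0.121, 0.115, 0.114, 0.116, 0.116],
}

# Assumption: population (not sample) standard deviation over the nine positions.
spreads = {name: pstdev(values) for name, values in rates.items()}

# Print the categories from widest spread to narrowest.
for name, sd in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:14s} {sd:.3f}")
```

Under that assumption the rounded results line up with the figures quoted in the text; a sample standard deviation (dividing by eight rather than nine) would come out slightly larger.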
I, for one, was a bit surprised that batters hit with runners on 45.3% of the time and hit with runners in scoring position fully 27% of the time. This accords with research Tom Ruane did some time ago on situational hitting.
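As a quick check on those two league-wide percentages, the sketch below averages the nine lineup-position rates from the table in two ways. Both helper functions are illustrative names of my own; which average the figures in the text are based on is an assumption, though the simple average of the rounded rates does reproduce them.

```python
# Plate appearances and rates by lineup position (1st through 9th), from the 2004 table.
pa         = [20741, 20091, 19321, 18653, 18519, 18273, 17819, 17299, 16637]
runners_on = [0.339, 0.433, 0.478, 0.504, 0.468, 0.463, 0.468, 0.463, 0.462]
risp       = [0.213, 0.240, 0.279, 0.304, 0.290, 0.272, 0.281, 0.278, 0.276]


def simple_mean(rates):
    """Unweighted average: every lineup position counts equally."""
    return sum(rates) / len(rates)


def weighted_mean(rates, weights):
    """PA-weighted average: positions with more plate appearances count more."""
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


for label, rates in (("runners on", runners_on), ("RISP", risp)):
    print(f"{label:10s} simple {simple_mean(rates):.3f}  "
          f"PA-weighted {weighted_mean(rates, pa):.3f}")
```

Because the leadoff spot has both the most plate appearances and the lowest rates, the PA-weighted averages come out a little lower than the simple ones.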
The Players
So let's take a look at which players, at least those who garnered 150 plate appearances or more, hit with the most runners on in 2004 and how often they did so.
First, here are the leaders in opportunities hitting with men on base.
                  PA  Empty  Runners On  Rrs/PA  RBI
Miguel Tejada    725    338         387   0.534  150
Brian Giles      711    333         378   0.532   94
Vinny Castilla   648    271         377   0.582  131
Randy Winn       703    338         365   0.519   81
Miguel Cabrera   685    331         354   0.517  112
So Miguel Tejada led the majors both in RBIs and in the number of times hitting with runners on base; at least in 2004, opportunity equaled result.
This season Miguel Cabrera leads in at-bats (not plate appearances) with runners on base with 325 to go with his 112 RBIs, while Hideki Matsui is a close second with 314 at-bats and 112 RBIs. Meanwhile, the Major League leader in RBIs with 143, David Ortiz, is 15th on the list with 279 at-bats.
Cabrera, of course, has the advantage of having Juan Pierre (.374 OBP in 2004) and Luis Castillo (.373 OBP in 2004) at the top of the order.
And here are those who led in percentage of times hitting with men on base in 2004.
                  PA  Empty  Runners On  Rrs/PA  RBI
Vinny Castilla   648    271         377   0.582  131
Chipper Jones    567    248         319   0.563   96
Ramon Hernandez  432    195         237   0.549   63
Jeff Kent        606    274         332   0.548  107
Phil Nevin       623    284         339   0.544  105
As you can see, a Rockies player (Vinny Castilla) is prominent in both lists due to Todd Helton (.469 OBP) hitting in front of him and Coors Field in general. The inclusion of Brian Giles, Ramon Hernandez, and Phil Nevin can likely be attributed to the high OBPs of Nevin (.368), Mark Loretta (.391), Ryan Klesko (.399), Giles (.374), and even Sean Burroughs (.348) and Khalil Greene (.349). And of course Winn makes the list since he hit second in 542 plate appearances behind Ichiro Suzuki and his 262 hits.
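The two leaderboards differ only in the sort key: raw count of plate appearances with runners on versus that count divided by total plate appearances. A minimal sketch, using only the nine players who appear in the two tables (so it illustrates the ranking logic rather than reproducing the full 2004 data set):

```python
# (player, plate appearances, PA with runners on), copied from the two 2004 tables.
players = [
    ("Miguel Tejada", 725, 387),
    ("Brian Giles", 711, 378),
    ("Vinny Castilla", 648, 377),
    ("Randy Winn", 703, 365),
    ("Miguel Cabrera", 685, 354),
    ("Chipper Jones", 567, 319),
    ("Ramon Hernandez", 432, 237),
    ("Jeff Kent", 606, 332),
    ("Phil Nevin", 623, 339),
]

# Leaders by total opportunities (raw count of PA with runners on).
by_count = sorted(players, key=lambda p: p[2], reverse=True)

# Leaders by share of plate appearances that came with runners on.
by_rate = sorted(players, key=lambda p: p[2] / p[1], reverse=True)

print("By count:", [name for name, _, _ in by_count[:5]])
print("By rate: ", [name for name, _, _ in by_rate[:5]])
```

Sorting by count rewards everyday players who pile up plate appearances, while sorting by rate lets a part-time catcher like Hernandez onto the list.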
The leaders generally come to the plate with runners on over 55% of the time, about 5 percentage points more than the average for cleanup hitters. And as you would imagine, each of these players, with one notable exception, primarily hit cleanup.
          PA 4th  PA 5th
Castilla     572      39
Kent         302     291
Jones        485      73
Nevin        606      13
The exception is Ramon Hernandez, who hit 3rd (1 PA), 6th (40 PA), 7th (276 PA), 8th (112 PA), and 9th (3 PA) in 2004. Getting that many opportunities when hitting that low in the order seems pretty remarkable, especially since in the 7th hole he was typically preceded by Jay Payton or Terrence Long. Klesko or Giles, however, often filled the 5th hole.
Next let's look at plate appearances with runners in scoring position in both total opportunities and by percentage for 2004.
By total opportunities:
                  PA  RISP  RISP/PA  RBI
Miguel Tejada    725   244    0.337  150
Miguel Cabrera   685   238    0.347  112
Vinny Castilla   648   234    0.361  131
Gary Sheffield   684   215    0.314  121
Lance Berkman    687   213    0.310  106
By percentage:
                  PA  RISP  RISP/PA  RBI
Vinny Castilla   648   234    0.361  131
Jacob Cruz       167    60    0.359   28
Grady Sizemore   159    56    0.352   24
Chase Utley      287   101    0.352   57
Miguel Cabrera   685   238    0.347  112
Once again Tejada batted the most often with runners in scoring position, with the leaders hitting in that situation around 35% of the time, about 5 percentage points above the average for a cleanup hitter.
This season Alex Rodriguez leads in at-bats with runners in scoring position with 184 to go with his 127 RBIs, while Andruw Jones is tops in the NL with 182 at-bats. Jones is leading the league with 128 RBIs despite hitting just .209 in those situations, with just 9 of his major league-leading 51 home runs coming with runners in scoring position (and just one grand slam). Somehow he's managed to drive in 72 runs with just 38 hits. By comparison, Derrek Lee has 41 hits and 11 home runs in 124 at-bats with runners in scoring position, but has driven in just 63 runs.
And, not surprisingly, leadoff hitters like Ichiro Suzuki at 20.3% and Juan Pierre at 19% are near the bottom.
The inclusion of Utley, Sizemore, and Cruz on this list is particularly interesting and probably attributable more to good fortune than to good surrounding hitters, since they had fewer plate appearances and batted all over the lineup last year. Their plate appearances by lineup position in 2004 were:
POS  Utley  Sizemore  Cruz
1       10        17     4
2       47         0     4
3       20         2     3
4        0         0     5
5       50         0    13
6       65         0    31
7       31        13    48
8       33        51     6
9       31        79    53
Finally, for comparison, let's take a quick look at the leaders in number of total bases gained when runners were in scoring position in 2004.
                  PA  RBI  RISP AB  RISP TB  RISP SLUG
Miguel Tejada    725  150      208      116      0.558
Scott Rolen      593  124      151      112      0.742
Manny Ramirez    663  130      156      102      0.654
Mark Teixeira    625  112      137      100      0.730
Vinny Castilla   648  131      203       97      0.478
It's a clean sweep for Tejada and, notably, Scott Rolen, who was among the NL leaders in RBIs, is also included. The difference in this last category, of course, is that the first two record purely opportunity, while this one also records how a player took advantage of that opportunity.
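The slugging column in that table is simply total bases divided by at-bats, restricted to at-bats with runners in scoring position. A small sketch that recomputes it from the AB and TB columns (the dictionary layout is just for illustration):

```python
# (at-bats with RISP, total bases with RISP), copied from the 2004 table.
risp_lines = {
    "Miguel Tejada": (208, 116),
    "Scott Rolen": (151, 112),
    "Manny Ramirez": (156, 102),
    "Mark Teixeira": (137, 100),
    "Vinny Castilla": (203, 97),
}

# Slugging percentage with RISP = total bases / at-bats in those situations.
risp_slg = {name: tb / ab for name, (ab, tb) in risp_lines.items()}

for name, slg in sorted(risp_slg.items(), key=lambda kv: kv[1], reverse=True):
    ab, tb = risp_lines[name]
    print(f"{name:15s} {tb:3d} TB in {ab:3d} AB -> {slg:.3f}")
```

Ranking by the rate rather than the raw total reorders the list: Tejada and Castilla piled up their total bases over roughly 50 more at-bats than the other three.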
This season both Mark Teixeira and Manny Ramirez have 114 total bases with runners in scoring position to go with 139 and 136 RBIs respectively, despite neither being in the top 20 for at-bats.
Wrapping it Up
So what do numbers like these tell us about the egalitarian nature of the RBI? Is all that's required the opportunity?
First, it should be noted that the curve shown above for plate appearances with runners in scoring position indicates that the 5th through 9th place hitters come to the plate only about 1 to 3 percentage points less often with runners in scoring position than the cleanup hitter. That equates to 20 or so plate appearances per team per year. To me that indicates that while it's unlikely for a leadoff or second place hitter to lead the league in RBIs, hitters other than the 3rd and 4th place hitters at least have a fighting chance.
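The arithmetic behind that rough per-team figure can be sketched as follows. It assumes 30 major league teams and spreads each lineup position's 2004 plate appearances evenly across them; both are simplifications of mine rather than anything taken from the underlying data.

```python
TEAMS = 30  # assumption: 30 major league teams in 2004

# 2004 totals by lineup position, copied from the baseline table: (PA, RISP/PA).
by_position = {
    4: (18653, 0.304),
    5: (18519, 0.290),
    6: (18273, 0.272),
    7: (17819, 0.281),
    8: (17299, 0.278),
    9: (16637, 0.276),
}

cleanup_rate = by_position[4][1]

# For each lower lineup spot, estimate how many fewer RISP plate appearances
# one team's hitters get there than they would at the cleanup hitter's rate.
fewer_risp_pa = {}
for pos in range(5, 10):
    pa, rate = by_position[pos]
    pa_per_team = pa / TEAMS
    fewer_risp_pa[pos] = (cleanup_rate - rate) * pa_per_team

for pos, gap in fewer_risp_pa.items():
    print(f"{pos}th: about {gap:.0f} fewer RISP plate appearances per team")
```

On these assumptions the 6th spot shows the largest gap, at close to 20 plate appearances, and the 5th spot the smallest.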
Second, I ran some simple correlations on the 2004 data between the raw number of opportunities in each category, as well as a few more, and the number of RBIs. Below are the correlation coefficients that measure the linear relationship between total RBIs and the category in question.
RISP TB         0.956
Total Bases     0.928
RISP PA         0.912
Runners On PA   0.910
Total Runners   0.910
RISP H          0.902
Runner on 1st   0.893
Runner on 3rd   0.893
Runner on 2nd   0.892
PA              0.836
As you can see, all the correlations are extremely high (generally anything over .7 indicates a strong linear relationship, though not by itself a causal one), but the highest is for the number of total bases gained with runners in scoring position. In other words, the best predictor of the number of runs a player drives in is the number of total bases he gains when hitting with runners in scoring position (.956). But of course before you can cash in those total bases you first need to get a modicum of opportunities, as indicated by the fact that the number of plate appearances with runners in scoring position is not far behind (.912).
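For anyone who wants to run this kind of test themselves, here is a minimal sketch of the Pearson correlation coefficient, the standard measure of linear relationship. The helper name pearson_r is my own, and the sample is only the eight players listed in the RISP tables earlier, so the result will not reproduce the .912 computed over the full 2004 data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (player, PA with RISP, RBIs) for the eight players in the 2004 RISP tables.
sample = [
    ("Miguel Tejada", 244, 150),
    ("Miguel Cabrera", 238, 112),
    ("Vinny Castilla", 234, 131),
    ("Gary Sheffield", 215, 121),
    ("Lance Berkman", 213, 106),
    ("Jacob Cruz", 60, 28),
    ("Grady Sizemore", 56, 24),
    ("Chase Utley", 101, 57),
]

risp_pa = [row[1] for row in sample]
rbi = [row[2] for row in sample]

r = pearson_r(risp_pa, rbi)
print(f"Correlation between RISP PA and RBI in this small sample: {r:.3f}")
```

Keep in mind that a sample mixing everyday players with part-timers will show a high correlation almost by construction, since both columns grow with playing time.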
So while the RBI remains the least egalitarian of the counting statistics, those who take advantage of their opportunities tend to come out on top.
References and Resources
For more fun with RBIs take a look at the following studies...
Situational Hitting by Tom Ruane
RBI Production - A New Look at an Old Stat by Tom Ruane
RBIs, Opportunities and Power Hitting by Cyril Morong
The Major League RBI Equivalency Formula: The Dominance of Power Hitting by Cyril Morong
Dan is the author of the blog Dan Agonistes and welcomes your comments and suggestions via email.