Confessions of an RBI Fanatic
by John Walsh
August 07, 2006
I have a confession to make: I'm a big fan of the RBI. That may not sound so bad to many of you, but think about my position. Here I am writing for an "analysis-oriented" baseball website. Not only that, but my own writing is often supported by quite a bit of statistical analysis—I definitely consider myself squarely in the sabermetric camp. That's why I'm a bit nervous about revealing my love for the run-batted-in. See, the RBI doesn't have many fans among the sabermetrically-minded. Just consider that when citing a player's key offensive stats we no longer write something like .300-30-100 (average, HR, RBI), but rather numbers like this: .300/.350/.500 line (average, on-base percentage, slugging percentage). Indeed, my Baseball Prospectus 2004 Annual does not even report RBI for players. We just don't have much use for the RBI anymore, it seems. That's why it's with some trepidation that I admit my penchant for the RBI.
Oh, I realize the shortcomings of the statistic. I know that the RBI is not a good stat for evaluating hitting ability. It's too dependent on things beyond a player's control: the ability of his teammates to get on base, his position in the batting order, things like that. But, you know, a solid base hit that scores a runner in a close game—well, it's exciting, dammit. It's the play that gets you to clench your fist and yell "Yeah!" (even if you're alone watching on TV). A key RBI in a tight spot is often the high point of a close ballgame.
I believe the backlash that the RBI has received in the analysis community arises from the general public's overrating of the stat. It's true that some mediocre players have racked up some pretty impressive RBI totals. And likewise, often the best hitters will not be found among the league leaders in RBI. OK, that's fine: the RBI is not a particularly good measure of batting ability. But it is one of the best things to see on a ball field, so it merits some attention.
The Players' Opinion
The RBI has always been held in high esteem by players, managers and coaches. Hank Greenberg, one of the top RBI men of his generation, recounted to Lawrence Ritter his view of the RBI. From the incomparable The Glory of Their Times:
I've always believed that the most important aspect of hitting was driving in runs. Runs batted in are more important than batting average, more important than home runs, more important than anything. That's what wins ball games: driving runs across the plate.
Charlie Gehringer used to bat ahead of me, and if we had a man on first base and Charlie was up, I'd yell, "Get him to third, Charlie, just get him to third. I'll get him in." That was my goal: get that man in.
Of course, Greenberg may have been biased; it's natural to put a high value on the thing at which you excel. I mean, if the question, "What is the most important aspect of hitting?" is put to, say, Luis Castillo, he might not give you the same answer as Greenberg. Still, I think it's probably safe to say that the ability to drive in runs is considered very important by the majority of players, managers and coaches. I could be wrong on that, since I don't have first-hand knowledge, but I don't think so.
Making Sense of RBI Totals
One problem with the RBI statistic is that it doesn't take into account opportunity. Generally, a batter hitting third or fourth in the batting order will have many more chances to drive in runs than a leadoff man. Likewise, a hitter batting behind a couple of high-OBP players will have more RBI opportunities than somebody batting behind Corey Patterson and his .290 OBP. To determine the RBI opportunities of a player, you need to turn to the play-by-play data.
I've devised my own method for determining the best RBI men in baseball. The idea is to compare how a player does in his RBI opportunities compared to the major league average. If he drives in more than the average player would, given the same opportunities, he gets "plus" RBI credits. If he drives in fewer, he gets "minus" credits. This "plus/minus" system is nice in that it gives you an immediate feel for how many extra RBI a particular hitter has produced.
Before I start laying the tables on you, we need to hash through a few details. Don't worry, this will be short and sweet. First off, I'm only interested in base runners driven in, i.e., I'm taking home runs out of the equation. Home run totals are readily available, so we know how often a player drives himself in. Next point: what is an RBI opportunity? To me, it's a plate appearance with runners on base. That's fairly simple, or is it? Should we count plate appearances when the batter was walked? What about if he sacrificed the runner along? Well, I'm going to count all plate appearances except intentional walks. I realize that's going to penalize selective hitters somewhat, but let's face it: selective hitters sometimes walk down to first base instead of driving in runs. Actually, many hitters will take fewer walks in an RBI situation. This is often a conscious decision on the part of the hitter, as evidenced by this quote from Red Sox first baseman, Kevin Youkilis:
Sometimes, you know, in certain situations you have to be more aggressive. With runners in scoring position, you've got to be more aggressive and be ready to hit because you want to drive in the runs; you don't want to walk.
If it's like second and third, one out, you get a pitch to hit, you've got to hit it. That's a big thing. You get a good pitch to hit in that situation, you've got to hit it. You can't go up there in that situation taking a lot of pitches.
I also take into account where the runners are in any given RBI opportunity. Obviously, it's harder to drive a runner in from first base than from third base. Also, when there is a runner on third base, it's much easier to drive him home with fewer than two outs: that's because an out, if it's not a strikeout or a pop fly, will often score a runner from third with fewer than two outs. So, I take into account the number of outs for runners on third base. You might be wondering how often a runner is driven in from the different bases. Here you go:
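Fraction of Runners Driven In
+----------------+--------------------+
| Base Occupied  | Fraction Driven In |
+----------------+--------------------+
| 1B             |              0.054 |
| 2B             |              0.163 |
| 3B, < 2 outs   |              0.518 |
| 3B, 2 outs     |              0.236 |
+----------------+--------------------+

So, using these averages and the specific opportunities for each batter, I can determine how much better or worse he was at driving in runs than the average major league batter.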
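For readers who want to try this at home, here is a rough sketch of how the opportunities described above might be tallied. The record layout (a dict with `bases`, `outs` and `intentional_walk` keys) and the bucket names are my own assumptions for illustration; they are not Retrosheet's actual format, and real play-by-play data would need to be parsed into something like this first.

```python
from collections import Counter

# Hypothetical record layout (an assumption, not Retrosheet's real format):
# each plate appearance lists the occupied bases, the number of outs when
# the batter came up, and whether the PA ended in an intentional walk.
def tally_opportunities(plate_appearances):
    """Count PAs with runners on and runners per bucket, skipping IBBs."""
    runners = Counter()
    pa_with_runners = 0
    for pa in plate_appearances:
        if not pa["bases"] or pa["intentional_walk"]:
            continue  # bases empty, or an intentional walk: not an opportunity
        pa_with_runners += 1
        for base in pa["bases"]:
            if base == 3:
                # Runners on third are split by the number of outs.
                bucket = "3B_2out" if pa["outs"] == 2 else "3B_lt2"
            else:
                bucket = f"{base}B"
            runners[bucket] += 1
    return pa_with_runners, runners

sample = [
    {"bases": (1, 3), "outs": 1, "intentional_walk": False},
    {"bases": (2,), "outs": 2, "intentional_walk": True},   # skipped
    {"bases": (), "outs": 0, "intentional_walk": False},    # skipped
    {"bases": (3,), "outs": 2, "intentional_walk": False},
]
pa_rob, runners = tally_opportunities(sample)
print(pa_rob, dict(runners))  # 2 {'1B': 1, '3B_lt2': 1, '3B_2out': 1}
```

The four buckets match the rows of the table above, so the counts can be multiplied directly by the league fractions.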
Let's take Miguel Tejada's 150 RBI campaign in 2004 as an example. Tejada hit 34 home runs that year, meaning he drove in 116 runners, a very high total. In fact, that's the highest single-season total achieved in the period 2000-2005. He had 381 plate appearances with runners on base (also the highest in this period) with a total of 536 runners on. So, Miggy had a very large number of RBI opps in 2004. We can break it down by the bases occupied when Tejada came to bat and how successful he was in driving the runners in. Here is a table showing the relevant numbers:
+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
| Batter   | pa_rob | rob  | r1   | frac1 | r2   | frac2 | r3   | frac3 | r3_2 | frac3_2 |
+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
| Tejada   |    381 |  536 |  248 | 0.093 |  172 | 0.203 |   69 | 0.696 |   47 |   0.213 |
+----------+--------+------+------+-------+------+-------+------+-------+------+---------+
r1 = runners on 1B, frac1 = fraction of 1B runners driven in, etc.
r3 = runners on 3B, fewer than 2 outs
r3_2 = runners on 3B, 2 outs

Tejada drove in over 9% of his runners from first base, compared to the league average of 5.4%. He was also much better at driving in runners from second base and from third base with fewer than two outs. When you actually do the arithmetic, you find that given the runners on base that Tejada had in 2004, the average hitter would have knocked in 88 runs, while Miggy drove 116 across the plate. The difference between those two numbers (let's call it "Diff") is +28, which is very good. It shows that while Tejada had a huge number of opportunities, he made the best of them and drove in 28 more runs than an average batter would have.
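The arithmetic behind Tejada's +28 can be sketched in a few lines. The league fractions and Tejada's runner counts are taken from the tables above; the function and variable names are just labels I've made up for this illustration.

```python
# League-average fraction of runners driven in, by base/out bucket.
LEAGUE_RATES = {"1B": 0.054, "2B": 0.163, "3B_lt2": 0.518, "3B_2out": 0.236}

def expected_rdi(runners, rates=LEAGUE_RATES):
    """Runners an average hitter would drive in, given these opportunities."""
    return sum(count * rates[bucket] for bucket, count in runners.items())

# Tejada, 2004: runners on base when he came to bat, by bucket.
tejada_2004 = {"1B": 248, "2B": 172, "3B_lt2": 69, "3B_2out": 47}

rbi, home_runs = 150, 34
rdi = rbi - home_runs            # runners driven in (home runs removed)
exp = expected_rdi(tejada_2004)  # what an average hitter would have done
diff = rdi - exp

print(f"expected {exp:.1f}, actual {rdi}, Diff {diff:+.1f}")
# expected 88.3, actual 116, Diff +27.7
```

Rounded to whole runs, that is the 88 expected and the +28 Diff quoted above.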
Two adjustments I do not make: the speed of the runners on base and adjustments for home ballpark. Okay, let's get to the results.
The Best RBI Men
I've got pbp data loaded for the years 2000-2005, so I'll be looking at the best RBI seasons, as measured by Diff, during that period. Here are the Top 20 RBI seasons over the last six years:
Top 20 Single-Season RBI Performances, 2000-2005
+-----------------------+------+--------+---------+------+------+
| Name                  | year | pa_rob | exp_RDI | RDI  | Diff |
+-----------------------+------+--------+---------+------+------+
| Teixeira_Mark         | 2005 |    355 |      64 |  101 |   37 |
| Helton_Todd           | 2000 |    347 |      71 |  105 |   34 |
| Cirillo_Jeff          | 2000 |    335 |      72 |  106 |   34 |
| Delgado_Carlos        | 2003 |    342 |      70 |  104 |   34 |
| Giambi_Jason          | 2000 |    331 |      61 |   94 |   33 |
| Delgado_Carlos        | 2000 |    329 |      64 |   96 |   32 |
| Gonzalez_Juan         | 2001 |    318 |      75 |  107 |   32 |
| Thomas_Frank          | 2000 |    309 |      69 |  101 |   32 |
| Ortiz_David           | 2005 |    351 |      70 |  101 |   31 |
| Helton_Todd           | 2001 |    352 |      66 |   97 |   31 |
| Tejada_Miguel         | 2002 |    351 |      68 |   97 |   29 |
| Martinez_Edgar        | 2000 |    354 |      79 |  108 |   29 |
| Ortiz_David           | 2004 |    323 |      70 |   99 |   29 |
| Tejada_Miguel         | 2004 |    381 |      88 |  116 |   28 |
| Pujols_Albert         | 2002 |    358 |      65 |   93 |   28 |
| Berkman_Lance         | 2002 |    307 |      59 |   87 |   28 |
| Rolen_Scott           | 2004 |    300 |      63 |   90 |   27 |
| Ramirez_Manny         | 2005 |    328 |      72 |   99 |   27 |
| Pujols_Albert         | 2003 |    301 |      55 |   81 |   26 |
| Sweeney_Mike          | 2000 |    373 |      90 |  116 |   26 |
+-----------------------+------+--------+---------+------+------+
RDI = runners driven in, i.e. RDI = RBI - HR

I have mentioned that there is no park adjustment, so the presence of Helton (twice) and Cirillo on this list should be taken with a grain of salt. The Rangers also play in a park favorable to hitters, but Mark Teixeira's 2005 mark of +37 is very impressive nonetheless. It's interesting that, by this measure, Tejada's 2002 season (131 RBI) actually was a touch better than the 2004 season that we looked at above.
We can also look at the whole period together to find out who is the best RBI man of recent times. The answer is ... well, here's the leader board:
Top 20 RBI Batters, 2000-2005, Ranked by Diff
+-----------------------+--------+---------+------+------+
| Name                  | pa_rob | exp_RDI | RDI  | diff |
+-----------------------+--------+---------+------+------+
| Anderson_Garret       |   1843 |     373 |  500 |  127 |
| Helton_Todd           |   1919 |     368 |  490 |  122 |
| Ramirez_Manny         |   1771 |     377 |  497 |  120 |
| Guerrero_Vladimir     |   1722 |     326 |  443 |  117 |
| Delgado_Carlos        |   1820 |     372 |  487 |  115 |
| Sweeney_Mike          |   1564 |     323 |  436 |  113 |
| Pujols_Albert         |   1625 |     310 |  422 |  112 |
| Tejada_Miguel         |   2002 |     422 |  531 |  109 |
| Rodriguez_Alex        |   1979 |     390 |  484 |   94 |
| Sheffield_Gary        |   1824 |     368 |  457 |   89 |
| Ordonez_Magglio       |   1571 |     324 |  411 |   87 |
| Bonds_Barry           |   1200 |     208 |  293 |   85 |
| Giambi_Jason          |   1725 |     325 |  406 |   81 |
| Ortiz_David           |   1560 |     331 |  409 |   78 |
| Rolen_Scott           |   1634 |     336 |  413 |   77 |
| Kent_Jeff             |   1950 |     401 |  477 |   76 |
| Berkman_Lance         |   1743 |     351 |  427 |   76 |
| Gonzalez_Luis         |   1765 |     335 |  409 |   74 |
| Walker_Larry          |   1277 |     265 |  338 |   73 |
| Chavez_Eric           |   1742 |     343 |  414 |   71 |
+-----------------------+--------+---------+------+------+

Friends, Garret Anderson has been an RBI machine, driving in 127 more runners than expected over the last six seasons. I was a little surprised to see Anderson top the list, but it makes sense: the guy rarely strikes out or walks and he hits the ball hard. Mike Sweeney is another player in the same mold. Overall, this is a list of some pretty great players. Barry Bonds, despite the smallish number of opportunities (due to the many intentional walks he's received), still ranks among the leaders.
Actually, the above list is biased towards players that have had many RBI opportunities. If you're 10% better than average, then your Diff value will grow with opportunities. So, I would like to present another table, sorted by Diff per 300 plate appearances with runners on (called Diff300). This puts all players on an even footing. (The 300 plate appearances with runners on base represents a typical season's worth.) Here are the top 20 according to Diff300 (minimum 750 opportunities):
Top 20 RBI Batters, 2000-2005, Ranked by Diff300
+--------------------+--------+---------+------+------+---------+
| Name               | pa_rob | exp_RDI | RDI  | diff | Diff300 |
+--------------------+--------+---------+------+------+---------+
| Sweeney_Mike       |   1564 |     323 |  436 |  113 |    21.6 |
| Teixeira_Mark      |    900 |     173 |  237 |   64 |    21.3 |
| Bonds_Barry        |   1200 |     208 |  293 |   85 |    21.3 |
| Pujols_Albert      |   1625 |     310 |  422 |  112 |    20.7 |
| Anderson_Garret    |   1843 |     373 |  500 |  127 |    20.7 |
| Guerrero_Vladimir  |   1722 |     326 |  443 |  117 |    20.3 |
| Ramirez_Manny      |   1771 |     377 |  497 |  120 |    20.3 |
| Helton_Todd        |   1919 |     368 |  490 |  122 |    19.1 |
| Delgado_Carlos     |   1820 |     372 |  487 |  115 |    19.0 |
| Walker_Larry       |   1277 |     265 |  338 |   73 |    17.2 |
| Ordonez_Magglio    |   1571 |     324 |  411 |   87 |    16.7 |
| Tejada_Miguel      |   2002 |     422 |  531 |  109 |    16.4 |
| Everett_Carl       |   1310 |     257 |  323 |   66 |    15.2 |
| Ortiz_David        |   1560 |     331 |  409 |   78 |    15.1 |
| Sheffield_Gary     |   1824 |     368 |  457 |   89 |    14.7 |
| Gonzalez_Juan      |    941 |     195 |  241 |   46 |    14.6 |
| Matsui_Hideki      |   1029 |     211 |  261 |   50 |    14.5 |
| Rodriguez_Alex     |   1979 |     390 |  484 |   94 |    14.3 |
| Rolen_Scott        |   1634 |     336 |  413 |   77 |    14.2 |
| Giambi_Jason       |   1725 |     325 |  406 |   81 |    14.1 |
+--------------------+--------+---------+------+------+---------+

It's mostly the same guys reshuffled, with the notable inclusion of Teixeira, who has only three seasons under his belt. There are a couple of other new names, as well: Carl Everett and Hideki Matsui. (Does anybody else find it curious that one of these is named after a dinosaur and the other one doesn't believe in dinosaurs?)
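The Diff300 scaling itself is simple enough to sketch. The function name and the tuple layout below are my own inventions for illustration; the numbers come from the table above. Note that feeding in the rounded Diff values from the table can land a tenth or so away from the published Diff300 for some players, presumably because the published figures were computed from unrounded values.

```python
def diff300(diff, pa_rob, per=300):
    """Scale Diff to a typical season of 300 PAs with runners on."""
    return diff / pa_rob * per

MIN_OPPORTUNITIES = 750  # the cutoff used for the table above

# (name, pa_rob, diff) taken from the table above
players = [
    ("Teixeira_Mark", 900, 64),
    ("Pujols_Albert", 1625, 112),
    ("Tejada_Miguel", 2002, 109),
]

qualified = [p for p in players if p[1] >= MIN_OPPORTUNITIES]
ranked = sorted(qualified, key=lambda p: diff300(p[2], p[1]), reverse=True)
for name, pa_rob, diff in ranked:
    print(f"{name:<18} {diff300(diff, pa_rob):5.1f}")
```

Tejada has the most opportunities of the three but drops to the bottom once everybody is put on the same 300-PA footing.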
Some Interesting Cases
In the table below I've listed a few players that had interesting RBI numbers. I've also included some other players who are nearby in RBI ability, for comparison.
+--------------------+--------+---------+------+------+---------+
| Name               | pa_rob | exp_RDI | RDI  | diff | Diff300 |
+--------------------+--------+---------+------+------+---------+
| Beltran_Carlos     |   1665 |     334 |  396 |   62 |    11.1 |
| Abreu_Bobby        |   1872 |     373 |  437 |   64 |    10.2 |
| Konerko_Paul       |   1705 |     346 |  404 |   58 |    10.2 |
+--------------------+--------+---------+------+------+---------+
| Sosa_Sammy         |   1690 |     342 |  383 |   41 |     7.2 |
| Molina_Bengie      |   1241 |     261 |  288 |   27 |     6.6 |
| Thome_Jim          |   1680 |     349 |  383 |   34 |     6.1 |
+--------------------+--------+---------+------+------+---------+
| Spiezio_Scott      |   1116 |     239 |  242 |    3 |     0.7 |
| Snow_J.T.          |   1234 |     275 |  278 |    3 |     0.7 |
| Jeter_Derek        |   1626 |     316 |  319 |    3 |     0.5 |
| Jones_Andruw       |   2011 |     414 |  417 |    3 |     0.4 |
| Lieberthal_Mike    |   1231 |     251 |  252 |    1 |     0.2 |
| Polanco_Placido    |   1346 |     247 |  247 |    0 |    -0.1 |
+--------------------+--------+---------+------+------+---------+
| Gonzalez_Alex      |   1358 |     273 |  245 |  -28 |    -6.2 |
| Dunn_Adam          |   1270 |     243 |  216 |  -27 |    -6.5 |
| Diaz_Einar         |    809 |     164 |  146 |  -18 |    -6.8 |
+--------------------+--------+---------+------+------+---------+

I've included Bobby Abreu in the table, because apparently there are some who believe that Abreu would rather take a walk than drive in a run. Here's a quote from a recent article by S.I.'s Tom Verducci: "He's the kind of hitter who is happy with a walk in run-scoring situations, which sometimes leads to looking at third strikes." Abreu has been as adept at driving runners in as Carlos Beltran or Paul Konerko, and I don't hear anybody complaining about those guys. (Caveat: the numbers are through 2005.)
Bengie Molina? Yep, hanging with Sosa and Thome as a solid RBI man. I have no further comment.
I was very surprised to find that Derek Jeter is just average at driving in runs. I mean, I think many people would love to have Jeter step to the plate when an RBI is needed. But, in fact, he's no better at driving in runners than Scott Spiezio, say, or Mike Lieberthal. And Andruw Jones' high RBI totals (he'll top 100 for the fifth time this season) have mostly been due to lots of opportunities. Given the same number of chances, J.T. Snow, say, or Placido Polanco, would drive in as many runs as Andruw. And finally, we come to Adam Dunn, who is really not very good at getting runs in. He ranks below Alex Gonzalez, which says it all. Don't ask me which Alex Gonzalez, because I don't know. But does it really matter?
Final Disclaimer
I realize that my method of determining the best RBI producers is not perfect. Garret Anderson (circa 2001) may well have the best chance of driving in a runner on base, and in that sense he could be preferable in a given situation to, say, Manny Ramirez. However, Anderson also has a greater chance of making an out than Ramirez does, so maybe you'd rather have Ramirez up there after all. You probably would, actually.
Still, I think the things I've learned while doing this are interesting and possibly even useful. Who knows? Imagine this future hypothetical scene:
The Yankees are trailing by a run in the eighth inning, two outs, runners on second and third. Bengie Molina, the new Yankee backup catcher, is sitting on the bench, dreaming about the post-game spread. Next to him, manager Joe Torre pulls from his back pocket a 4x6 index card with "RBI DIFF" written across the top. After studying the card for a few seconds, he turns to the player sitting to his left and growls, "Molina, wake up and grab a bat. You're hitting for Jeter."
References and Resources
- Play-by-play data for seasons 1957-1998, 2000-2005 can be obtained at Retrosheet. And it's all free!
- A more serious and rigorous look at the RBI from the sabermetric viewpoint is provided by Tom Ruane. It is an excellent study.
John Walsh dabbles in baseball analysis in his spare time. He welcomes questions and comments via e-mail.






