The opportunity of RBI

A well-known criticism of runs batted in is that it measures opportunity as much as it does performance. Instead of discarding RBI as a helpful metric, however, maybe we should look at it from the other direction and account for each batter’s opportunity. Combining performance and opportunity in one metric can bring more validity to RBI.

Let’s define opportunity as the expected number of runs an average batter would drive in, given a specific base/out situation. We can then create a ratio of a batter’s actual RBI (aRBI) to his expected RBI (eRBI) and compare one batter to another to show which performed better.

As far back as 1998, you can find other baseball minds looking into the same concept. Tom Ruane laid out a very good set of examples and explanations (see the link below under References). The concept requires finding the average outcomes of at-bats in each season, such as a home run 3 percent of the time. Based on those outcomes, an expected RBI value is assigned to each of the 24 base-out states, which range from none to two outs and from bases empty to bases loaded.
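As a rough illustration of how outcome rates could turn into an expected RBI value for one base-out state, here is a minimal Python sketch. Only the 3 percent home run rate comes from the text above. The other outcome rates, the runs credited per outcome, and every name in the code are assumptions made for this example, not Ruane's values or the figures behind the tables in this article.

```python
# Share of at-bats ending in each outcome. Only the 3 percent home run rate
# is taken from the article; the other rates are made-up placeholders.
OUTCOME_RATES = {
    "out": 0.745,
    "single": 0.17,
    "double": 0.05,
    "triple": 0.005,
    "home_run": 0.03,
}

# Hypothetical runs driven in by each outcome for two sample states.
# A state is (bases, outs); "---" is bases empty, "--3" is a runner on third.
# Outcomes missing from a state's table are assumed to drive in no runs.
RBI_BY_OUTCOME = {
    ("---", 0): {"home_run": 1},
    ("--3", 2): {"single": 1, "double": 1, "triple": 1, "home_run": 2},
}


def expected_rbi_per_at_bat(state):
    """Weight the runs each outcome drives in by how often that outcome occurs."""
    runs = RBI_BY_OUTCOME[state]
    return sum(rate * runs.get(outcome, 0) for outcome, rate in OUTCOME_RATES.items())


bases_empty = expected_rbi_per_at_bat(("---", 0))
runner_on_third = expected_rbi_per_at_bat(("--3", 2))
print(f"Bases empty, no outs: {bases_empty:.3f} eRBI per at-bat")
print(f"Runner on third, two outs: {runner_on_third:.3f} eRBI per at-bat")
```

With the bases empty, only a home run produces an RBI, so the expected value for that state is simply the home run rate. A real table would need a row like this for all 24 states, built from actual play-by-play data.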

Next, you count the at-bats a player had in each of these 24 base-out states and add up the expected RBI values for all of them. Once a total expected RBI is calculated, you find the actual RBIs produced in the same situations and divide by the eRBI to create a ratio.
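The bookkeeping described above can be sketched in a few lines of Python. The per-state expected values, the sample batter, and the function names below are all hypothetical and exist only to show the shape of the calculation. They are not the numbers used for the leader boards in this article.

```python
from collections import Counter

# Hypothetical league-average expected RBI per at-bat for three of the 24
# base-out states, keyed by (bases, outs). Placeholder values only.
EXPECTED_RBI = {
    ("---", 0): 0.03,  # bases empty, no outs
    ("-2-", 1): 0.20,  # runner on second, one out
    ("123", 2): 0.55,  # bases loaded, two outs
}


def expected_rbi(at_bats):
    """Add up the expected RBI value of every at-bat a batter had."""
    counts = Counter(at_bats)
    return sum(EXPECTED_RBI[state] * n for state, n in counts.items())


def ae_ratio(actual_rbi, at_bats):
    """Actual RBI divided by expected RBI over the same at-bats."""
    return actual_rbi / expected_rbi(at_bats)


# A made-up batter: 100 at-bats with the bases empty, 20 with a runner on
# second and one out, and 4 with the bases loaded and two outs.
at_bats = [("---", 0)] * 100 + [("-2-", 1)] * 20 + [("123", 2)] * 4
e_rbi = expected_rbi(at_bats)
a_rbi = 12  # hypothetical RBI produced in those same at-bats
ratio = ae_ratio(a_rbi, at_bats)
print(f"eRBI = {e_rbi:.1f}, aRBI = {a_rbi}, aeRatio = {ratio:.2f}")
```

Note that only at-bats are counted here, which matches the article's choice to leave walks out of both the numerator and the denominator.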

This ratio, which we’ll call the actual/expected ratio (aeRatio), can be used in many different ways, but I will focus on three of the most important today. First, we will highlight the players who made the most of limited opportunities. Good hitters who suffer from poor teammate performance, poor lineup construction, or being walked in many high-leverage appearances will all be recognized for their production despite these setbacks.

Second, hitters who have gaudy RBI totals despite their lesser relative performance will be exposed. While our eyes are drawn to triple-digit RBI totals, aeRatio can parse out the players whose opportunity played a more important role than performance.

Third, a ratio metric allows us to apply one hitter’s ability to another hitter’s opportunities. When one team’s cleanup hitter is driving in 50 more runs than another team’s, we can now step into each player’s totals and see how much should be attributed to the individual player and how much to the team he plays for.

Using data from 2012, let’s look at a couple of leader boards. The list below displays the top 10 batters who drove in runs at the highest rates:

Batter                    eRBI    aRBI   aeRatio
Edwin Encarnacion         60.4     107      1.77
Josh Hamilton             72.9     127      1.74
David Ortiz               35.8      60      1.68
Evan Longoria             32.6      53      1.63
Miguel Cabrera            86.3     138      1.60
Jose Bautista             40.0      63      1.58
Ryan Braun                70.7     111      1.57
Giancarlo Stanton         55.3      84      1.52
Alfonso Soriano           71.4     108      1.51
Garrett Jones             57.3      85      1.48

An aeRatio above 1 suggests that a player outperformed the average hitter in terms of driving in runs. It is a simple ratio, so an aeRatio of 1.5 says that a hitter produced 1.5 aRBI for each eRBI he encountered. An aeRatio below 1 works the opposite way, suggesting a hitter was worse than average and produced only a fraction of an aRBI for every eRBI.

Atop our list is the Blue Jays’ newest breakout slugger, Edwin Encarnacion, who had far and away his best season at the plate. Despite his dominance last season, Encarnacion finished with only 107 aRBI, 31 fewer than Miguel Cabrera, but you can see that Cabrera had about 26 more eRBI, which gave him a clear advantage for the RBI title.

A simple calculation suggests that Encarnacion would have had (1.771 * 86.25 =) 153 RBI if he had been presented with the same opportunities as Cabrera, while the 2012 Triple Crown winner would have only totaled (1.6 * 60.414 =) 97 RBI if he had been given Edwin’s opportunities. I am not suggesting that every single factor would remain the same if the two simply swapped uniforms, but I’m just supplying a method of equalizing players’ RBI totals from one team to another.
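The swap arithmetic can be reproduced with a short Python sketch. The eRBI and aRBI figures are the ones quoted in this article. The function and variable names are just illustrative choices.

```python
def swapped_rbi(own_ae_ratio, other_erbi):
    """Project a batter's RBI if he were handed another batter's expected RBI."""
    return own_ae_ratio * other_erbi


# 2012 figures quoted in the article.
encarnacion = {"erbi": 60.414, "arbi": 107}
cabrera = {"erbi": 86.25, "arbi": 138}

enc_ratio = encarnacion["arbi"] / encarnacion["erbi"]  # about 1.771
cab_ratio = cabrera["arbi"] / cabrera["erbi"]          # 1.600

# Each hitter keeps his own aeRatio but takes the other's opportunities.
enc_in_cabrera_spot = round(swapped_rbi(enc_ratio, cabrera["erbi"]))
cab_in_encarnacion_spot = round(swapped_rbi(cab_ratio, encarnacion["erbi"]))

print(f"Encarnacion with Cabrera's chances: {enc_in_cabrera_spot} RBI")
print(f"Cabrera with Encarnacion's chances: {cab_in_encarnacion_spot} RBI")
```

The projection assumes each hitter's aeRatio would carry over unchanged to the other lineup, which is the same simplification the paragraph above already flags.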

To identify the players who were put in the most fortunate situations, we will sort the list for highest eRBI totals.

Batter                    eRBI    aRBI   aeRatio
Hunter Pence              94.3     103      1.09
Miguel Cabrera            86.3     138      1.60
Marco Scutaro             83.5      73      0.88
Chase Headley             83.3     113      1.36
Hanley Ramirez            81.7      90      1.10
Starlin Castro            81.5      78      0.96
Adrian Gonzalez           81.2     108      1.33
Matt Holliday             81.2     102      1.26
Adrian Beltre             80.1     101      1.26
Billy Butler              80.0     106      1.33

The aforementioned Cabrera is once again on our list. This would explain why he was able to lead the league: an abundance of eRBI paired with his high aeRatio. Ahead of Cabrera in eRBI is Hunter Pence. However, Pence’s aRBI was a much lower 103, giving him an aeRatio of 1.093. The ratio still suggests that Pence was above average, but by a much slimmer margin than other members of the 100 RBI club.

Starlin Castro also had a career-high RBI total in 2012, but the list above shows that it was largely a product of opportunity. Castro led the league in at-bats last year, which allowed him to pile up over 81 eRBI on the season. His sub-1 aeRatio suggests that he was unable to knock in as many runs as the average hitter would have.

While driving in runs is still only half of the equation for scoring runs, the ability to isolate and compare that part of the equation can allow us to see who excels and who is merely a product of opportunity. A ratio format also eliminates the flaw of “counting” statistics, which inflate the value of players with more at-bats in general. RBI is one of the most traditional statistics baseball has, and it is basically untouched by sabermetrics. My hope is to no longer ignore this part of the run-scoring equation because of its limitations, but rather to identify those limitations and create methods of enhancing it.

References & Resources
For more information, please visit the aeRatio website for the aeRatio database from 1992-2012, as well as a glossary and introductory information on the process. All of the work was made possible with data from Retrosheet.

Tom Ruane’s work with expected RBI can be found here.


21 Comments
Bill Rubinstein
11 years ago

Total Baseball, the baseball encyclopedia whose stats are compiled by Pete Palmer, has something called Clutch Hitting Index, which is apparently RBIs compared with RBI opportunities, compiled throughout baseball history. (Total Baseball was last published in 2007, unfortunately.) As of 2001 or so the all-time leader was Cap Anson, while Pie Traynor was No. 4 or 5. This may be why he was so highly regarded.

InnocentBystander
11 years ago

I love this concept. It really gives RBI’s a usable context. Can you – yes, you – apply the same concept for relief pitchers? I feel like the Inherited Runners/Scored statistic is lacking something. But if you run the data through the 24 base out states it would probably be more meaningful.

Ryan
11 years ago

I think this is a good start, but the big question is this: is this a skill?  Is this repeatable?  Are the leaders last year the same as this year (roughly) or is this like BABIP where there can be wild fluctuations from year to year and an individual year’s result is not predictive?

Also, it seems odd that the highest eRBI you have is 94 and only 10 guys in baseball had 80. Did you include the chances of driving in yourself?  Seems strange that essentially everyone with at least 80 RBI would, by definition, be better than league average at driving in runs.

Jason Mitchell
11 years ago

innocent:  aeRatio eliminates walks so we can compare batter performance without penalizing someone for doing a good thing (a walk) in high-leverage situations.  That method would not work for pitchers at all.

In my opinion, the best stat for inherited runner situations would be RE24.  You can find it on baseball-reference.com, and it measures the run expectancy of the inning when a pitcher enters and when a pitcher exits.

Jason Mitchell
11 years ago

Ryan:  Yes, driving in yourself is accounted for.  It actually is one of the biggest drivers of the list, as creating an RBI in a bases empty situation is the hardest RBI to earn.

aeRatio is a skill.  As Steve commented, it correlates very highly with ISO.  eRBI is a measure of opportunity, so skill there lies with a player’s preceding teammates, and not the player himself.

Steven
11 years ago

Soriano’s should be 1.51 not 1.15

Also, did you remove RBI’s from a walk as well?

Ian R.
11 years ago

Is there any adjustment for park factors in these numbers? It seems to me that if the eRBI are ballpark-neutral, hitters who play in high-scoring parks would have inflated ratios.

Jason Mitchell
11 years ago

Steven:  Yes, aRBI are RBI that result from at bat outcomes.

Ian R.:  No, they are not.  Beyond my ability at this point, but would love to include.  For instance, the eRBI of a man on 2nd and 1 out could be higher in Coors Field, where a bigger OF could allow a single to score a runner on 2nd at a higher rate than Yankee Stadium.

Duke
11 years ago

Outstanding work which provokes further questions.  If aRBI is a skill, there should be correlations from one year to another.  Also, could one index aRBI to something like wRC+ to show relative clutch hitting?  As an example Robinson Cano is 17% better than average in aRBI but 50% better in wRC+, so he seemed to underperform in the clutch.  Would this index of indices correlate year-to-year?

Xeifrank
11 years ago

Why assign an “Expected” RBI number to each at-bat based on the base state and the percent chance of a HR, single, double, triple, BB, Out etc… when you have the empirical data that tells you exactly how many RBIs were created in every state.

Also, do you see any benefit from comparing players to other players of similar talent, instead of comparing everyone to a league average player.  This will inflate the ratio numbers for the better players and deflate them for the weaker ones.  A ratio value of 1.0 for Barry Bonds is not equal to a value of 1.0 from Jamey Carroll.  It may add some more complexity but shouldn’t be too hard to come up with a proxy for this.

Overall an interesting article.  Thanks.
vr, Xeifrank

Scott
11 years ago

Nice article. The ratio aspect of the metric seems very correlated w/ ISO….makes sense: hit ball long way and runners advance farther/have a better shot to score.

gc
11 years ago

Batting after Adam Dunn who often walks but does not run fast might make you seem unproductive.

Jake
11 years ago

Can you show me Trout’s aeRatio?

bstar
11 years ago

Jason

Interesting look at it. This is better than OBI% from B Pro because you’re taking all situations (base-out state) into account instead of just adding up all runners on base and using that as the number of opportunities.

I find your decision to exclude walks entirely very interesting. For one, I wonder how much your top ten would change if you included walks (or at least NIBB).

Are you going so far with this as to suggest that these top ten guys in aeRatio are being underrated by WAR?

Jason Mitchell
11 years ago

Duke:  aeRatio focuses on one half of the equation, driving in runs.  wRC+ tries to encompass a player’s entire offensive game.

xeifrank:  I guess you could create a new baseline by position.  If all players are measured against the same average, it shouldn’t be hard to tell what standard to measure above average players with others.

Jake:  Mike Trout had a 1.39 aeRatio, you can see everyone’s through the link at the end of the article.

bstar:  You can find what I deemed “PA eRBI” through the link at the end of the article.  It shows a player’s eRBI accumulated in all of their plate appearances.  Guys like Willingham, Pujols, and Posey jump up the list.  You cannot use PA eRBI to make an aeRatio, as you’ll punish these players for walking; it is useful only to see who had the most raw eRBI.

Lorditch
11 years ago

Nice!  No idea you were a stats man.

Jim G
11 years ago

Jason,

Great article, but will this help our Phillies? Look out Nate Silver.

Rian
11 years ago

Jason –

Great article.  A few questions:

(1) Wouldn’t it be more reflective of reality to use PAeRBI as the denominator of aeRatio rather than ABeRBI?  Walks are, after all, an outcome, and can lead to RBIs in 3 out of 24 base/out situations.

(2.1) As several previous commenters have mentioned, aeRatio seems to be a skill on the same order as (and closely correlated to) ISO.  I would be very interested in an article examining the year-to-year correlation in player aeRatio (in the spirit of the work currently being done over at FanGraphs by Matt Klaassen).  Is this something you might be able to do?  I can’t find a way to download the data you used off of the aeRatio site, otherwise I’d do it myself.

(2.2) In that spirit, it seems to me that relating ISO and aeRatio conflates the two to a degree that obfuscates the significant differences between the statistics – differences that are deeply relevant to the question that motivated this post.

To wit: while ISO only measures individual outcomes, even aeRatio is dependent, to a degree, on team metrics.  I’m thinking particularly about speed.  If a player hits a double with Rod Barajas on first, Rod’s probably on third at the end of the play.  If that same player hits a double with Michael Bourn on second, he’s credited with an RBI and his aeRatio increases.  Critically, his ISO remains the same – and so *should* his aeRatio, if the metric is optimized.

I’d be very interested in seeing modifications made to this statistic that account for baserunner speed.  I’m not sure of the best technical route for making this happen, but a first step might be to account for team speed as a single variable, as has already been done for a number of metrics.  That’s obviously somewhat problematic as a Michael Bourn and a Rod Barajas might well be on the same team, but it’s a first step.

Anyhow, I love the concept.  Thanks for all your work!

Jason Mitchell
11 years ago

Rian: #1. aRBI removes RBI earned on non-AB outcomes (walks).  If you used eRBI earned in non-AB outcomes to determine aeRatio, you penalize a player for working a walk, who sacrificed his eRBI for his following teammate(s).

#2.1:  Good idea.

#2.2:  It could be done – I could assign a value to baserunners based on their history of advancing from each situation, and use it as a factor in the eRBI calculation for the hitters when those players are on base.

Great suggestions.

Rian
11 years ago

Jason,

Thanks for the quick response!  I’m looking forward to the followup!

Rian

the Flint Bomber
11 years ago

I’ll chime in to agree that I love this line of analysis.  Thanks, Jason!  It’s also obvious to me that this type of research would also lend itself well to discussing HOF candidacies.

Like, for my all-time favorite player, Will Clark.