Adjusting RE24 for baserunning
by Matt Hunter
July 25, 2013
In 2012, Jason Heyward was worth seven and a half runs on the base paths, independent of his stolen bases and caught stealings. In other words, his ability to advance from first to third and second to home on singles, and first to home on doubles, led the Braves to score about seven or eight more runs on the season than they would have had an average baserunner replaced Heyward. This is quantified in UBR, or Ultimate Base Running.
UBR is one aspect of FanGraphs' WAR formula, because WAR's offensive component, Batting Runs, does not take the actual changes in base/out state into consideration. That is, Batting Runs cares only about the event itself, not the game context in which the event occurred.
But guess what does consider the beginning and end base-out states. That's right: RE24! A reminder, in case you forgot: RE24 is a cumulative statistic that measures the change in run expectancy from the beginning to the end of the play. Hit a single with no one on and no one out? Your RE24 is just the run expectancy with a man on first and no outs minus the run expectancy with the bases empty and no outs. Sound reasonable? I think so.
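For the code-minded, here is a minimal Python sketch of that calculation. The run expectancy numbers are made up for illustration (they are not the 2010-2012 values used in this article), and `RUN_EXP` and `re24_play` are hypothetical names. The usual definition of RE24 also adds any runs that score on the play, so the sketch includes that term.

```python
# Illustrative run expectancy values keyed by (occupied bases, outs).
# These numbers are placeholders, NOT the values used in the article.
RUN_EXP = {
    ("", 0): 0.50,    # bases empty, no outs
    ("1", 0): 0.90,   # runner on first, no outs
    ("13", 0): 1.80,  # runners on first and third, no outs
}


def re24_play(start, end, runs_scored=0):
    """Change in run expectancy over one play, plus runs scored on it."""
    return RUN_EXP[end] - RUN_EXP[start] + runs_scored


# A single with no one on and no one out:
single_bases_empty = re24_play(("", 0), ("1", 0))
print(round(single_bases_empty, 2))
```

A hitter's RE24 for the season is then just the sum of `re24_play` over all of his plate appearances.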
But what about RE24 with runners on base? If, say, Dan Uggla hits a single with Jason Heyward on first and no outs, and Heyward ends up advancing to third, Uggla gets credit for the change in run expectancy from runner on first with no outs to runners on first and third with no outs. But if you recall from the first paragraph, Jason Heyward was excellent (best in the league, in fact) at non-stolen-base baserunning, yet Uggla receives all the credit if we use RE24.
Let's try to fix that. Remember, we want to give the hitter credit for the actual context-dependent value of a hit with runners on base, but without also giving him credit for the baserunning of those ahead of him. This is important, because unlike, for example, the quality of the pitcher or the quality of the defense, each hitter is likely to have only a limited variety of baserunners ahead of him in a given season.
So how do we adjust RE24 to remove baserunning? My first thought when I was brainstorming this question was to simply figure out who was on base in front of each hitter during the season and subtract some portion of their UBR from the hitter's RE24. However, obviously, that's far too simple, for it assumes that a baserunner's value is evenly distributed among opportunities. This is not true, so we must be more specific about how we make this adjustment.
Then, it hit me. Instead of just looking at the beginning state and end state, which credits this entire change to the hitter, we could instead look at the event (single, double, strikeout, etc.) and figure out where we expect the baserunners to be at the end of the play. In other words, we simply need to find the average change in run expectancy for each event for each base/out state, and apply these values to each play. So if Dan Uggla hits a single and Heyward moves to third with no outs, Uggla will not get credit for the entire change in run expectancy, but instead will only receive credit for what the run expectancy would be if an average runner were on first, or the average run expectancy for a single in a man-on-first, no-out situation.
Of course, there are some issues with this approach. First of all, unless I use many, many years of data, some of the event/base/out combinations are going to have very small sample sizes. Triples are already rare, so a triple with a man on third and no outs will be even rarer (turns out this only happened seven times from 2010 to 2012). And while we could just leave triples out of this adjustment, there is a possibility of a baserunner being thrown out at home on a triple, so it is best to include them. Luckily, any event that has a very small sample size with regard to run expectancy will not have a significant impact on a player's baserunning-adjusted RE24, so it is an issue that I can ignore for the sake of this non-scientific article.
The other issue with this approach is that we do not want to adjust all events for baserunning, because we want to reward, for example, "productive outs". I know, I know, that's not a popular phrase in sabermetrics, and I'm not saying that it should be encouraged. However, if we want to measure context-dependent offensive contribution with run expectancy, we can't assume that every out (by base/out state) is equal; it is better for hitters to hit a grounder to the right side with a runner on second than a pop-up to shallow left. Yes, there is some baserunning skill in outs, but my intuition, and hopefully yours as well, is that we should assign more credit to the hitter for productive outs than the baserunner.
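Putting those pieces together, the adjustment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact code behind the numbers in this article: the play log, the hitter names, and the run expectancy changes are invented, and the choice of which events count as adjusted (`ADJUSTED_EVENTS`) is an assumption. Hits are credited with the average change for that event and base/out state, while outs keep their actual change so that productive outs still count for the hitter.

```python
from collections import defaultdict

# Hypothetical play log: (hitter, event, start_state, actual_re_change).
# start_state is (occupied bases, outs). All values are invented.
PLAYS = [
    ("Hitter A", "single", ("1", 0), 0.90),     # runner went first to third
    ("Hitter B", "single", ("1", 0), 0.60),     # runner stopped at second
    ("Hitter B", "single", ("1", 0), 0.60),
    ("Hitter A", "strikeout", ("1", 0), -0.35),
]

# Assumption: only hits are adjusted; outs keep their actual value.
ADJUSTED_EVENTS = {"single", "double", "triple"}


def average_re_change(plays):
    """Mean run expectancy change for each (event, start_state) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for _, event, start, delta in plays:
        entry = totals[(event, start)]
        entry[0] += delta
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def re24_totals(plays):
    """Return (raw RE24, baserunning-adjusted RE24) per hitter."""
    avg = average_re_change(plays)
    raw = defaultdict(float)
    adjusted = defaultdict(float)
    for hitter, event, start, delta in plays:
        raw[hitter] += delta
        if event in ADJUSTED_EVENTS:
            adjusted[hitter] += avg[(event, start)]
        else:
            adjusted[hitter] += delta
    return dict(raw), dict(adjusted)


raw, adjusted = re24_totals(PLAYS)
for hitter in sorted(raw):
    diff = raw[hitter] - adjusted[hitter]
    print(hitter, round(raw[hitter], 2), round(adjusted[hitter], 2), round(diff, 2))
```

In this toy log, Hitter A loses some credit because the runner ahead of him took the extra base, and Hitter B gains some because his runners did not. A positive difference between the raw and adjusted totals means the hitter benefited from good baserunning ahead of him.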
I hope my reasoning makes sense. Before I give you my adjusted RE24 values, I want to make clear that these are not perfect measurements. I did not use Markov chains to calculate the run expectancy values, as I would/should if these data were to be actually used. I also didn't necessarily assign all the proper credit where credit is due; however, you would likely have to go through every play individually to really get this right, and even then a lot of guesswork and subjectivity is involved. The purpose of this article is to simply get an idea of what happens when we remove baserunning from the equation, and who it helps and hurts the most.
A few final notes:
- I created my run expectancy values from the 2010-2012 seasons.
- I did not adjust for park or league, so keep that in mind if you use the results to rank players (which you probably should not do).
- "RE24" is not the same as the RE24 on FanGraphs or Baseball-Reference, so don't be confused by the differences you may find.
- I did not include stolen bases and caught stealings in either version of RE24.
First of all, let's look at the players who had the largest positive difference between their RE24 and their adjusted RE24. These are the players that benefited most from good baserunning ahead of them.
These names, when you look at the UBR leaderboard, are pretty obvious. Mauer's difference comes largely from Ben Revere and Denard Span, whose impact was even greater because of the large number of singles and doubles that Mauer hit. Alex Gordon can be explained by Alcides Escobar, and Torii Hunter can be explained by none other than Mike Trout.
Interestingly, the first Braves player on that list isn't until number 24 with Chipper Jones. So, although Jason Heyward had ridiculous baserunning numbers, the effect may have just been distributed more evenly among his teammates.
Next, we'll look at the players with the biggest negative difference between the two; that is, hitters with bad baserunning in front of them:
Hanley Ramirez, for some odd reason, leads the pack. I say some odd reason because I can't for the life of me figure out why the baserunning in front of him was so bad. For reference, take a look at the runners who were on first base when Hanley came up to bat:
Doesn't look like -8 runs of bad baserunning to you, does it? The numbers for runners on second are very similar, so I'm quite perplexed. It could be the case that these runners just happened to commit baserunning blunders on Hanley's plate appearances, or that bad baserunners like Adrian Gonzalez were on most often when Hanley hit singles or doubles. After Hanley, Cruz's difference is explained by the bad baserunning of Adrian Beltre and Michael Young, and Altuve's by Jordan Schafer, though the extent of the difference is, as with Hanley, perplexing.
Removing baserunning from RE24 doesn't do a whole lot for the majority of players; at most, it will change the result by seven or eight runs, but more often it will be around one or two. However, regardless of the practical effects, it's important to think about how we assign credit and blame for both context-independent metrics like wOBA and context-dependent metrics like RE24. This is just one of the ways in which we can do that.
If you'd like to see these numbers for all players, I made a full Google spreadsheet here.
Thanks to Retrosheet, FanGraphs, and Baseball-Reference for the data.
Matt writes for FanGraphs, Beyond the Box Score, and the Hardball Times. You can contact him via Twitter @MRHBaseball or email.