For a few months now, I’ve been touting RE24 not only as a fantastic compromise between advanced methods of player evaluation like wOBA and an old-school focus on the RBI, but also as a better way of evaluating past performance for purposes such as voting for end-of-season awards.
However, I’ve realized that RE24 is incomplete, or perhaps, more appropriately, simply one of many implementations of a broader idea: the measurement of contribution based on the 24 base-out states. There are a number of ways in which RE24 is imperfect not just for evaluation, but for becoming accepted in mainstream analysis, including the fact that it sounds like a tax return form. More relevant to the topic of this piece, though, RE24 is (1) set on an unfamiliar scale for most fans, (2) cumulative rather than a rate, and (3) dependent on opportunity. Before we get to the meat of the article, let’s discuss those three factors.
There is nothing wrong with the “runs above average” scale. In fact, this scale is absolutely crucial for a calculation like WAR, which compares each component to average before adjusting for replacement level. In that way, like all of these reasons, the fact that RE24 is measured on a scale that is unfamiliar to most fans is not inherently negative. However, if the metric, or the idea from which the metric is drawn, is to take hold in mainstream analysis, it would behoove us to set it on a more familiar scale.
On that note, there are two opposite directions we can take this change in scale. The first is to try to appeal to those who prefer, or see some benefit to using, the RBI. Instead of being scaled such that average is zero, RBI is scaled to reflect actual runs scored. If RE24 were to be scaled to RBI such that the sum of a team’s offensive RE24 closely resembled actual runs scored, one could compare the two methods of evaluating run production, perhaps giving RBI enthusiasts a better hill to stand on.
While I’m not sure how exactly I would go about moving RE24 to an RBI scale (though that may be a topic for a future post), there is another direction that we can take this scale: wOBA. Or, more accurately, OBP.
One of the benefits of wOBA, apart from its fantastic ability to appropriately assign weights to the various types of events, is that it is set on the same scale as on-base percentage. This is a benefit not only because it is easier to quickly read and evaluate, but because it is calculated such that an out is always worth zero and the average positive event (each of which counts toward the numerator) is always worth one. In other words, the weights of wOBA are such that the average player’s wOBA works out to (plate appearances minus outs)/(plate appearances), which is (basically) the same formula as OBP.
There’s one other benefit to scaling RE24 to the wOBA scale, one that addresses the third point above: if each plate appearance is measured such that an out is zero and the average event is one, then each plate appearance has the same level of importance. In other words, a player will not be rewarded in this rate version of RE24 just for coming up in many high run expectancy situations.
Of course, part of the appeal of RE24 for some people is the fact that it awards some plate appearances more than others. A bases loaded hit, according to RE24, is more important and more valuable than a bases empty hit. If you agree, then the data that follow may not be for you. They will not measure clutch hitting in the traditional sense, but situational hitting, in what I see as the most fundamental meaning of that term.
This base-out wOBA, as I will call it, measures how well the hitter performed given the situations in which he came to the plate. It measures not when he got big hits, but when he got the right hits. In other words, it measures offensive ability such that the importance of each type of event is completely tailored to the situation at hand, ignoring the importance of said event in any other situation.
It was initially difficult for me to wrap my head around how I would calculate this base-out wOBA. However, once I understood that the average positive event in each situation, or base-out state, would average one, then the method became clearer. Luckily, I already had data from previous research on the run values of each event for each base-out state. These data were also calculated by Tom Tango a while back, and can be found here.
With these data, along with the frequency of each event in each of the 24 base-out states, I just had to scale the run values of each positive event (1B+ROE, 2B, 3B, HR, BB+HBP) such that the sum of (run value of event)*(frequency of event) equals one. (I had to phone a friend to help me with this math, but the result was the addition of wOBA weights to this “linear weights by base-out state” table.)
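The scaling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculation actually used for the spreadsheet: the function name `scale_to_woba` is hypothetical, the run values and event counts are placeholder numbers rather than figures from Tango's table, and subtracting the run value of an out first (so that an out is worth zero, as in standard wOBA) is my reading of the method rather than something spelled out in the text.

```python
def scale_to_woba(run_values, counts, out_value):
    """Rescale one base-out state's run values to a wOBA-style scale.

    After scaling, an out is worth 0 and the frequency-weighted average
    positive event is worth exactly 1.
    """
    # Step 1: measure every positive event relative to an out.
    above_out = {ev: rv - out_value for ev, rv in run_values.items()}

    # Step 2: turn raw event counts into shares of all positive events.
    total = sum(counts.values())
    share = {ev: counts[ev] / total for ev in above_out}

    # Step 3: divide by the weighted average so that
    # sum(weight * share) comes out to 1.
    average = sum(above_out[ev] * share[ev] for ev in above_out)
    weights = {ev: above_out[ev] / average for ev in above_out}
    return weights, share


# Placeholder inputs for a single base-out state (NOT real data).
run_values = {"1B+ROE": 0.12, "2B": 0.22, "3B": 0.27, "HR": 1.00, "BB+HBP": 0.12}
counts = {"1B+ROE": 600, "2B": 180, "3B": 20, "HR": 100, "BB+HBP": 300}
out_value = -0.10

weights, share = scale_to_woba(run_values, counts, out_value)
check = sum(weights[ev] * share[ev] for ev in weights)
print(f"weighted average of scaled weights: {check:.3f}")
```

Repeating this for each of the 24 base-out states would give one set of weights per state. Because each state is normalized separately, an event that dwarfs every alternative in its own state (such as a home run with the bases empty and two outs) ends up with a very large weight.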
I’ll give you a link to the full list of these values at the end, but I’ll tell you now that some of these values were very extreme. At the top of the list was the wOBA weight for a bases empty, two-out home run, at 4.6. At the bottom of the list was a walk or hit by pitch with men on second and third and two outs, at 0.2. For reference, these are the standard wOBA values offered by Tango a few weeks ago:
In other words, a bases empty, two-out home run, using base-out wOBA (or bowOBA), is worth more than twice as much as a normal home run in an average situation. A player could hit one of these home runs and then go 0 for his next 10, and still end up with a bowOBA north of .400, compared to a .181 standard wOBA. On the other hand, a player who has a perfect 1.000 on-base percentage, but somehow gets on base only via the walk, and only with two outs and runners on second and third (the -23 state), would have a horrible .200 bowOBA, compared to a .700 standard wOBA.
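The arithmetic behind those two extreme cases is easy to check. The sketch below assumes the rate is simply the sum of per-plate-appearance weights divided by plate appearances, with outs counting as zero. The helper name `rate_stat` is hypothetical, the choice of ten walks is arbitrary, and the standard home run weight is back-derived from the .181 figure quoted above rather than taken from Tango's table.

```python
def rate_stat(event_weights):
    """Sum of per-PA weights divided by plate appearances (an out is 0)."""
    return sum(event_weights) / len(event_weights)


# Case 1: one bases-empty, two-out home run (bowOBA weight 4.6),
# followed by ten outs, for 11 plate appearances in total.
bowoba_case1 = rate_stat([4.6] + [0.0] * 10)

# Standard home run weight implied by the quoted .181 wOBA over 11 PA.
implied_std_hr_weight = 0.181 * 11

# Case 2: nothing but walks with two outs and runners on second and third
# (bowOBA weight 0.2; standard walk weight 0.7 implied by the .700 figure).
# Ten plate appearances is an arbitrary choice; any number gives the same rate.
bowoba_case2 = rate_stat([0.2] * 10)
std_case2 = rate_stat([0.7] * 10)

print(f"case 1 bowOBA: {bowoba_case1:.3f}")               # 0.418
print(f"implied std HR weight: {implied_std_hr_weight:.2f}")  # 1.99
print(f"case 2 bowOBA: {bowoba_case2:.3f}")               # 0.200
print(f"case 2 std wOBA: {std_case2:.3f}")                # 0.700
```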
These results sound absurd at first, but consider the situations. With two outs and no one on, there is very little chance of a team scoring a run. If the batter gets on first, the run expectancy increases slightly (.14 or so), but even with a triple, the chances of scoring remain low. However, a home run, obviously, is an instant run. It is, by far, the best thing that can happen in that situation. Sure, a home run in that situation may not change the run expectancy by as much as a grand slam, but relative to the value of every other event, this home run is by far the most valuable.
The same thing applies to the two-out walk with runners on second and third. In that situation, there is little difference, relatively, between a single, double, triple and home run. Each will definitely score one run, and likely two. However, a walk almost certainly scores no runs, and because there are two outs, leads to a situation with a good possibility of no runs scored. Is a walk in that situation a bad thing? Of course not; it still increases the overall run expectancy in the inning. But when one considers the value of every other event in that situation, the walk does very little to change the course of the inning.
With these bowOBA weights in mind, I looked at the 2012 season to see how the results compare to both standard wOBA as explained by Tango above and FanGraphs’ version of wOBA. Here are the leaders for 2012, minimum 400 plate appearances:
|Num||Name||FG wOBA||std wOBA||bowOBA|
As you can clearly see, the results are just a bit strange. The order of the players is not out of the ordinary, though there are some interesting differences between the versions of wOBA, but the scale of bowOBA seems to be much higher than that of the other two versions. Miguel Cabrera’s bowOBA is over 50 points higher than his standard wOBA, and every player on this list has a higher bowOBA, with the exception of Joey Votto (this is interesting given some criticisms that Votto walks too much with RISP, just like the example presented above).
I initially thought that I just had the entire scale too high, but consider then the laggerboards in bowOBA, again minimum 400 PAs:
|Num||Name||FG wOBA||std wOBA||bowOBA|
As you can see, almost every one of these players actually has a lower bowOBA than standard wOBA, though not to the same degree as the leaders. Still, the problem, if it is indeed a problem, with bowOBA seems to lie more with the range of values than the scale.
This is where you come in. I cannot quite figure out why bowOBA so strongly favors good hitters, especially good power hitters. As you’ll see in the spreadsheet of bowOBA weights here (along with the full leaderboard in the other sheet), home runs have a very high weight, higher than the standard weight, for most of the base-out states. Whether this is an issue with my calculations, my methods, or no issue at all, I am not sure, but I would love to hear your feedback.
Regardless of the validity of the numbers themselves, I stand by the idea that the baseball and sabermetric communities would benefit from more applications of RE24, specifically applications that transform the core idea behind RE24 into a more recognizable, rate-based scale. What I have presented above is one attempt at that. Let’s see more.
References & Resources
Thanks to Retrosheet and FanGraphs for the data, and Tom Tango, as always, for the inspiration.