Normally, when a 20-something-year-old types “fa” into their browser and presses Enter, they are taken to a website owned by a certain multibillion-dollar social media software company.
When I, a 20-something-year-old, type “fa” into my browser, I am not taken to said social media website. I am taken to FanGraphs.com. This happens because I enjoy perusing baseball statistics more than I enjoy interacting with my friends (without actually interacting with my friends).
While perusing said baseball statistics, I commonly find myself on the leaderboards. On these leaderboards, I can view, for example, the players with the most home runs, or the players with the most Wins Above Replacement, or the players with the highest RE24.
You heard me. RE24. If you follow me on Twitter, or if you’ve read pretty much anything I’ve written in the past few months, you’ll know that I love RE24. I love RE24 not because it is a perfect statistic, but because it is simple, because it is a great gateway into sabermetrics for our RBI-inclined friends, and because it considers context.
I love context, and I don’t think we talk about it or think about it or research it enough in sabermetrics. Context is not terribly helpful for predicting the future, or for winning your fantasy baseball league, or for helping teams find market inefficiencies, or for whatever else would get one hired by a baseball organization. But it’s fun. It’s interesting. It tells a story. It leads to new ways of viewing and evaluating players and teams.
We can measure context in a number of ways. RE24 is one of them. Win Probability Added is another. WPA is probably the more popular of the two, and I certainly enjoy using it in certain situations; however, I believe that RE24 is a better way to consider context than WPA, a position I defend here.
The basic premise is that WPA looks only at what has happened in the game so far, but can’t consider the context of the entire game. For this reason, I prefer to look only at run expectancy, rather than win probability. Nevertheless, whether you agree with my reasoning or not, what will follow may still interest you as a new way to consider context and clutch.
So we have RE24 as a way of measuring context-dependent offensive contribution. That’s wonderful for a wide variety of uses, including creating your own WAR, but why should we stop there? Surely we can think of other ways to use this wonderful context-dependent metric, right?
Right! That other way is clutch. I know, I know—clutch isn’t a very popular word for us statheads. But bear with me. Again, we’re not trying to predict the future; we’re just trying to find other ways to evaluate the past.
You may be aware that there already exists a “Clutch” metric, which, crudely, measures how well a player performs in high-leverage situations relative to how well he performs in all situations. It’s essentially the difference between a player’s actual WPA and what one would expect his WPA to be if he performed at the same level regardless of the situation.
Since we’ve already decided to use RE24 instead of WPA, let’s transfer that idea to RE24. Because RE24 measures runs “produced” relative to the average player, all we need to do is find some way to measure the number of runs a player produces above average, independent of the situation/context. And wouldn’t you know, the core offensive component of WAR, wRAA, does exactly that.
So, all we have to do now is subtract wRAA from RE24, right? Unfortunately, no. In my research for this article, I initially thought it was this easy, but I soon noticed a disconcerting trend in the numbers: Players on teams like the Rockies and Red Sox consistently had a higher wRAA than RE24, and players on teams like the Mets and Padres had a higher RE24 than wRAA. That’s right—there was a park bias.
Upon further investigation, I realized that RE24, contrary to my previous assumption, is park-adjusted—that is, it uses run expectancy values that are tailored to the park, rather than ones that are uniform across baseball. This threw quite a wrench into my plan, as I don’t have the statistical or programming chops to calculate non-park-adjusted RE24.
My first thought was to use FanGraphs’ Batting runs, or Bat for short, which is the park-adjusted version of wRAA. Unfortunately, however, Bat swings the difference too far in the other direction. Players who played in places like Coors had a significantly higher RE24 than Bat, whereas it was the other way around with wRAA. Additionally, I believe that Bat removes pitcher plate appearances from its calculation of average wOBA, thus making Bat slightly higher than RE24 on average.
To be honest, I don’t know why RE24-Bat has a park bias if RE24 uses park-adjusted run expectancies. If you know, I would love to hear the explanation. But in order to present these numbers in a non-misleading way, my very hacky solution was to take the average (RE24-wRAA)/PA for every team since 1974 and subtract that team average from the (RE24-wRAA)/PA of every player-season, based on the team for which that player played. It’s not pretty, but it does the job, and the end result should be slightly closer to what we want.
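For the curious, here is a minimal Python sketch of that team adjustment. It is not the code behind the numbers in this article. The field names (`player`, `team`, `pa`, `re24`, `wraa`), the function name, and the sample rows are all made up for illustration. It also assumes the team average is weighted by plate appearances, which is only one reasonable reading of "average difference."

```python
from collections import defaultdict


def team_adjusted_rates(player_seasons):
    """Return (player, team, adjusted (RE24 - wRAA) per PA) for each row.

    Assumption: the team average is PA-weighted, i.e. the team's total
    (RE24 - wRAA) divided by the team's total PA.
    """
    team_diff = defaultdict(float)  # summed RE24 - wRAA per team
    team_pa = defaultdict(int)      # summed plate appearances per team
    for s in player_seasons:
        team_diff[s["team"]] += s["re24"] - s["wraa"]
        team_pa[s["team"]] += s["pa"]

    # Average (RE24 - wRAA) per PA for each team.
    team_avg = {team: team_diff[team] / team_pa[team] for team in team_pa}

    adjusted = []
    for s in player_seasons:
        raw = (s["re24"] - s["wraa"]) / s["pa"]
        # Subtract the team-level average to strip out the team/park bias.
        adjusted.append((s["player"], s["team"], raw - team_avg[s["team"]]))
    return adjusted


if __name__ == "__main__":
    # Made-up rows, not real player data.
    sample = [
        {"player": "Hitter A", "team": "Team X", "pa": 300, "re24": 8.0, "wraa": 2.0},
        {"player": "Hitter B", "team": "Team X", "pa": 300, "re24": 2.0, "wraa": 2.0},
    ]
    for player, team, rate in team_adjusted_rates(sample):
        print(player, team, round(rate, 4))
```

A plain, unweighted mean of each player-season's rate would also fit the description above. The PA-weighted version just keeps part-time players from swinging the team average.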
Are you tired of reading words? Yes? Good, because I’m tired of writing them! Let’s get to the charts. For brevity’s sake, and because the actual formula is not simple, I will refer to my team-adjusted (RE24-wRAA) per 600 PA as SitHit (short for situational hitting). As explained above, the number measures, in essence, the difference between a player’s context-dependent run production and his context-independent run production; in other words, his situational hitting.
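Written out, one plausible form of the definition is the following. The symbol for the team average is my own notation, and the scaling to 600 PA is read straight off the description above rather than taken from any published formula.

```latex
\mathrm{SitHit}_{600}
  = 600 \times \left(
      \frac{\mathrm{RE24} - \mathrm{wRAA}}{\mathrm{PA}}
      - \bar{d}_{\mathrm{team}}
    \right),
\qquad
\bar{d}_{\mathrm{team}}
  = \text{the team's average } \frac{\mathrm{RE24} - \mathrm{wRAA}}{\mathrm{PA}} \text{ since 1974}
```

Presumably the "total" version multiplies the adjusted rate by the player's actual PA instead of 600, but treat that as an assumption.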
Career leaders in total SitHit since 1974
If you didn’t already want Tim Raines to be in the Hall of Fame, this may convince you. Raines had 313 Batting Runs, which was already a great number to go along with fantastic speed, but his RE24 was 503, almost 200 runs higher! Add 20 wins to Raines’ already-impressive resume, and he is as sure-fire a Hall of Famer as they come.
Career leaders in average SitHit per 600 plate appearances since 1974
B.J. Surhoff is an interesting name to see at the top of these lists, as he was actually a below-average hitter in his career. However, his excellent situational hitting, along with great defense, may make him one of the more underappreciated players in recent memory.
Top qualified seasons by SitHit since 1974
Tom Herr takes the crown for best season ever by situational hitting. While he was only 20 runs above average by wRAA, he was almost 60 runs above average using context-dependent RE24! That’s essentially the difference between Torii Hunter and Miguel Cabrera last season. Herr hit .356 with runners on base compared to .255 with the bases empty, not to mention .396 with no outs.
And finally, just for fun, let’s look at the worst seasons by SitHit since 1974
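| Rank | Player | PA | Season | Team | SitHit |
|---|---|---|---|---|---|
| 8 | Dwight Evans | 563 | 1979 | Red Sox | -23.02 |
| 14 | Dave Stapleton | 581 | 1982 | Red Sox | -21.94 |
| 16 | Jody Reed | 619 | 1989 | Red Sox | -21.84 |
| 18 | Rick Burleson | 721 | 1977 | Red Sox | -21.71 |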
Yikes, that’s bad. In 2009, Robinson Cano had a wRAA of 25.4, but an RE24 of -7.4. In other words, there was over a three-win difference between his context-dependent and context-independent production. It turned out pretty well for the Yankees in the end, but some better situational hitting from Cano could have made their great season even greater.
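The runs-to-wins arithmetic in this article can be checked with a tiny sketch. It assumes the common rule of thumb of roughly 10 runs per win. The article never states that conversion outright; it is simply consistent with the figures quoted. The function name is hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumption: rough rule of thumb, not stated in the article


def context_gap_in_wins(context_runs, neutral_runs, runs_per_win=RUNS_PER_WIN):
    """Convert the gap between context-dependent and context-neutral runs to wins."""
    return (context_runs - neutral_runs) / runs_per_win


# Cano, 2009: RE24 of -7.4 versus wRAA of 25.4 (figures from the article).
cano_gap = context_gap_in_wins(-7.4, 25.4)

# Raines, career: RE24 of 503 versus 313 Batting Runs (figures from the article).
raines_gap = context_gap_in_wins(503, 313)

print(round(cano_gap, 2), round(raines_gap, 2))
```

Under that assumption, Cano's gap comes out a bit more than three wins in the wrong direction, and Raines's comes out just shy of twenty wins in the right one.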
There you have it. I realize that much of this article was simply my own thought process for this idea and research, but I hope you found it somewhat interesting. At the very least, SitHit, or RE24 or some other variant, is a new way to consider context, clutch, and contribution.