Adjusting steals for win value
by Dan Turkenkopf
June 18, 2009
There's already been a lot of noise on the basepaths in 2009. Dexter Fowler stole a rookie record five bases against Chris Young in April. Carl Crawford bested him less than a week later when he stole six bases on Jason Varitek and the Red Sox pitchers. Jacoby Ellsbury made a splash stealing home behind Andy Pettitte's back and seemed to have started a trend. Crawford is currently on pace for almost 90 stolen bases this year, a number not reached in 20 years.
These incidents and more have caused some to ask whether the stolen base is re-emerging as an effective offensive weapon.
Rather than answer that question, I prefer to look backwards to rate the speedsters of the past. Which base runners have been the most effective stealing bases across history?
Unfortunately for this exercise, history can only go back as far as Retrosheet's play-by-play records, but that still gives us over 50 years to look at.
What do I mean by most effective? There are really two ways we can look at that: overall value and rate value. We'll get to the particulars in a minute, but first let's take a little detour down the win probability path.
Adjusting stolen base attempts using win probability
Not all stolen bases are created equal. Stealing second base in the ninth inning of a 10-0 laugher is very different from stealing home with two outs in the bottom of the 10th in a tie game. This means we need to adjust the actual stolen base attempts to put them on equal footing.
As opposed to many events that take place on the diamond, a stolen base has some choice associated with it. Runners choose when to attempt a steal, basing their decision on the score, the number of outs, the man at the plate, the pitcher on the mound and a whole host of other things.
It's entirely possible that the circumstances when players decide to run have shifted over time. Some of that will be due to the run environment, but some of that might be for other reasons like risk of injury or changes in etiquette (Joe Morgan claims he never liked to run once his team was up by four runs).
But timing isn't the only adjustment we need to make to the win value of our base stealing events.
In The Book, Tom Tango, Mitchel Lichtman and Andy Dolphin found that the win value for a stolen base between 1999 and 2002 was .018 wins. On average, a runner who successfully stole a base would increase his team's chance of winning by roughly 1.8 percent. On the other hand, a runner who was thrown out attempting a steal cost his team .043 wins.
This means that the necessary success rate to break even (where a team was no better or worse off running or staying put) was right around 70 percent. Runners who exceed that number help the team, while runners below that number hurt the team.
The concept of a break even point holds true throughout baseball history. What changes, however, is the actual value of the break even point. As you can see above, that number is determined from the win value of the stolen base compared to the win value of the caught stealing.
Those win values are quite sensitive to the run environment. In a higher run environment, the stolen base is less valuable because there's a greater chance the batter will be able to drive in the runner from first with a double or a home run.
For the same reason, outs are more precious, which makes a caught stealing even more devastating. The reverse holds in a low run environment. As an example, the win value of a stolen base in 1968 (the Year of the Pitcher) was .027 wins, while being caught stealing cost the team only .04 wins. This led to a break-even point of just below 60 percent.
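The break-even arithmetic behind these two examples can be sketched in a few lines of Python. This is only an illustration: the function name `break_even_rate` is my own, and the inputs are the rounded win values quoted above, so the results are approximate.

```python
def break_even_rate(sb_win_value, cs_win_cost):
    """Success rate at which the expected gain from running is zero.

    Solves p * sb_win_value - (1 - p) * cs_win_cost = 0 for p.
    Both arguments are positive magnitudes, measured in wins.
    """
    return cs_win_cost / (sb_win_value + cs_win_cost)


# Rounded win values quoted in the text.
the_book_1999_2002 = break_even_rate(0.018, 0.043)
year_of_the_pitcher = break_even_rate(0.027, 0.040)

print(f"1999-2002: {the_book_1999_2002:.1%}")   # 70.5%
print(f"1968:      {year_of_the_pitcher:.1%}")  # 59.7%
```

The more a stolen base is worth relative to the cost of an out, the lower the success rate a runner needs to justify the attempt.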
Luckily we don't need to isolate the factors which caused changes in the win value. We can simply measure the win value of all stolen base attempts in each season to determine the average value for a stolen base and for a caught stealing.
Since looking at a single season's data causes some fluctuation in the win values, I calculated five-year rolling averages (except for the first four seasons, of course), which formed the baseline for my adjustments.
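One way such a smoothing step could look is sketched below. It assumes a trailing window (the current season plus up to four before it) and shorter windows for the earliest seasons. The text does not spell out either detail, so treat both as assumptions. The helper name `trailing_rolling_mean` and the sample values are hypothetical, not real league data.

```python
def trailing_rolling_mean(values, window=5):
    """Average each value with up to `window - 1` values before it.

    Early entries use however many seasons are available, so the first
    four results of a five-season window come from shorter windows.
    """
    means = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        means.append(sum(chunk) / len(chunk))
    return means


# Hypothetical per-season average stolen base win values (NOT real data).
season_sb_values = [0.020, 0.024, 0.022, 0.026, 0.023, 0.021]
smoothed = trailing_rolling_mean(season_sb_values)

for raw, avg in zip(season_sb_values, smoothed):
    print(f"raw {raw:.3f} -> smoothed {avg:.4f}")
```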
I determined the actual change in win value for every stolen base, caught stealing and pickoff from 1954 through 2008. I then calculated the equivalent number of events as if each occurred in the average situation. I realize that's confusing, so it's probably best to demonstrate with an example.
Rickey Henderson in 1982
In 1982, Oakland Athletics outfielder Rickey Henderson smashed the major league record with 130 stolen bases in a single season. This performance will be our case study to walk through the methodology.
For the purposes of this analysis, Rickey had 128 steals. My WPA calculator only gives credit to the leading runner on stolen bases, so being the back end of a double steal isn't worth anything. The actual win value of those steals was 2.93 wins. The average win value of a stolen base in 1982 was .023. So Rickey's 2.93 wins equate to just over 129 steals (129.4, working from the unrounded values; the rounded figures shown here give a slightly lower quotient). I'll refer to these as equivalent stolen bases or steal equivalencies.
This means that Rickey stole his bases in slightly higher leverage situations than average (very slightly higher).
Our analysis isn't complete yet though. Besides looking at the successful steals, we also need to consider caught stealing. Rickey was caught stealing 31 times in 1982, and was picked off 17 more times.
You may have double-checked and seen that Baseball Reference debits Rickey for 42 times caught stealing that season. Unfortunately, not all pickoffs in the Retrosheet dataset are marked as caught stealing. I'm guessing it has something to do with whether the runner makes a move towards the next base. In calculating the win values, I've treated caught stealings and pickoffs separately.
Anyway, back to Rickey. His 31 times caught stealing cost the Athletics 1.6 wins. Based on the average cost of .046 wins, that means Rickey deserved to be caught 34 times.
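The conversion from actual win value to equivalent events is a single division, sketched here with a hypothetical helper named `equivalent_events`. The inputs are the rounded figures quoted in the text, so the output lands near, but not exactly on, the published equivalents.

```python
def equivalent_events(total_win_value, average_win_value):
    """Number of average-situation events that would produce the same
    total win value. Signs are ignored, so costs can be passed as negatives.
    """
    return abs(total_win_value) / abs(average_win_value)


# Rickey Henderson, 1982, using the rounded values from the text.
steal_equivalents = equivalent_events(2.93, 0.023)
caught_equivalents = equivalent_events(-1.6, 0.046)

print(f"equivalent stolen bases:    {steal_equivalents:.1f}")
print(f"equivalent caught stealing: {caught_equivalents:.1f}")
```

With these rounded inputs the steal figure comes out a little under the 129 reported above, which is the kind of drift you should expect when dividing by a three-decimal average.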
His adjusted steals line is 129 stolen bases, and 34 times caught, which is worth 1.4 wins.
If you add in his pickoffs (17 actual, for a win value of -0.7 and an adjusted count of 30), Rickey's base stealing in this record-setting year was only worth 0.7 wins.
We'll come back to this season in a little while to illustrate the different ways we can measure the value of stolen bases.
How to measure value
As I mentioned above, there are two different ways to measure value: one looking at total value, and one looking at value per attempt.
Overall value we'll measure using "net stolen bases," which adjusts stolen bases based on how often a player is caught.
For rate value, we'll use win value per attempt where attempts consist of successful steals, times caught, and pickoffs.
Which stat is better? It really depends on what you're trying to measure.
Rate stats theoretically treat runners who attempt to steal at low frequencies the same as those who attempt steals at a high frequency. A counting stat like net stolen bases gives additional value each time the event occurs. If we have two players who have the same success rates (and who run in the same situations), the one who attempts more steals will score better.
Those with high net stolen base totals did more to help their team win overall (think batting runs above average), while those with a high win value per attempt contributed more wins per attempt (think OPS+).
Let's dive into each and see what we can find.
Net Stolen Bases
Net stolen bases uses the break even concept as a way of "rewarding players for steals and penalizing them for caught stealings." The formula put forth by Rich Lederer is (SB - 2 x (CS + PO)).
This puts too much weight on pickoffs, so I'm using (SB - 2 x CS - PO) for my calculations. There's some more detail in the References section for those who are interested.
For 1982, Rickey's adjusted totals give him 31 net steals (129 SB - 2 x 34 CS - 30 PO), which is a good total, but not that impressive. It barely ranks in the top 150 seasons since 1954.
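As a quick illustration, both versions of the formula can be written as small Python functions and applied to Rickey's adjusted 1982 line. The function names are mine, not Lederer's.

```python
def net_stolen_bases(sb, cs, po):
    """Adjusted formula used in this article: SB - 2 x CS - PO."""
    return sb - 2 * cs - po


def lederer_net_stolen_bases(sb, cs, po):
    """Lederer's original formula: SB - 2 x (CS + PO)."""
    return sb - 2 * (cs + po)


# Rickey Henderson's adjusted (equivalent) totals for 1982.
rickey_1982 = {"sb": 129, "cs": 34, "po": 30}

print(net_stolen_bases(**rickey_1982))          # 31
print(lederer_net_stolen_bases(**rickey_1982))  # 1
```

The gap between the two results shows how much the weighting of pickoffs matters for a runner who was picked off as often as Rickey was that year.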
Let's first look at the top individual seasons:
| Rank | Player | Season | Net Stolen Bases |
|---|---|---|---|
| 9 (tie) | Bert Campaneris | 1969 | 64 |
| 9 (tie) | Vince Coleman | 1987 | 64 |
The most impressive result of the top 10 seasons is Bert Campaneris' in 1969. He achieved 64 adjusted net steals with only 61 actual stolen bases. His steals tended to come in very high leverage situations, which gave him the equivalent of 90 steals in average situations. The extraordinarily high value of his successful steals is coupled with a very low percentage of outs on the bases (eight equivalent caught stealing, and eight equivalent pickoffs), leading to the astounding total.
On the other end of the spectrum, the worst season according to this measure is Greg Gross's 1974, where his actual line of 12 SB, 21 CS, and 2 PO led to -47 adjusted net stolen bases.
What about for a career?
| Rank | Player | Net Stolen Bases |
There's nothing that shocking looking at the career list, although I'm slightly surprised how well Marquis Grissom fared. He combined timely stolen bases with relatively few times caught stealing.
As you'd expect, the players at the top of the real steals list are near the top of the net steals list. Lou Brock drops quite a bit from his second place ranking due to almost 300 caught stealing equivalencies. Henderson is the only person within 80 and he had 550 more attempts.
Win value per attempt
Another way to judge stolen base efficiency is to look at the average win value per attempt. Looking at it this way, Rickey's 1982 comes to only .004 wins per attempt.
The top 10 seasons according to this measure, with a minimum of 20 attempts, are as follows:
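The rate calculation for Rickey's 1982 can be sketched as follows. I am assuming the denominator is the count of actual events (128 steals, 31 caught stealing, 17 pickoffs), which the text does not state outright, and `win_value_per_attempt` is a hypothetical helper name.

```python
def win_value_per_attempt(net_win_value, sb, cs, po):
    """Net win value divided by attempts, where attempts are
    successful steals plus times caught plus pickoffs."""
    attempts = sb + cs + po
    if attempts == 0:
        raise ValueError("a rate needs at least one attempt")
    return net_win_value / attempts


# Rickey Henderson, 1982: 0.7 net wins over 176 actual attempts (assumed).
rate = win_value_per_attempt(0.7, sb=128, cs=31, po=17)
print(f"{rate:.3f}")  # 0.004
```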
You'll notice that none of the really prolific seasons make this list. That's largely related to the small sample sizes and the huge effect any given stolen base in a tight game can have on the overall win value. Also, the runners who steal a lot of bases tend to get caught more often than others might, and a single caught stealing can erase a lot of positive value.
Now for the career; this time requiring at least 100 attempts.
The career list contains a lot of the players you would expect to see at the top of a stolen base efficiency list. Of active players, both Carlos Beltran and Shane Victorino are considered excellent base stealers. Eric Davis and Tim Raines are other players who are noted for their "smart" base running and their placement on this list supports that assertion.
We've looked at two alternative ways to measure the value of a base runner's stolen base attempts. Both approaches start from the win value achieved by the player in his actual attempts and then adjust the results in different ways to answer two different questions.
Net stolen bases can be used to demonstrate overall value from stolen bases. It closely represents the overall win value, but in a manner that scales the value to stolen bases.
Wins per attempt tries to show who the most efficient base stealers are by showing who earns the most value from each stolen base attempt.
As with any stat, these are just more tools to have at your disposal when studying player value. The key facet is not whether you choose counting or rate, but that you appreciate the adjustments for the situational stolen base, and credit or debit players correctly for that.
References and Resources
Net stolen bases is a concept coined by Rich Lederer in two articles (here and here). The goal is to measure the effectiveness of a base stealer by crediting him for his steals and debiting him for being caught.
Lederer settled on using the formula (SB - 2 x (CS + PO)) which assumes a breakeven point of 66%. It's not exact, but it's a nice shorthand and it's very easy to calculate.
The biggest issue with it is that being caught stealing and being picked off are not of equivalent value. A caught stealing is nearly twice as bad as a pickoff, largely because of the situations in which they occur.
So I've adjusted Lederer's formula to be (SB - 2 x CS - PO) which more accurately weights the various values over time.
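To see where the break-even figure in Lederer's formula comes from, note that the weight placed on an out fixes the success rate at which the formula returns zero. The sketch below is my own illustration of that relationship, not something taken from Lederer's articles.

```python
def implied_break_even(out_weight):
    """Success rate at which SB - out_weight * outs equals zero.

    Setting SB = out_weight * outs gives
    SB / (SB + outs) = out_weight / (1 + out_weight).
    """
    return out_weight / (1 + out_weight)


print(f"{implied_break_even(2):.1%}")  # 66.7%, the weight used for caught stealing
print(f"{implied_break_even(1):.1%}")  # 50.0%, the weight used for pickoffs here
```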
The Win Expectancy and Leverage Index data is licensed from http://www.InsideTheBook.com
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".
Dan Turkenkopf is a Yankees fan who spends way too much time poring over baseball statistics (at least according to his wife). He also writes for Beyond the Box Score and can be reached by email.