How to evaluate hitters
by Dave Studeman
January 10, 2008
Well, thank goodness that's over. It feels like we've been talking about this year's Hall of Fame election forever. Goose Gossage made it, Jim Rice, Andre Dawson and Bert Blyleven were close but no cigar and several other players I think are deserving were way off the pace. Now we have to bide our time, waiting for pitchers and catchers to report and listening to recorded phone calls for our baseball fix.
One of the things I appreciate about the Hall voting process is that many writers write about their votes. It's useful column fodder for them and it allows us to see how they are voting and why. Sometimes the writers open themselves up for ridicule, but at least we get to understand their thinking.
One thing I'm always surprised by is how differently all of us, including BBWAA members, talk about hitting. So much has been analyzed and written about good and bad hitting, you'd think we'd have reached a fairly common understanding by now. But we haven't. Some people place a lot of emphasis on batting average, others on home runs, still others on runs and RBIs. Those who have spent time analyzing the data think in terms of runs created or linear weights. There are big differences among these perspectives.
I'd like to lay out a simple outline for how to evaluate hitters. To do this, I'm going to pick three guys who played about the same time: Tony Gwynn, Tim Raines and Andre Dawson. This isn't going to be a Hall of Fame "debate," because I think all three players should be in the Hall. This is just a simple exercise to draw comparisons between them.
A lot of fans and writers intuitively understand what I'm about to step through, but a lot of other folks don't. If I could change one thing about how people talk about baseball, getting everyone to ignore pitchers' won/loss records would probably be my first choice, but adopting a simpler way of thinking about hitting would follow closely behind.
Here's the way I go at it—four simple steps:
Start with Total Bases
Total bases is one of my favorite stats. It's simple and understandable, yet conveys a lot of information. A guy who gets a lot of hits will get a lot of total bases. If he hits more doubles, he'll get more total bases, and if he hits more home runs, he'll get even more bases. Like I said, simple and straightforward. Year after year, the guy who leads the league in total bases is often the MVP, or close.
Plus, there are some natural benchmarks that make it a particularly informative stat. A mark of 300 total bases is very impressive; 350 might lead the league. And did you know that only one batter topped 400 total bases during the 49 years from 1948 to 1997? Jim Rice, in 1978. I think that's the most impressive "Jim Rice landmark stat" of all.
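As a quick sketch of the arithmetic, total bases just counts one base for a single, two for a double, three for a triple and four for a home run. The function name and the season line below are made up for illustration; they are not any real player's numbers.

```python
def total_bases(hits, doubles, triples, home_runs):
    """Count total bases from a standard batting line."""
    # Singles are whatever hits are left after removing extra-base hits.
    singles = hits - doubles - triples - home_runs
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


# Hypothetical season line, for illustration only.
print(total_bases(hits=200, doubles=40, triples=5, home_runs=30))  # 340
```

By the benchmarks above, a season like this invented one would be very impressive and close to the league lead.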
Here's how many total bases our three hitters racked up in their careers:
You can see why Dawson gets a lot of votes (especially in Chicago!) and why some people are skeptical that Raines belongs in the Hall. A difference of 1,000 bases is a lot. But as much as I enjoy counting total bases, these figures are only the beginning. In particular, we've left out two other vital contributions to scoring: walks and stolen bases. So let's correct for that.
Add walks and stolen bases
Here's a simple list of how many bases, walks (HBP's, too) and stolen bases each player accrued in his career:
Whoa. Tim Raines jumps from last to first when we add stolen bases and walks to his total. Perhaps you can see why a number of people who "look below the surface" support Raines' cause. More importantly, though, the gap between the three players starts to close.
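To see how adding walks and steals can flip a ranking, here is a small sketch with two invented hitters, one with more power and one who walks and runs more. The function name and every number are hypothetical; the only thing taken from the text is the recipe: total bases plus walks, hit-by-pitches and stolen bases, all counted as one base each.

```python
def raw_bases(total_bases, walks, hit_by_pitch, stolen_bases):
    """Unweighted bases: every walk, HBP and steal counts as one base."""
    return total_bases + walks + hit_by_pitch + stolen_bases


# Two hypothetical hitters, for illustration only.
slugger = raw_bases(total_bases=340, walks=40, hit_by_pitch=5, stolen_bases=10)
table_setter = raw_bases(total_bases=280, walks=90, hit_by_pitch=3, stolen_bases=60)

print(slugger)       # 395
print(table_setter)  # 433
```

The invented slugger leads by 60 total bases but trails once the other bases are counted.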
Now, you're probably thinking something isn't quite right about this. After all, a walk just isn't worth as much as a single, and a stolen base is worth even less. I agree. Heck, a home run isn't really worth four times as much as a single, either. So we need to...
Adjust the bases
A lot of people have studied the relative value of different types of hits (and their bases). George Lindsey published a study of it in 1963. Perhaps the best-known published work was Pete Palmer and John Thorn's analysis in The Hidden Game of Baseball. More recently, Tom Ruane published an article at Retrosheet that laid out the value of all events in every year since 1960.
I have a little scale that I like to use for this purpose. It's imprecise, but I find it an easy way to remember the relative difference between, say, a home run and a walk. The scale starts at 9 (the last single digit) and touches down on every odd number, throwing in "2" right before the end. In other words, it's 9, 7, 5, 3, 2, 1. Each number stands for the relative weight of a different type of batting event.
Home Run: 9
Triple: 7
Double: 5
Single: 3
Walk: 2
Stolen Base: 1
These numbers don't mean anything in and of themselves; they're just relative weights. But they tell you that a home run is worth three times as much as a single, and nine times as much as a stolen base. One double is worth less than two singles, two doubles are worth the same as a home run and a stolen base, etc., etc. Lots of ways you can cut this info.
Every scale I've seen, from Lindsey to Tangotiger, is roughly the same. Yes, this is a very rough scale; these aren't the "correct" ratios. But they're like a reference chart that I carry around in my head.
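Here is a rough sketch of the scale as code. The weights for the home run and stolen base are given directly. The in-between weights for triples, doubles, singles and walks are read off the 9, 7, 5, 3, 2, 1 sequence and the comparisons in the text, so treat that assignment as an assumption. Counting a hit-by-pitch as a walk is also an assumption, and the batting line is invented.

```python
# Rough relative weights on the 9-7-5-3-2-1 scale.
# The four middle assignments are inferred from the text, not quoted.
WEIGHTS = {
    "home_run": 9,
    "triple": 7,
    "double": 5,
    "single": 3,
    "walk": 2,         # assumption: hit-by-pitches are counted here too
    "stolen_base": 1,
}


def weighted_bases(events):
    """Sum the weight of each event type times how often it happened."""
    return sum(WEIGHTS[name] * count for name, count in events.items())


# Sanity checks taken from the comparisons in the text.
assert WEIGHTS["home_run"] == 3 * WEIGHTS["single"]
assert 2 * WEIGHTS["double"] == WEIGHTS["home_run"] + WEIGHTS["stolen_base"]
assert WEIGHTS["double"] < 2 * WEIGHTS["single"]

# Hypothetical season line, for illustration only.
season = {"single": 125, "double": 40, "triple": 5,
          "home_run": 30, "walk": 45, "stolen_base": 10}
print(weighted_bases(season))  # 980
```

The result has no units. It only means something next to another hitter's total computed the same way.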
When I apply these weights to the events tallied by our big three hitters, here's what I get:
Remember, these numbers don't stand for anything. They're just relative measures of how much offensive value Dawson, Raines and Gwynn contributed to their teams through their base hits, stolen bases and walks. But, as you can see, they work pretty well. Raines' stolen bases have been devalued relative to Hawk's home runs, and Gwynn's singles (the single being the best single base of all) keep him right up there with the two other hitters.
Compare to outs
Now that we've got a properly weighted set of bases, the last thing to consider is the other side of the equation: outs made. If you remember, part of the power of OBP is its opposite: 1 minus OBP is, roughly, the share of plate appearances that end in an out. The higher the OBP, the fewer the outs. This is a good thing.
To calculate outs made, I used this formula for each batter: at-bats minus hits, plus caught stealing, plus double plays batted into. Adding them up, here is the number of outs made by each player:
Andre Dawson made a lot of outs. So although we're impressed with his weighted total bases, those outs bring us down. Raines and Gwynn, with their high OBP, made fewer outs.
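The outs formula is simple enough to write down directly. This is a minimal sketch with a hypothetical function name and an invented batting line. It follows the formula as stated (at-bats minus hits, plus caught stealing, plus double plays batted into) and, as a simplification, counts each double play as one extra out.

```python
def outs_made(at_bats, hits, caught_stealing, double_plays):
    """Outs charged to a batter: outs at the plate plus outs on the bases."""
    batting_outs = at_bats - hits
    return batting_outs + caught_stealing + double_plays


# Hypothetical season line, for illustration only.
print(outs_made(at_bats=600, hits=200, caught_stealing=5, double_plays=15))  # 420
```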
How can we directly compare weighted bases and outs? Well, we could divide one by the other, coming up with a ratio of weighted bases per out, like so:
Or we could give outs a weight and subtract them from weighted bases. If I use a weight of -1.6 for each out (roughly in line with the weights we used for bases), I get...
There are probably some other ways we could combine the two, but I'm going to leave it at that for now. The bottom line is that, using these four simple steps, we've devised a pretty good system for evaluating hitters, one that tells us why some people appreciate Tim Raines' batting more than Andre Dawson's—and why they're right.
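Both ways of combining the two numbers fit in a few lines. This is only a sketch of the arithmetic described above: the -1.6 weight per out comes from the text, while the function names and the two input totals are invented for illustration.

```python
OUT_WEIGHT = -1.6  # rough weight per out, on the same scale as the base weights


def bases_per_out(weighted_bases, outs):
    """Option 1: a ratio of weighted bases to outs made."""
    return weighted_bases / outs


def net_weighted_bases(weighted_bases, outs, out_weight=OUT_WEIGHT):
    """Option 2: weighted bases with a penalty subtracted for each out."""
    return weighted_bases + out_weight * outs


# Hypothetical hitter: 980 weighted bases and 420 outs, for illustration only.
print(round(bases_per_out(980, 420), 2))
print(round(net_weighted_bases(980, 420), 1))
```

The ratio rewards efficiency regardless of playing time, while the net figure also rewards volume, so the two can rank hitters differently.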
Throw out what I just said. Although the general methodology is sound, the results aren't right. The details, which are largely missing from my overview, are really important. For example, once you add a decimal place to the weights, you find that they change from year to year. And a small difference in some of the weights can make a big difference in the results. Also, I should have used different weights for different types of outs (such as strikeouts and caught stealing). And I didn't adjust for era and ballpark (which had a big impact on Gwynn's totals in the '90s).
The good news is that you can find the correct results at our old friend, Baseball Reference. Click on any player's name, such as Tony Gwynn's, and scroll down to the second set of batting statistics. There you'll find a relatively recent addition to the Baseball Reference family of baseball stats: Batting Runs and Batting Wins.
Batting Runs and Batting Wins are essentially adjusted bases minus outs, except done properly and to the last decimal. They are the brainchild of Pete Palmer, and they're listed in ESPN's Baseball Encyclopedia. You can read more about them in the Baseball Reference glossary, and here is a list of the leaders in adjusted batting runs. Gwynn is 60th, Raines is 100th and Dawson is tied for 219th.
A lot of people get confused about Batting Runs and Linear Weights and the like, and there is still a bias against them from the Bill James Abstract days. But I hope that this simple explanation of how to evaluate hitters has helped cut through some of the confusion. Just count up the bases, weight them and compare them to outs.
There's really no reason we can't all talk the same batting language. As someone has said recently, there are no red states and blue states here. Just baseball.
References and Resources
For those who want to know more, there are two fundamental objections to this approach (that I know of). First, you shouldn't compare batters who have very different amounts of playing time this way. A batter who has played a good half season may have an advantage over one who has played a full season. Same thing with long careers. This doesn't materially affect the comparison between Dawson, Raines and Gwynn, but it can make a difference. That's why you hear so much about "replacement level" as a better baseline than "average."
Second, some people prefer a metric that is "geometric" in nature, in which the components are multiplied instead of added. Perhaps the best metric of this type, one that closely mimics the logic of Batting Runs, is Base Runs. That's why we included every player's Base Runs in this year's THT Annual.
Dave was called a "national treasure" by Rob Neyer. Seriously. Comments about this article can be sent to him through the miracle of e-mail.