Andrew Gelman is a professor of statistics and political science at Columbia University, and may be best known as a writer at Nate Silver’s outstanding politics blog, Five Thirty-Eight. In a post at his own blog, Gelman answered a question on sabermetrics, and in his reply wrote the following:
There’s often a fuzzy line between (a) making inferences, and (b) simply “measuring what happened.” I mean, what’s a “save”? For that matter, what’s a “hit”? Etc. These definitions are constructed to be relevant for inferential questions about players’ abilities and contributions to the team.
One way things are changing is that there’s a ton of raw, raw data–locations of where every ball landed on the field, things like that. In that case, the steps going from raw data to inference are going to be more apparent. With old-fashioned statistics such as batting and fielding averages, it can be easier to fool yourself into thinking of them as pure measurement.
It’s easy to discredit the save as a flawed statistic that really doesn’t tell you anything. The save is an easy target; it was invented in 1959 by the late sportswriter Jerome Holtzman to give some tangible statistical credit to guys who were closing games, and despite its popularity, it has clear fundamental flaws that make it an entirely useless metric. But what about hits? When we say that Derek Jeter got a “hit,” are we not doing exactly what Holtzman and others did: giving credit to a player by using a concise and easy-to-use word to sum up what they have done on the field?
A hit “is credited to a batter when the batter safely reaches first base after hitting the ball into fair territory, without the benefit of an error or a fielder’s choice” (per Wikipedia). However, with what we know about DIPS theory and the varying amounts of luck involved in baseball, a “hit” doesn’t mean much at all. That’s why we’ve begun to use batted ball data to help us understand in what fashion a player put a ball into play. We’ve already begun to divide the batted ball data into sub-categories as well: line drives, ground balls, and fly balls (and recently we’ve seen the use of “fliners”).
However, while looking at how many line drives a player has is nice, it is only a step up from a “hit.” It still doesn’t tell us everything we need to know about the ball in play. Just as not all hits are created equal, neither are all line drives (or ground balls, or fly balls, etc.). In fact, even the stats that we use and love aren’t perfect. The great MGL, in a post at The Book blog, criticizes Joe Posnanski for overstating the accuracy of UZR, saying:
He is really overstating the precision with which the data is recorded and I think he knows that, or at least should. There is no way that they can differentiate between a ball hit 6 inches from the base line and 3 feet from the base line. And there is NO category that I am aware of that is a “high chopping ground ball just over the pitcher’s glove.” Come on! Which is one reason why there is so much measurement error in these metrics in the short run. A high chopping ground ball over the pitchers mound (that could easily be fielded by either the SS or 2B) could just as easily have been a ground skinner up the middle that no one could possibly have fielded. They could easily fall into the same bucket, in which case, the fielder who catches the first one will be over compensated on that ball and both the SS and 2B will be overly penalized on the second one.
This isn’t to say at all that UZR is inaccurate; on the contrary, along with John Dewan’s Plus/Minus system, it’s helping to revolutionize how we rate fielders, and is far and away better than the old metrics used. But while UZR is one of the best defensive stats we have, it still works the way the term “hit” does, only on a much less egregious scale. A ball hit six inches from third base is placed into the same bucket as one hit three feet away, even though the two are clearly different grounders. In the same way, a ten-foot roller down the third base line is placed into the same category (“single”) as a scorching liner played on one hop by the center fielder.
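To make the bucketing problem concrete, here is a minimal sketch. The zone widths are hypothetical and chosen only for illustration; they are not the zones UZR or Plus/Minus actually use. It shows how two different grounders can land in the same bucket when zones are coarse, and in different buckets when they are finer.

```python
def zone_bucket(distance_from_line_ft, bucket_width_ft):
    """Index of the zone a ball falls in, for a hypothetical zone width."""
    return int(distance_from_line_ft // bucket_width_ft)

# Two grounders from the example above: six inches and three feet off the line.
NEAR_LINE_FT = 0.5
OFF_LINE_FT = 3.0

# Hypothetical zone widths, in feet (assumptions, not real UZR parameters).
COARSE_WIDTH_FT = 5.0
FINE_WIDTH_FT = 1.0

coarse = (zone_bucket(NEAR_LINE_FT, COARSE_WIDTH_FT),
          zone_bucket(OFF_LINE_FT, COARSE_WIDTH_FT))
fine = (zone_bucket(NEAR_LINE_FT, FINE_WIDTH_FT),
        zone_bucket(OFF_LINE_FT, FINE_WIDTH_FT))

print("coarse zones:", coarse)  # both balls share a zone
print("fine zones:", fine)      # the balls are separated
```

With the coarse width both balls are treated as the same play; only the finer grid tells them apart, which is the kind of precision better tracking data would supply.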
So what more can we do? Well, thanks to the wonderful advancement of technology, we can look at literal, objective facts (assuming little recording error or bias) about balls put in play: the force with which they were hit, their velocity, the vector along which they traveled, etc. In fact, Alan Schwarz did a good job at The New York Times detailing the future of this analysis, saying:
A new camera and software system in its final testing phases will record the exact speed and location of the ball and every player on the field, allowing the most digitized of sports to be overrun anew by hundreds of innovative statistics that will rate players more accurately, almost certainly affect their compensation and perhaps alter how the game itself is played…In San Francisco, four high-resolution cameras sit on light towers 162 feet up, capturing everything that happens on the field in three dimensions and wiring it to a control room below. Software tools determine which movements are the ball, which are fielders and runners, and which are passing seagulls. More than two million meaningful location points are recorded per game.
This is the future of baseball analysis. We can then use regression to determine just how valuable a ball hit along vector 12 at 95 mph is, further enhancing our ability to evaluate players. Some may see this as a system that takes the fun out of the game. “When my grandfather sat me down to talk about Bobby Thomson’s home run, he wasn’t talking about a ball hit at x mph in vector 17.” However, passion and respect for the game and its history are not mutually exclusive with in-depth analysis of what happens on the field. In fact, one could argue that advanced analysis strengthens our love and understanding of baseball.
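As a rough sketch of what that regression might look like, the snippet below fits a simple least-squares line of run value against batted-ball speed, separately for each vector. Everything here is an assumption made for illustration: the records are invented, the "vector" labels are just the numbers used in the text, and a real analysis would need far more data and likely a richer model.

```python
from collections import defaultdict

# Hypothetical batted-ball records: (vector, speed_mph, run_value).
# These numbers are invented for illustration; they are not real data.
BATTED_BALLS = [
    (12, 85.0, 0.25),
    (12, 90.0, 0.30),
    (12, 100.0, 0.40),
    (12, 105.0, 0.45),
    (17, 85.0, 0.10),
    (17, 95.0, 0.30),
    (17, 105.0, 0.50),
]

def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def fit_by_vector(records):
    """Fit one speed-to-run-value line per vector."""
    grouped = defaultdict(list)
    for vector, speed, value in records:
        grouped[vector].append((speed, value))
    return {vector: fit_line(points) for vector, points in grouped.items()}

def expected_run_value(models, vector, speed_mph):
    """Predicted run value for a ball hit along `vector` at `speed_mph`."""
    intercept, slope = models[vector]
    return intercept + slope * speed_mph

MODELS = fit_by_vector(BATTED_BALLS)
print(round(expected_run_value(MODELS, 12, 95.0), 3))
```

Fitting each vector separately keeps the sketch short; with real tracking data one would more plausibly fit a single model over speed, direction, and launch angle together.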
The ultra-precise stats aren’t here quite yet; however, they are right around the corner, and they will soon become part of everyday advanced analysis. It doesn’t mean we have to stop talking about Ichiro getting 200 hits every season, or even ignore milestone moments like 300 wins or 500 saves. It just means that when it comes to evaluating the performance of players, we’ll have the use of cutting-edge technology to help us, and what can be so bad about that?