Fielding analysis has always been like the search for the Holy Grail of sabermetrics. We’re ever so close yet ever so far from reaching our destination. Hitting is pretty straightforward, with singles, doubles, triples, home runs, and so on all neatly categorized into buckets and credited to individual hitters. Evaluating pitching has its complications, many of which can be attributed to the interplay between pitchers and fielders, but it is still relatively “easy,” as we can spot good pitching by looking at things like strikeouts, walks, and home runs allowed.
Fielding, though, is much tougher. It shouldn’t be, right? All we really want to know is who makes the most plays, based on their opportunities. The problem is that while we know how many plays a fielder has made, we don’t have an exact measurement for that player’s opportunities. As a basic refresher course, let’s run down some of the fielding stats that have developed through the years.
Non-play-by-play fielding metrics attempt to take seasonal data and estimate fielding prowess. They take a very broad view of fielding and don’t look at things like the exact location and speed of a batted ball.
Range Factor, developed by Bill James in the late 1970s, is one of the first attempts at evaluating fielding performance beyond fielding percentage and errors. It measures, quite simply, how many plays (putouts and assists) a player makes per nine innings. Range Factor does a great job of telling us who made the most plays, but it fails to address opportunities in any way. Therefore, players who get more opportunities to field batted balls, for any number of reasons, will look better than they really are.
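As a quick illustration of the arithmetic, here is a minimal Python sketch of Range Factor as described above: putouts plus assists, scaled to nine innings. The function name and the two stat lines are made up for the example, and innings are assumed to be true decimals (1350.0), not the box-score ".1/.2" thirds notation.

```python
def range_factor(putouts: int, assists: int, innings: float) -> float:
    """Plays made (putouts + assists) per nine defensive innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * (putouts + assists) / innings


# Two hypothetical shortstops with the same playing time (invented numbers).
rf_a = range_factor(putouts=250, assists=450, innings=1350.0)
rf_b = range_factor(putouts=220, assists=400, innings=1350.0)

print(f"Shortstop A: {rf_a:.2f} plays per nine innings")
print(f"Shortstop B: {rf_b:.2f} plays per nine innings")
```

Nothing in the calculation knows how many balls were actually hit toward either player, which is exactly the blind spot noted above.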
Advances were made on James’ simple construct, like Tom Tippett’s Adjusted Range Factor. As its title implies, Adjusted Range Factor makes some simple yet important adjustments in an attempt to better estimate a fielder’s opportunities. For instance, Tippett uses balls in play while the player was in the field rather than just innings, and further adjusts for things like batter-handedness and a pitching staff’s groundball tendencies.
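To see why the denominator matters, here is a rough Python sketch that swaps innings for balls in play. It is not Tippett’s actual formula: the handedness and groundball adjustments are left out entirely, and the function name and numbers are invented for illustration.

```python
def plays_per_ball_in_play(plays: int, balls_in_play: int) -> float:
    """Share of balls in play (while the fielder was on the field) that he converted."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return plays / balls_in_play


# Hypothetical shortstops: A makes more total plays, but behind a staff
# that allows many more balls in play (invented numbers).
rate_a = plays_per_ball_in_play(plays=700, balls_in_play=4200)
rate_b = plays_per_ball_in_play(plays=620, balls_in_play=3600)

print(f"A: {rate_a:.3f} plays per ball in play")
print(f"B: {rate_b:.3f} plays per ball in play")
```

With these made-up inputs, the player with fewer raw plays comes out ahead once opportunities are estimated, which is the kind of reversal an opportunity-based denominator is meant to catch.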
Similar statistics were developed, but for the most part, Tippett’s Adjusted Range Factor gets us about as far as we can go while using only aggregate season data.
Play-by-play metrics use data that analyze every batted ball, such as its hit location on the field and its estimated speed. These data are collected and distributed by companies like Baseball Info Solutions and STATS Inc.
Zone Rating is the most basic of the play-by-play metrics. It breaks the field up into distinct zones and assigns zones of responsibility to each position. It then calculates how many plays a fielder makes in his zone and how many batted balls actually went through that zone to get a percentage of plays made based on chances. Then it adds in out-of-zone plays and, voilà, a more precise measure of fielding ability is created. Zone Rating has its problems, but nonetheless it is an essential building block for all of the advanced systems that followed.
[Figure: STATS Zone Rating grid]
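The Zone Rating arithmetic itself can be sketched in a few lines of Python. How out-of-zone plays get folded in is an assumption here: this version adds them to both the plays and the chances, so a fielder is never penalized for ranging outside his zone. The function name and the sample numbers are hypothetical.

```python
def zone_rating(plays_in_zone: int, balls_in_zone: int, out_of_zone_plays: int = 0) -> float:
    """Plays made per chance, counting each out-of-zone play as one extra chance converted."""
    chances = balls_in_zone + out_of_zone_plays
    if chances <= 0:
        raise ValueError("fielder has no chances")
    return (plays_in_zone + out_of_zone_plays) / chances


# Invented season line: 400 balls hit into the zone, 330 converted, 40 plays made outside it.
in_zone_only = zone_rating(plays_in_zone=330, balls_in_zone=400)
with_out_of_zone = zone_rating(plays_in_zone=330, balls_in_zone=400, out_of_zone_plays=40)

print(f"In-zone only:      {in_zone_only:.3f}")
print(f"With out-of-zone:  {with_out_of_zone:.3f}")
```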
MGL’s Ultimate Zone Rating (UZR) is perhaps the most cited defensive metric today, and it is freely available at FanGraphs. It is based on the original Zone Rating concept; however, it makes many useful adjustments to enhance its value. For one, the field is split up into many more small zones. A fielder’s performance is judged in each zone and compared to league average in that zone. Further, several other adjustments are made, including park factors, batted-ball speed, and base/out context.
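The zone-by-zone comparison to league average can be illustrated with a stripped-down Python sketch. This is not the UZR algorithm: it skips the park, batted-ball speed, and base/out adjustments, the zone labels and counts are invented, and the runs-per-play constant is a placeholder, not a published value.

```python
from dataclasses import dataclass


@dataclass
class ZoneLine:
    zone: str            # hypothetical zone label
    chances: int         # batted balls hit into this zone with the fielder playing
    plays_made: int      # how many of those he turned into outs
    league_rate: float   # league-average conversion rate in this zone


def plays_above_average(lines: list[ZoneLine]) -> float:
    """Sum over zones of actual plays minus what a league-average fielder would make."""
    return sum(z.plays_made - z.chances * z.league_rate for z in lines)


RUNS_PER_PLAY = 0.75  # placeholder run value of one extra play (an assumption)

season = [
    ZoneLine("zone_a", chances=100, plays_made=62, league_rate=0.55),
    ZoneLine("zone_b", chances=200, plays_made=180, league_rate=0.92),
    ZoneLine("zone_c", chances=80, plays_made=20, league_rate=0.30),
]

paa = plays_above_average(season)
print(f"Plays above average: {paa:+.1f}")
print(f"Runs above average:  {paa * RUNS_PER_PLAY:+.2f}")
```

Splitting the field more finely changes nothing in the arithmetic; it only means more, smaller rows, each leaning harder on the accuracy of the recorded hit location.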
John Dewan’s Plus/Minus is another one of the Goliaths of fielding stats, yet is quite similar to UZR. It is also available at FanGraphs. Plus/Minus again slices the field into vectors and credits or debits a fielder for making (or not making) plays in specific areas of the field, also considering the speed and type of the batted ball. There are plenty of differences between the two stats, but the basic constructs are very similar.
What are we missing?
As you can see from above, there’s a lot we can do with fielding statistics, advancing from basic Range Factor to Ultimate Zone Rating; from aggregate seasonal data to much more fine, granular play-by-play data. In the end, though, how close are we to truly, accurately measuring fielding performance? What are we missing?
- A benchmark for comparison. If, say, Derek Jeter is 20 runs below average every year that is fine, but how do we verify that it’s correct? Sure, it’s nice that the metric (or metrics) is consistent, but what if it is consistently wrong? How can we determine if UZR or Plus/Minus is right on the money or way off the mark, if we aren’t able to verify our measurements against something we know is right?
- Positioning. Fielders are always moving around, based on the batter, the pitcher, the count, the specific situation, and so on. If we want to measure fielding performance, wouldn’t we want to know the fielder’s initial positioning? It can be argued, perhaps correctly, that positioning is part of fielding; therefore we don’t need to know it—it’s already part of each metric, and a player gets credited for good positioning and docked for bad positioning.
That said, it is certainly something that would greatly help further our understanding of defensive performance and sort out who truly has great range, as commonly defined, versus those who position themselves optimally.
- Hang time. While labels like “hard,” “medium,” and “soft,” or line drive, fly ball, and “fliner,” may serve as decent proxies for hang time (or for the comparable travel time on ground balls), they fall well short of a precise measurement. There are developments in this area, but hang time has been largely ignored by most advanced fielding stats.
- Consistent and unbiased data. The data that go into each of these stats are collected by humans (from either Baseball Info Solutions, STATS Inc., or MLB Gameday) and are probably far from perfect. Likely, they are biased in various ways. Imagine trying to distinguish a fly ball from a “fliner” from a line drive, or to classify a grounder as “hard,” “medium,” or “soft,” all while watching from the press box or from a television feed. Not easy, I don’t think. Furthermore, as Colin Wyers has shown on these very pages, visual discrepancies between press boxes and ballparks, among other things, may skew a stringer’s perspective.
Consider the case of Tampa Bay Rays shortstop Jason Bartlett:
| 2005 | Twins | +13 runs | +13 runs |
Bartlett was tremendous, by the numbers, in three seasons with the Twins. Upon moving to Tampa Bay, however, he transformed into an average, at best, defensive shortstop. It is plausible that Bartlett has simply declined, and these numbers reflect, somewhat accurately (if not precisely), Bartlett’s fading fielding ability due to age, not to mention regression to the mean. However, it also seems possible that something about moving from the Twins to the Rays—the pitching staff, ball park, stringers, teammates, etc.—has played a part in Bartlett’s fall from a great fielder to an average-to-below-average one, something that Bartlett has had no control over.
Where are we going?
Interestingly, some of the fielding stats created in recent years seem to have taken a step back from the likes of UZR and Plus/Minus, at least in terms of detail. The idea is that if the added detail is *not* really measuring fielding skill effectively, a broader approach might provide increased accuracy, at least in the long term.
Peter Jensen’s Big Zone Metric uses hit-location data collected by MLB Gameday. Jensen divides the field into four zones in the infield and three zones in the outfield, with each position being responsible for one big zone. He then looks at a fielder’s performance both in and out of his zone, and makes various adjustments while comparing the player’s performance to league average.
Colin Wyers’ new Fielding Runs Above Average (nFRAA) is designed to be as factual as possible, not using any type of data (hit-location, batted-ball data, etc.) that could introduce any systematic bias. Wyers theorizes that in using this type of detailed data in an attempt to reduce measurement error, we are simultaneously introducing potential biases. We are not sure if we are actually getting more accurate or less. So instead, his approach is to be as broad and factual as possible, eliminating potential biases, while creating a measurement that can be accurate over a long period of time.
Tom Tango’s With or Without You, described in the 2008 Hardball Times Annual, takes a far different approach. For example, Tango looks at what percentage of batted balls, say, Jason Bartlett fielded when David Price was pitching, versus the percentage of all other shortstops behind Price. With large enough sample sizes, like using Derek Jeter as Tango did in his THT article, the results can be quite fascinating. The same approach can be applied to ball parks the fielder played in, hitters the fielder faced, and so on.
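A toy Python version of the with-or-without comparison might look like the following. The weighting scheme (the smaller of the two ball-in-play samples for each pitcher) is an assumption made for this sketch and may not match Tango’s article; the function name and the pitcher lines are invented.

```python
def with_or_without(samples: list[tuple[int, int, int, int]]) -> float:
    """Weighted average difference in play rate, with the fielder vs. without him.

    Each tuple is one pitcher:
    (plays_with, balls_in_play_with, plays_without, balls_in_play_without).
    """
    weighted_diff = 0.0
    total_weight = 0
    for plays_with, bip_with, plays_without, bip_without in samples:
        if bip_with == 0 or bip_without == 0:
            continue  # no basis for comparison behind this pitcher
        weight = min(bip_with, bip_without)  # assumed weighting, see lead-in
        weighted_diff += weight * (plays_with / bip_with - plays_without / bip_without)
        total_weight += weight
    return weighted_diff / total_weight if total_weight else 0.0


# Invented lines for one shortstop behind two pitchers.
pitchers = [
    (60, 500, 110, 1000),  # 12.0% of balls in play fielded with him, 11.0% without
    (30, 300, 20, 200),    # 10.0% with him, 10.0% without
]

edge = with_or_without(pitchers)
print(f"Extra plays per ball in play: {edge:+.4f}")
```

The same loop works for ballparks or opposing hitters in place of pitchers; only the grouping key changes.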
While these approaches offer a fresh contrast to the small-zone, play-by-play metrics covered above, I don’t believe they are adequate for the multi-million dollar decisions that teams often face. What if we only have a couple of years of data to go by to decide who the Tampa Bay Rays’ starting shortstop is going to be? While these metrics may be preferred for career value, they probably do not measure fielding ability well enough over a relatively small sample, at least not to the degree of accuracy we’d prefer.
What we really want is the detail of the play-by-play systems like UZR and Plus/Minus combined with the unbiased nature of, say, nFRAA. Enter Fieldf/x. Similar to PITCHf/x, Fieldf/x will record all of the movements that occur on the baseball field (batted balls, fielders, base runners, a swarm of midges) with high-resolution cameras installed at each stadium. Instead of relying on subjective estimates of ball speed and location, we’ll have rock-solid evidence of everything that happens on the field—a batted ball’s location, a fielder’s initial position and movement, etc. Undoubtedly, there will be plenty of issues to work through, not to mention the question of whether the data will be made public, but the possibilities of a Fieldf/x system could certainly be revolutionary. Fieldf/x could be that Holy Grail we’ve been in search of.
Currently, without any type of public Fieldf/x data, we have a lot of different fielding stats, all created with good intentions by very smart people. The issue is not so much in the metrics themselves; rather it’s in the data that go into these metrics. How much can we trust it? Do the biases cancel out over time or are they systematic? Are we still missing key elements like hang time or fielder positioning? Are the broad, less biased stats accurate enough in the short-term? These questions are not easily answered, but at the same time we must resist the urge to simply trust fielding metrics at face value because we have no better alternative. At least, we must understand the uncertainty and the potential issues at hand.
Armed with MLB.TV’s 2010 archive, in the near future, I hope to watch some games (in detailed fashion) and attempt to play the role of a stringer, classifying batted balls and trying to judge fielding performance (albeit in a very limited sample). Certainly, I won’t bring the experience and knowledge of a true stringer, but I hope that it will illuminate the process and leave me (and you) with a better understanding of what goes into many fielding stats, and how much confidence we should have in them.
References & Resources
It is almost impossible to profile every fielding metric in an article while still making it readable. Here, however, I will provide links and brief descriptions for a number of other fielding metrics, in case you want to dig further into the subject.
Fielding Win Shares: The fielding portion of Bill James’ Win Shares system follows a top-down approach of first looking at team fielding, then crediting individual players for their performance.
Fielding Runs: Pete Palmer’s Fielding Runs is another early fielding metric that attempts to measure a fielder’s worth by looking at how many plays he made and comparing that to a number of estimated chances.
Defensive Regression Analysis: Michael Humphreys’ fielding stat employs regression analysis to measure fielding.
Range: THT’s David Gassko takes a crack at a non-PBP metric, using things like batted-ball data and batter-handedness to estimate how many chances a fielder had.
Total Zone: Sean Smith’s fielding metric uses Retrosheet play-by-play data, such as hit type and who fielded each hit, to estimate fielding performance.
Simple Fielding Runs: Dan Fox’s Simple Fielding Runs, similar to Total Zone, uses Retrosheet play-by-play data to analyze defense.
Probabilistic Model of Range: David Pinto’s PMR uses familiar data points such as direction of hit, hit type, and how hard a ball was hit to estimate fielding.
Spatial Aggregate Fielding Evaluation: Shane Jensen’s SAFE uses a smoothing function to estimate the probability of a player making an out, using play-by-play data from Baseball Info Solutions. SAFE was further discussed, in detail, at The Book Blog.