10 Lessons I Have Learned about Defensive Statistics

Assigning credit for a ball where multiple fielders are present can be tricky (via Keith Allison & Howell Media Solutions).

Editor’s Note: This is the fourth post of “10 Lessons Week!” For more info, click here.

Like many baseball analysts of my generation, my sabermetric interest was inspired by several revolutionary books in the 1980s. Most of you are probably familiar with them. When I was in my 20s, I voraciously read the Bill James Abstracts, the Elias Baseball Analysts, Pete Palmer’s and John Thorn’s The Hidden Game of Baseball, and Craig Wright’s The Diamond Appraised. In the 1990s, I also started to devour a few other publications, like Mike Gimbel’s Player and Team Ratings, and STATS Annual Baseball Scoreboards (as well as the STATS–later Bill James’–Handbooks).

Lesson #1: The People Behind Retrosheet Are Saints

After whetting my sabermetric appetite with those publications, I began doing a lot of my own research on evaluating offense and defense, springboarding off the work of many of the original pioneers like James and Palmer. I originally used play-by-play and batted ball location data provided by a volunteer organization of baseball fans and statheads who collected this data at the ballpark and from television broadcasts. The group was originally called Project Scoresheet, then The Baseball Workshop, and now Retrosheet. Some of the founders and principals of these organizations are Sherri Nichols, Dave Smith, and Gary Gillette, along with Pete Palmer, a long-time colleague and collaborator of Gillette. All of us owe a great deal of gratitude to these early data collectors and to those who currently compile and disseminate (for free) the Retrosheet data.

Lesson #2: Advanced Defensive Metrics Have Been a Long Time Coming

One of the first things I developed in the mid-’80s was a zone-based defensive metric (I don’t know that I called it anything in particular at the time), using the Project Scoresheet batted-ball location data. At around the same time, I had heard that Sherri Nichols and Pete Decoursey, two other early baseball researchers, were doing the same thing. They called their metric defensive average, not to be confused with the traditional fielding average.

A few years later, STATS, in 1989 or 1990, came out with its own Zone Rating (ZR) and presented it in its first annual Baseball Scoreboard. I think that all of us developed our own version of “zone rating” independently, as neither Nichols’ nor my work was disseminated broadly, and in fact, few people knew of their existence. Remember, this was all pre-internet, or at the very least, at the beginning of the internet, when a lot of baseball research was being shared and discussed on Usenet and other little-known “electronic bulletin boards” and the like.

Around 2000, I think, STATS came up with an Ultimate Zone Rating, whereby it assigned different values to catches or non-catches in various locations on the field for each fielder, rather than using one single zone for each fielder (and some shared zones). The assumption was that not every ball in a fielder’s zone was equally difficult to catch even though ZR treated them all the same. That might seem obvious today, but as with every new discovery or invention, it was apparently not so obvious at the time, and was considered somewhat of a breakthrough in defensive evaluation–at least by me.

For some reason, STATS abandoned this methodology after its initial presentation in the Scoreboard, and it was never heard from again until John Dewan resurrected a modern, more advanced version, the plus-minus (PM), and eventually defensive runs saved (DRS), with BIS almost 10 years later. So the credit for the original Ultimate Zone Rating goes to STATS and not to me. I loved the concept and enthusiastically ran with it. I also kept the name, which was eventually shortened to UZR.

My work on UZR was never intended to be and never did become a commercial endeavor. I have spent hundreds of hours writing on, researching, and coding various incarnations of UZR over the years, and tens of thousands of dollars purchasing hit location data more advanced than that provided by Retrosheet and the early data providers. The only remuneration I have ever gotten is a small licensing fee from FanGraphs for the last few years. So, if anyone wants to accuse me of stealing the idea from STATS (not that anyone ever has, I don’t think), they would be completely justified. Basically, I loved the idea and decided to refine it. As they say, imitation is the sincerest form of flattery!

For some reason, my version of UZR has gained a lot of traction over the years and is often considered the de facto modern sabermetric defensive metric, despite the fact that there are many equally good and similar ones, including John Dewan’s DRS, David Pinto’s PMR, Michael Humphreys’ DRA, Shane Jensen’s SAFE, Sean Smith’s Total Zone, and others. I have to give John Dewan and everyone else at the original STATS company a lot of credit for never claiming that UZR was their original idea (which it was). In fact, I owe a lot of my early sabermetric inspiration to those STATS Scoreboards. They were wonderful publications filled with hundreds of well-researched and interesting questions about offense, defense, pitching, and other aspects of the game of baseball. I remember being pretty devastated when they ceased publication after the 2001 issue, I think.

By the way, if you want to read a very good, comprehensive history of defensive metrics, including UZR, I highly recommend this 2010 article by Dan Basco and Jeff Zimmerman from the SABR Baseball Research Journal.

Lesson #3: Asking the Right Question(s) Is Important

In order to understand UZR and defensive metrics in general, it is important to first ask the proper questions. In fact, the proper question is the most important thing when it comes to crafting any effective metric. One has to be perfectly clear what one is trying to capture in order for the metric to be any good, and the methodology has to do an adequate job in answering it.

Lesson #4: Defense Is Simple

So what are we trying to measure with UZR and how do we do it? Obviously we are trying to measure the quality of a fielder’s defensive performance, but what does that mean in the context of having access to certain batted ball data? Compared to hitting and pitching, surprisingly, evaluating defense is a lot easier–in a sense.

When a ball is put into play and is not a home run, there are only two things that can happen from a defensive perspective: Either the ball is turned into an out, or it falls for a hit or an error. Obviously, there are all kinds of other things that can happen after a batter reaches base safely or even if the batter is retired, which involve defense, but the bulk of the work implicated in evaluating defense starts and ends when the batter puts the ball in play and is either retired or reaches base. Even the value of the hit (single, double, etc.) doesn’t have a lot to do with the fielder if he isn’t able to turn the batted ball into an out. So, for now, we will focus on a binary outcome: Either a fielder turns a batted ball into an out, via a fly ball catch or a ground ball out at first or another base, or the batter reaches safely on a hit or an error. Seems easy, huh? Well, not really, as it turns out.

Lesson #5: Defense Is Far from Simple

Here is the key question when it comes to just about any good defensive metric, even the theoretical perfect one: Given the nature and location of each batted ball, how likely is it that an average fielder at each position would turn it into an out? If we knew the answer to that simple question, our job would be almost over and the results would be near-perfect. Let’s say that for a certain batted ball, say a fly ball to a certain location in center field, the answer to that question was “zero” for all fielders other than the center fielder, and for him it was 80 percent. First of all, how do we know those numbers? That’s simple too. We look at all such balls over some lengthy period of time, say five years, and we count how often each fielder catches each one of them and how often they don’t. In our example, no one but the center fielder ever catches that type and location of fly ball, and the center fielder catches it 80 percent of the time–a pretty routine fly ball that presumably all but the slowest or worst center fielders are able to catch (or those who are “out of position” for various reasons, which we shall discuss later).
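The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the actual UZR code: the record layout (ball type, zone label, fielder who made the out) and the zone name "CF-medium" are assumptions made up for the example.

```python
from collections import defaultdict

def catch_rates(plays):
    """Return {(ball_type, zone): {position: share of balls turned into outs}}.

    `plays` is an iterable of (ball_type, zone, fielder) tuples, where
    `fielder` is the position that recorded the out, or None when the ball
    fell for a hit. This record layout is a hypothetical one for illustration.
    """
    totals = defaultdict(int)
    outs = defaultdict(lambda: defaultdict(int))
    for ball_type, zone, fielder in plays:
        bucket = (ball_type, zone)
        totals[bucket] += 1
        if fielder is not None:
            outs[bucket][fielder] += 1
    # Divide each position's out count by all balls hit into the bucket.
    return {
        bucket: {pos: n / totals[bucket] for pos, n in outs[bucket].items()}
        for bucket in totals
    }

# Made-up sample: 10 fly balls to one center-field zone, 8 caught by the CF.
sample = [("fly", "CF-medium", "CF")] * 8 + [("fly", "CF-medium", None)] * 2
rates = catch_rates(sample)
print(rates[("fly", "CF-medium")])  # {'CF': 0.8}
```

Positions that never make the play simply do not appear in a bucket, which corresponds to the "zero" for every fielder other than the center fielder in the example.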

Once we know those numbers, our job is almost over. If a certain center fielder is on the field when that exact same batted ball is launched into the outfield, and he catches it, he gets credit for 20 percent of a catch. If he misses it, he gets debited 80 percent of a catch. A catch is worth the average value of an out which is around .25 runs in the current low run environment, plus the average value of a hit for that particular batted ball, which varies depending on its type and location. For example, a deep fly ball which is not caught might result in an extra base hit 75 percent of the time and a single 25 percent of the time, and a short fly ball not caught might result in a single 75 percent of the time and an extra base hit 25 percent of the time.

Let’s say that in our example, the average hit value of our batted ball is .6 runs, a little more than the value of a single. So, a catch, as compared to a non-catch, is worth .25 runs plus .6 runs, or .85 runs. A center fielder who catches the ball gets credit for 20 percent of .85 runs. When he doesn’t catch it, he is debited 80 percent of .85 runs. If our center fielder catches that ball 80 percent of the time, as often as an average center fielder, he would get credited .2 * .85, or .17 runs, 80 percent of the time, and minus .8 * .85, or –.68 runs, 20 percent of the time, which, lo and behold, adds up to exactly zero runs! A fielder who makes plays as often as an average fielder at that position must, by definition, have a UZR of zero.
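For readers who like to check the arithmetic, here is a small Python sketch of the credit and debit just described. The .25 and .6 run values are the ones from the example; the function names are invented for illustration and are not taken from any UZR code.

```python
OUT_VALUE = 0.25  # runs; the value of an out used in the example
HIT_VALUE = 0.60  # runs; the example's average hit value for this ball

def play_credit(caught, league_catch_rate,
                out_value=OUT_VALUE, hit_value=HIT_VALUE):
    """Runs credited (positive) or debited (negative) for one batted ball."""
    swing = out_value + hit_value  # a catch versus a non-catch: .85 runs
    if caught:
        return (1.0 - league_catch_rate) * swing
    return -league_catch_rate * swing

def expected_credit(fielder_catch_rate, league_catch_rate):
    """Average runs per ball for a fielder with the given catch rate."""
    return (fielder_catch_rate * play_credit(True, league_catch_rate)
            + (1.0 - fielder_catch_rate) * play_credit(False, league_catch_rate))

print(round(play_credit(True, 0.80), 3))       # 0.17
print(round(play_credit(False, 0.80), 3))      # -0.68
print(abs(expected_credit(0.80, 0.80)) < 1e-9) # True: average fielder nets zero
```

A fielder who catches this ball more often than 80 percent of the time comes out positive, and one who catches it less often comes out negative, by the difference in catch rates times .85 runs.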

Again, that seems like a simple, effective system. Once we know the type and location of a batted ball and how often each fielder catches it over the course of a season, we can tally up all of a fielder’s pluses and minuses and the result is a near-perfect accounting of a fielder’s performance, just like linear weights or wOBA for batters. Unfortunately, what seems like a simple answer to a simple question, yielding a perfect metric, turns out to be nothing of the sort. And here is where my 15-year headache begins.

Lesson #6: Multiple Fielders Complicate Things

The first of many problems I encountered was, “What to do with balls that are caught by one fielder and could have been caught by another.” Many batted balls hit to certain areas of the field are either turned into an out or not by one and only one fielder (at least during the pre-shift era) – for example a ground ball hit down the first or third base line. However, lots of balls are hit in areas in which at least two fielders can and do turn those batted balls into outs. For example, a fly ball or line drive in the left field gap might have a hit rate of 50 percent, a catch rate by the center fielder of 25 percent and a catch rate of 25 percent by the left fielder. That seems straightforward enough, at least on a hit, where we simply dock both of those fielders .25 balls each, or around .225 runs (25 percent of the sum of .65 for the hit and .25 for the out). But, what if the ball is caught? The fielder who catches the ball gets credit for half of a caught ball, or around .45 runs. What about the other fielder? For accounting purposes, we must dock him .25 balls, the same as if no one had caught the ball.

So, in that situation we end up penalizing each fielder the same amount whether no one catches the ball or the other fielder catches the ball. That doesn’t seem right, does it? Surely when the center fielder catches the ball, there is some chance that the left fielder could have caught the ball too, and vice versa. This situation often occurs when one fielder is a “ball hog,” that is, he tends to catch the majority of the batted balls that are catchable by more than one fielder (usually only one other).
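The bookkeeping for the gap example can be written out as a short Python sketch. It simply restates the accounting described above, using the example's numbers (25 percent catch rates for each outfielder and a .90-run swing between a hit and an out); the function name and data layout are made up for illustration.

```python
def shared_ball_credit(rates, catcher, swing=0.90):
    """Per-fielder run adjustments for one ball that two fielders can reach.

    rates:   {position: league catch rate for this ball type and location}
    catcher: position that made the catch, or None if the ball fell in
    swing:   run difference between a hit and an out (.65 + .25 in the example)
    """
    hit_rate = 1.0 - sum(rates.values())
    credits = {}
    for pos, rate in rates.items():
        if pos == catcher:
            credits[pos] = hit_rate * swing   # credit for the catch
        else:
            credits[pos] = -rate * swing      # docked whether or not it was caught
    return credits

gap = {"CF": 0.25, "LF": 0.25}
print(shared_ball_credit(gap, None))   # {'CF': -0.225, 'LF': -0.225}
print(shared_ball_credit(gap, "CF"))   # {'CF': 0.45, 'LF': -0.225}
```

The left fielder's line is identical in both cases, which is exactly the oddity: he is charged the same amount whether the ball drops or his teammate catches it.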

Lesson #7: Errors and Hits Are Not the Same

Another thing that I had to consider which, surprisingly, turned out to be a quagmire (a common theme in developing and refining UZR), was how to handle errors. Originally, in the early zone rating type metrics, errors were treated exactly the same as hits. After all, whether a batter reaches on an error or a hit makes little difference as far as potential run scoring is concerned, assuming the batter and runners end up on the same base(s). Remember, defensive metrics, even the most advanced ones, are primarily concerned with a binary outcome – either safe or out. Even most of the advanced metrics, like Dewan’s PM and Humphreys’ DRA, continued to treat errors the same as hits (other than perhaps the fact that their average run values might be slightly different).

I think I may have done the same thing when I designed my original simple zone rating system, but in thinking about UZR back in the ’90s, it occurred to me that errors and hits were very different in one regard even though their outcomes were very much the same. When a batted ball fell for a hit, we docked one or more fielders some percentage of a catch, depending on how often that batted ball type and location was normally turned into an out by each fielder. For example, if a ground ball in the shortstop hole is normally fielded by the shortstop 20 percent of the time, a difficult play, a shortstop gets docked 20 percent of a catch if he doesn’t make the play. What about that same batted ball that is scored an error? As I said, most defensive metrics treat it as if it were a hit, docking the shortstop the same 20 percent of a catch.

It occurred to me that the fact that the scorer deemed the play an error, by definition meant that the play could not have been a difficult one to make. So why are we docking the shortstop only 20 percent of a catch as if it were a tough play to make? I ended up using a different accounting method when a ball was scored as an error rather than a hit–one that assumes a relatively easy play, such that the fielder who makes the error is debited a larger percentage of a “play” than if the ball were a hit. Several analysts in the sabermetric community disagree with the way I handle errors. Some of their arguments are not unreasonable. Unfortunately, a full discussion of the “error issue” is beyond the scope of this article.
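To make the distinction concrete, here is one way such a rule could look in Python. This is a hypothetical sketch only: the 90 percent "easy play" rate is an assumed placeholder, not a figure from UZR, and the actual accounting used for errors is not spelled out here.

```python
# Hypothetical: the out rate assumed for any ball scored as an error.
ASSUMED_ERROR_OUT_RATE = 0.90

def missed_play_debit(league_out_rate, scored_error,
                      error_out_rate=ASSUMED_ERROR_OUT_RATE):
    """Fraction of a play debited when the fielder does not record the out."""
    if scored_error:
        # Treat the chance as relatively easy, so the debit is never smaller
        # than it would be for a routine play.
        return -max(league_out_rate, error_out_rate)
    # Ordinary hit: debit only the usual out rate for that type and location.
    return -league_out_rate

# Ground ball in the shortstop hole, normally converted 20 percent of the time.
print(missed_play_debit(0.20, scored_error=False))  # -0.2
print(missed_play_debit(0.20, scored_error=True))   # -0.9
```

Under this kind of rule the same batted ball costs the shortstop far more when it is scored an error than when it is scored a hit, which is the point of contention among analysts.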

Now, how is it possible that a batted ball can be of the same type and in the same location, yet be a difficult play one time and an error, or presumably an easy play, another time? Well, even though the answer to that is fairly obvious, it opens up a gigantic can of worms, and exposes us to the most problematic aspect of any advanced defensive metric, including UZR–positioning of the fielders, and the quality of the data.

Lesson #8: Park Factors Are Hard

Before we get to that, let’s talk a little about some of the more mundane issues I had to deal with, which also turned out to be somewhat problematic. I don’t know to what extent the other advanced defensive metrics handle park factors, but to me, they are quite important. Surely you can’t use league average catch rates for left fielders at Fenway or the vast expanses of Coors Field or even the short porches in Yankee Stadium and Minute Maid Park, among other quirky parks. As well, ground balls in Denver and Arizona scoot through the infield like a hockey puck on ice, whereas they get eaten up by the tall infield grass at Wrigley.

Again, using park factors to “neutralize” catch rates in the infield and outfield at the various parks seems pretty straightforward, but trust me, it’s not. For one thing, in the outfield, not all locations can be treated alike, even if we narrow the park factors to something like left field, center field and right field. For example, a short fly ball in left field at Fenway is more likely to be caught by an average fielder than one in Coors Field or Yankee Stadium, as the left fielder playing in front of the Green Monster is going to be normally stationed 10 or 20 feet shallower than in parks with a more expansive left field. And of course a long fly ball that might be a can of corn in most parks could be off the wall and uncatchable in left field at Fenway or in Cleveland, or in right field at Oriole Park. But where do we draw the distinction between a short and long fly ball, or left, center and right, to apply the appropriate park factors?

And what should those park factors be? We know that when computing and then applying park factors, for offense, pitching, or defense (e.g., UZR), we must regress the observed splits toward some mean or “league average.” If we know little or nothing about how or why a park might reasonably affect the stats we are adjusting, then we must assume that the mean is neutral and unbiased (i.e., a park factor of 100). However, for balls hit to left field at Fenway, or anywhere in the outfield at Coors, or ground balls through the hard and fast infields of Chase and Coors Field, we do know something even without observing the numbers.

Still, how do we establish those “means”? For example, if we observed that 70 percent of balls hit to a certain area in short left field are caught at Fenway, but only 50 percent are caught by the same fielders at all other parks, but our data cover only one year, what should we use as our park factor? Normally, we wouldn’t use the observed ratio, 7/5, or a PF of 140, because our sample size is so small–we have to regress that 140 heavily towards some mean, usually 100 (if we knew nothing about the park). But, we expect that balls hit to short left field at Fenway should be caught more often, even without looking at any of the numbers. So, what do we regress the observed 140 toward? 110? 120? 150? Honestly, I have no idea. Again, there is probably a mathematically rigorous solution to this kind of problem, but unfortunately, it is above my pay grade!
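A simple shrinkage formula shows why the choice of mean matters so much. The Python sketch below is not the method used in UZR; the sample size of 50 balls and the regression constant of 150 are invented purely to illustrate how the regressed park factor moves with the assumed prior.

```python
def regress_park_factor(observed_pf, n_balls, prior_pf=100.0, k=150.0):
    """Weighted average of the observed park factor and a prior mean.

    n_balls: number of balls observed in this park and zone (hypothetical)
    k:       regression constant; how many balls the prior is "worth"
             (hypothetical, chosen only for illustration)
    """
    return (n_balls * observed_pf + k * prior_pf) / (n_balls + k)

# Observed PF of 140 from one year of data, regressed toward different means.
for prior in (100, 110, 120):
    print(prior, regress_park_factor(140, n_balls=50, prior_pf=prior))
# 100 110.0
# 110 117.5
# 120 125.0
```

With so little data the answer is dominated by the prior, so the open question of what the "mean" for short left field at Fenway should be has a large effect on the final adjustment.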

Lesson #9: Fielders Are Everywhere

What about fielder positioning based on the outs and base runners and even the batters at the plate? If the batter is fast, the infielders are going to be playing shallower than if the batter were slow. Surely that affects their catch rates on ground balls–if you are forced to play shallow, hard hit ground balls are more likely to scoot through the infield. Same thing with outfielders and batter power. The more power the batter has, the deeper the outfielders must play. So I ended up grouping batters into various categories, slow, medium, and fast, low, medium, and high power, and then putting their batted balls into separate “buckets.” Again, not a particularly elegant solution, but, as they say, it’s good enough for government work!
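As a rough illustration of the bucketing, here is a Python sketch. The 0 to 100 speed and power scores and the cutoffs of 33 and 67 are assumptions invented for the example; the real groupings may be drawn differently.

```python
def three_way_label(score, labels, low_cut=33, high_cut=67):
    """Map a 0-100 score onto one of three labels (cutoffs are hypothetical)."""
    if score < low_cut:
        return labels[0]
    if score < high_cut:
        return labels[1]
    return labels[2]

def batter_bucket(speed_score, power_score):
    """Return the (speed, power) category pair used to group batted balls."""
    speed = three_way_label(speed_score, ("slow", "medium", "fast"))
    power = three_way_label(power_score, ("low", "medium", "high"))
    return speed, power

# A slow slugger and a fast slap hitter land in different buckets, so their
# batted balls are compared against different baseline catch rates.
print(batter_bucket(20, 80))  # ('slow', 'high')
print(batter_bucket(85, 15))  # ('fast', 'low')
```

Each batted ball would then be looked up by its type, location, and the batter's bucket, so that catch rates are only compared among balls for which fielders were likely positioned in a similar way.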

I also faced a similar problem with respect to base runners and outs. Certain fielders would be positioned differently, depending on what runners were on what base, and to some extent, the number of outs. For example, with a runner on first and no outs, the first baseman almost always starts out on the bag, holding the runner, therefore limiting his range in the first base hole. At the same time, with fewer than two outs, the second baseman and shortstop are a bit shallower and pinched in towards the second base bag, in what is called “double play depth.” In some cases, with no outs, or with one out and a pitcher at bat, the third baseman is playing up in anticipation of a bunt. Again, these non-standard fielder positions must be accounted for in the UZR “engine.” And let’s not even talk about “shifts,” which have been occurring at higher and higher rates around the league starting around 2012 or 2013. Currently, UZR completely ignores batted balls that are affected by a shift.

As you can see, fielder positioning is a critical part of ascertaining how often a fielder “should” turn a certain batted ball into an out. To some small extent, it is actually part of a fielder’s skill–at least that is the refrain you often hear. For example, Cal Ripken Jr., an excellent fielding shortstop in his heyday, despite not being fleet of foot, was considered particularly adept at being in the “right place at the right time.” However, in the grand scheme of things, fielder positioning is much more a function of the batter and the game situation (and the manager and coaches), and thus a good fielding metric must somehow try to estimate and account for where each fielder might be positioned at any point in the game, based on the batter, pitcher, base runners, outs, score, inning, etc. As you can clearly see, this is not such an easy task.

Lesson #10: We Are at the Mercy of the Data

Finally, we get to perhaps the stickiest of problems in crafting and calibrating a batted-ball-based defensive metric like UZR–the quality of the data. Ideally, we want to know two things when a ball is put into play: One, the exact starting position of each relevant fielder, and two, the exact location and character of the batted ball. We’ve already discussed the first part and how we can estimate that. The second part is even more difficult and even under the best of circumstances can only be done on an aggregated basis.

For example, originally, with Project Scoresheet, the data recorded only four types of batted balls–fly balls, pop flies, line drives, and ground balls, and the recorded location was maybe a 10 foot by 20 foot swath of the field. Obviously not all fly balls or line drives are created equal, and all three “air ball” categories overlapped to some extent. As well, there were biases in the recording of the data, depending on the “stringer” (the person who does the data recording), the park, whether the game was being watched on television or in person, and even the players on the field (something called “range bias,” whereby the “rangier” the fielder, the closer to his original positioning the stringers would tend to record the location of the ball). Later on, with BIS and STATS, and other data companies, we had access to more granular data, such as the “exact” location of each batted ball, a few more “type” categories, such as fly ball and line drive “fliners” (in between a line drive and a fly ball), the speed of the ball, soft, medium, and hard, and in the last few years, the actual air or ground time, in tenths of a second. Needless to say, even with these tremendously helpful details, they are all only approximations and we are still at the mercy of the stringers’ judgment.

As you can see, while the premise of an advanced batted ball fielding metric seems relatively straightforward–how often is each type and location of batted ball fielded by position X, and did fielder Y make the play or not?–it is hardly that. Our primary impediments are fielder positioning and the quality of the data. The best we can do is approximate both, given the data we have access to. Then there are issues of ball hogging (fielder interactions) and park effects to contend with. With the introduction of some of the newer, computer and video-based data recording techniques, like HITf/x and FIELDf/x, we are getting closer and closer to the holy grail of fielding metrics. In the meantime, despite the gloom and doom you have just been introduced to, current advanced fielding metrics are actually quite good. Once we have a season or two of data, I think that they give us a relatively accurate indication of fielding talent and historical value. The methods employed by UZR and other defensive stats may seem a bit convoluted and even somewhat jury-rigged, but as Bill James once said, the messy truth is better than a tidy lie.


Comments

  1. jim S. said...

    “It occurred to me that the fact that the scorer deemed the play an error, by definition meant that the play could not have been a difficult one to make.”

    By definition? By WHOSE definition? The official scorer? One guy? Seriously?

    It seems to me that evaluating the difficulty of a fielding play is by far the most difficult problem, which includes all the other variables you mentioned, such as positioning, park factors, game situation, etc. Here is my solution: use experienced human eyes. I would have a group of SABR volunteers (say 10 or 12 for each play) grade every play on a score of zero to 10; of 10 major-league shortstops (for example), how many — in that position and in that game situation — make this play? Drop out the highest and lowest grades (or not, since this is a math question I’m not qualified to answer) and, voila, you have a pretty decent difficulty score. There are all kinds of other evaluations, such as an infielder’s throw to first and did the firstbaseman have to scoop or leap for the throw. But that’s the basic idea.

    • Tangotiger said...

      If I had 10 or 12 volunteers for each game, I would do far more than that.

      ***

      The problem is that we are limited to the data in hand. MGL’s point is that given that you have one official scorer, then it’s more likely than not that the error play was easier to make than the hit play. We don’t have to conclude it was a 100% out play, but at least, it was probably (somewhat) easier to get an out on that play than the hit play.

      • Gabriel said...

        Seems like it ought to be possible to create a mobile app for fans to provide their scouting report of each individual play as they watch the game. Just like there are fans who are happy to go to games and keep score, I’m sure there’d be fans who’d be happy to tap on their iPhone what they think the difficulty of the last play was and whether they thought the fielder did well (and whatever other questions were appropriate). The point would be that instead of getting a handful of volunteers, you just put the app out there and encourage people to use it.

        Now I’m no programmer (and I don’t even have a smartphone), but certainly this seems feasible. Think of it as the “Yelp” of plays in the field.

    • Oh Beepy said...

      Are you aware of how many plays there are in baseball each year?

      Do you realize how incredible it is that even one person sat there and classified each one manually? Can you comprehend the amount of resources required to have ’10-12 Experienced SABR Volunteers’ review each and every play in major league baseball?

      • said...

        It can be done, I think, especially as a crowdsourced type of thing. Believe it or not, as I said, several of the companies are doing that and while they may not have 10-12 people watching each play, I think they have more than one person watching every play.

      • jim S. said...

        Hey, wait a minute guys; I had a plan. Not saying it would necessarily work, but here goes:

        There are about 94 MLB games per week during the season. With, say, 10 evaluators per game, that’s 940 pairs of eyeballs per week. Now, at SABR, we have a lot of old, retired guys like myself with some time on their hands. So we’d form a committee of, say, 60 guys who would volunteer to watch about 16 games each per week. Then we’d work out a deal with MLBAM for a full version of each of its condensed games to be available to the evaluators. An average full condensed game might run 18 minutes, so we’ll say each game might require 45 minutes to analyze. So that’s 12 hours per week per evaluator, and less with more than 60.

        Interestingly, I proposed this idea to Vince Gennaro, president of SABR, and his response was, well, John Dewan at BIS is already doing something like this. Well, we can be pretty sure that Dewan isn’t getting too many sets of eyeballs on each play for his proprietary numbers. In fact, I asked him this very question at Phoenix this March and his response was “Fifty.” I’m sure he wasn’t serious, but he DID sound overly sensitive.

        And, Tango, what WOULD you do with 10-12 volunteers per game? I’d love to hear some ideas.

      • Catoblepas said...

        I believe this is almost exactly what Inside Edge fielding data is! They have scouts watching every play and defining them as “Impossible” (0%), “Remote” (1-10%), “Unlikely” (10-40%), “Even” (40-60%), “Likely” (60-90%), or “Routine” (90-100%). You can go to a player page on Fangraphs and see what percentage of the plays in each given bucket they make. http://www.fangraphs.com/blogs/inside-edge-fielding-data/

    • said...

      First of all, the argument that an error is not necessarily an easy play, because it is subject to the whims of the scorer, is a poor one. We only care about aggregates when it comes to virtually any aspect of any metric. You can say the same thing about offense. A hit is sometimes a duck snort to the OF or a bleeder through the IF, and an out is sometimes a screaming line drive. But, in the aggregate, hits are better struck balls than outs, and that is why they have predictive value (at least for batters).

      So while an error is sometimes an easy play or a bad throw on a routine ground ball, and other times it is somewhere between a hit and an error (and a hit is often really a botched easy play and should be an error), clearly on the average, an error is a much easier chance than a clean hit. And that’s all we care about.

      That being said, yes, of course a subjective evaluation of the result of a batted ball with respect to the fielder is a great additional piece of data, perhaps the only piece of data that you need. Some of the data providers, like BIS and Inside Edge, are starting to provide that and that will be quite helpful in constructing defensive metrics.

  2. tz said...

    Mitchel, thanks for all these insights. I really enjoyed the article.

    Do you know if anyone tracks the newer fielding metrics by batter? For example, is there a way to find the net UZR on the balls that Mike Trout has put into play?

    • said...

      Well, we have the raw data, so that can easily be done. The problem with that is you will probably find that the batters that hit the ball harder (like Trout I assume) will likely have a net negative UZR, since UZR and other defensive metrics can only approximate the speed (and location) of the batted ball.

      Interestingly we don’t find that with pitchers, which is commensurate with what we know of DIPS. Pitchers simply do not allow hard or soft hit batted balls as part of their skill set. If a pitcher has a positive or negative “UZR” in one year, he is just as likely to have the same or opposite value the next year.

  3. Randy Poffo said...

    I love first-person accounts wherein groundbreakers take us readers through the thinking they were doing while they were doing their groundbreaking. Thank you, MGL; my respect for what you and Tom Tango have done is off the charts.

    Any comment on Field f/x, or what(ever) passes for it these days?

    • MGL said...

      The new system that tracks virtually everything that moves on the field is going to be a boon for defensive evaluation. Then again, once you combine scouting with even a simple defensive metric like Total Zone, you already have a pretty good idea as to the value of a fielder after a season or two. So it’s not like field f/x is going to be a Moneyball breakthrough.

  4. yeah said...

    So, an advanced metric like UZR only takes into account an outfielder’s ability to catch fly balls and nothing else? It doesn’t care about preventing extra bases or throwing out runners who try to tag up? I know what you do is very complex already but this still seems like an overly simplistic representation that might in fact be misleading at times.

    For example, if an OF throws out 10 runners at the plate in a year (hypothetically!) that would add a lot of value to the team (could potentially even add multiple wins to the team).

    It seems like, to me, despite all of the hard work by you and others it is still a science in its infancy….

    • MGL said...

      Yes, there is a separate part of UZR which tracks how often outfielders prevent or allow extra bases on hits, fly balls, ground balls, etc., and how many assists they make at the bases, based on the location and type of batted ball, the outs, etc. on fangraphs, it is under “arms.”

    • yeah said...

      Thanks guys! Looks great, those top guys all have cannons so that is nice to see.

      Is arm a counting stat? If so, would Arm / IP be a relevant stat?

      Does Fangraphs UZR/150 include all defensive metrics (range, arm, error)?

  5. DrDave said...

    Hey, great to hear some recognition of Sherri Nichols and Pete Decoursey again — that was truly pioneering work. I remember devouring my paper copy of “The Philadelphia Baseball File” that introduced Defense Average, back in… dear God, can it really have been 1989? Wow.

  6. pft said...

    Good stuff, you can certainly see the challenges and the shifting that goes on now must make it more problematic.

    As someone who loves looking at fielding stats I am a bit frustrated at not being able to see how each play is handled by the metrics as a form of validation. Lacking that, and I can understand why it may not be possible to have play by play, it would be nice to have game summaries, or at the very least home-away splits. Of course, I pay nothing so who cares, but some of us would pay like we do for BRPI.

    Having to make do with a single number which is simply a cumulative total of a season’s worth of play, with no way to validate it, leaves the user no choice but to accept the numbers as a matter of faith or reject them all as skeptics. Sometimes we do both and use the numbers we like and agree with and throw away those we don’t.

  7. Steve said...

    Thank you. This was an incredible read, and possibly the best series, overall, that I’ve read at Hardball Times. Thanks for the insight into defensive metrics.

  8. Dave said...

    Someday, we’ll be watching a game at home, and from the crack of the bat to the pitcher-batter camera cutting to the outfield camera, the stat will pop up of what % of the time the play will be made…but we’re not there yet.
