Fielding Stats at the Hardball Times

We’ve just added something new to our array of 2007 baseball statistics: Revised Zone Rating, updated daily as defined by Baseball Info Solutions. Before I explain what makes our Revised Zone Ratings special, I’d like to go over the “big picture” of fielding stats. Can’t help myself; I’m a big picture guy.

The big picture is simple: Good fielders turn batted balls into outs more often than bad fielders, right? Well, last year, there were 133,200 balls put into play by batters (excluding home runs, but including foul balls that were caught for outs) and 91,848 of them were turned into at least one out (some were double plays and one was a triple play). So, 69% of all batted balls were fielded for outs. In the 1980s, Bill James took this calculation and turned it into a useful statistic, called Defensive Efficiency Ratio (or DER).
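As a quick check of that arithmetic, here is a minimal Python sketch of the simple version of the calculation. The function name is my own, and this is only the basic ratio described above, not necessarily every detail of how THT computes DER.

```python
def defensive_efficiency(outs_on_balls_in_play: int, balls_in_play: int) -> float:
    """Share of balls in play that were turned into at least one out."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return outs_on_balls_in_play / balls_in_play


# Major league totals quoted in the paragraph above
league_der = defensive_efficiency(91_848, 133_200)
print(f"{league_der:.3f}")  # 0.690
```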

The major league average DER last year was .690, based on how THT calculates it (there are several different ways to calculate DER, but that’s a discussion for another day). You might consider DER the most fundamental fielding stat there is, but it isn’t a clean fielding stat. For one thing, some balls are easy to field, while others are kind of hard to get to. Here’s a list of how often each type of batted ball occurred:

  • Nearly half (about 60,000) of last year’s batted balls were ground balls; 74% were turned into outs, ranging from 70% to 78% between teams.
  • 33,000 were outfield flies; 89% were caught for outs; teams ranged from 87% to 91%.
  • 19,000 were line drives; 25% of those were outs (range of 22% to 28%).
  • A little over 13,000 were “fliners” (between a line drive and a fly ball), a batted ball type collected only by BIS; 46% of those were turned into outs, with ranges from 42% to 54%.
  • Slightly more than 5,000 were infield flies (99% of those were outs, with very little difference between teams).
  • And there were about 3,000 bunts last year; 79% were outs (range from 64% to 87%).
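To see how the batted ball mix rolls up into the overall DER, here is a small Python sketch that weights each type's out rate by its count. The counts are the rounded figures from the list above (so the total comes to 133,000 rather than 133,200), and the dictionary layout is just my own illustration.

```python
# (approximate count, share turned into outs), rounded as in the list above
batted_balls = {
    "ground ball": (60_000, 0.74),
    "outfield fly": (33_000, 0.89),
    "line drive": (19_000, 0.25),
    "fliner": (13_000, 0.46),
    "infield fly": (5_000, 0.99),
    "bunt": (3_000, 0.79),
}

total_balls = sum(count for count, _ in batted_balls.values())
total_outs = sum(count * rate for count, rate in batted_balls.values())
implied_der = total_outs / total_balls

# The rounded inputs land very close to the league DER of about .690
print(total_balls, round(total_outs), f"{implied_der:.3f}")
```

Even with rounded inputs, the weighted average comes out at roughly .690, which is why a staff that allows more line drives or fewer infield flies will move its team's DER regardless of how well the fielders play.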

As you can imagine, the type of batted ball allowed by each pitching staff has a good-sized impact on that team’s DER.

Some pitching staffs induce more infield flies (the most last year was 213, by Oakland; the smallest number of infield flies allowed was 142, by the Pirates) while others allow more line drives (699, the Royals vs. 564, the Angels). You shouldn’t really judge fielders as if their pitching staffs all allow the same number of batted ball types, and that’s why THT has been posting its team “plus/minus” stats for the past year.

Our team “plus/minus” stat is an improvement over DER, because it corrects for the type of batted ball allowed. In other words, it adjusts the thinking behind DER by the number of infield flies, line drives and all other types of batted balls allowed by each pitching staff. I explained the system in more detail when we rolled it out last year.

In addition to adjusting DER, the “plus/minus” stat expresses each team’s fielding performance as the number of plays above and below the expected number of plays an average team would turn. This is a more concrete and useful way of thinking about fielding.
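Here is a rough Python sketch of the idea, assuming the adjustment is simply "league out rate for each batted ball type times the number of balls of that type the team allowed." That assumption is mine, as are the function name and the team totals, which are made up for illustration; the actual THT calculation is described in the earlier article and may differ.

```python
# League out rates by batted ball type, taken from the list earlier in the article
LEAGUE_OUT_RATE = {
    "ground ball": 0.74,
    "outfield fly": 0.89,
    "line drive": 0.25,
    "fliner": 0.46,
    "infield fly": 0.99,
    "bunt": 0.79,
}


def team_plus_minus(balls_by_type: dict, outs_by_type: dict) -> float:
    """Plays made above (+) or below (-) what an average team would make
    on the same mix of batted balls."""
    expected = sum(n * LEAGUE_OUT_RATE[kind] for kind, n in balls_by_type.items())
    actual = sum(outs_by_type.values())
    return actual - expected


# Hypothetical team totals, invented purely for illustration
balls = {"ground ball": 2000, "outfield fly": 1100, "line drive": 630,
         "fliner": 430, "infield fly": 170, "bunt": 100}
outs = {"ground ball": 1500, "outfield fly": 980, "line drive": 150,
        "fliner": 200, "infield fly": 168, "bunt": 80}

plus_minus = team_plus_minus(balls, outs)
print(f"{plus_minus:+.1f}")  # +16.4 plays versus an average team
```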

We took the “plus/minus” format from John Dewan’s Fielding Bible, which was published over a year ago. John, who is the majority owner of BIS, took a very detailed approach to fielding analysis and developed a “plus/minus” statistic to quantify the performance of each individual fielder. His system, which is very similar to MGL’s Ultimate Zone Rating (UZR) and David Pinto’s Probabilistic Model of Range (PMR) (which were rolled out several years prior to the publication of the Fielding Bible), includes not only the type of batted ball involved, but also how hard it was hit and where it was hit.

In the last two Hardball Times Annuals (2006 and 2007), John was nice enough to publish his plus/minus stats collated for each major league team. Of the three general team fielding stats (DER, THT’s plus/minus stats and John’s plus/minus stats), John’s are clearly the best because they include the most adjustments. John also applied his methodology to individual players (his team stats are the collation of individual players’ stats), which you can peruse in the Fielding Bible or the most recent Bill James Handbook.

If you want to better understand the process behind categorizing the location of a batted ball, be sure to read John Walsh’s two basic, yet excellent articles on the subject: Infield Defense—Back to Basics and Infield Defense Part 2—The Next Step. The latter article looks at the question of adjusting the fielding stats for left-handed and right-handed batters; something that UZR does but the Fielding Bible doesn’t do.

Anyway, that’s not what I want to talk about. I want to talk about Revised Zone Rating.

Zone Rating was invented by John when he was at his old company, STATS, Inc., in the late 1980s. I forget which book first published it, but I associate Zone Rating with STATS’ great Baseball Scoreboard series of the 1990s. The central idea is to evaluate the fielding of individual players by analyzing only those zones in which the average fielder at that player’s position fields at least 50% of balls for outs. This method allows you to split the playing field between fielders and assign responsibility for many batted balls.

There were about 71,500 balls hit into zones last year (not including pitchers’ zones), out of the 133,200 batted balls I mentioned before. The gap is partly because certain types of batted balls are excluded. Zone Rating doesn’t include bunts, for instance, or infield flies. For infielders, Zone Rating only includes ground balls that travel at least 69 feet (for middle infielders) or 59 feet (for corner infielders). And many batted balls don’t fall into the “50% zones” at all.

Here’s a list of the number of balls that have been hit into each position’s zone this year (through Sunday’s games), the number of plays made on those balls, and the subsequent average Zone Rating of each position:

POS      Balls in Zone  Plays Made  Zone Rating
1B            2,779        2,083       0.750
2B            5,113        4,292       0.839
3B            4,518        3,093       0.685
SS            5,281        4,349       0.824
LF            3,825        3,264       0.853
CF            5,100        4,570       0.896
RF            3,941        3,428       0.870
Total        30,557       25,079       0.821
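The Zone Rating column is just plays made divided by balls in zone. A short Python sketch that reproduces it from the counts in the table (the names are my own):

```python
# (balls in zone, plays made), copied from the table above
ZONE_TOTALS = {
    "1B": (2_779, 2_083),
    "2B": (5_113, 4_292),
    "3B": (4_518, 3_093),
    "SS": (5_281, 4_349),
    "LF": (3_825, 3_264),
    "CF": (5_100, 4_570),
    "RF": (3_941, 3_428),
}


def zone_rating(plays_made: int, balls_in_zone: int) -> float:
    """Share of balls hit into a position's zone that were turned into outs."""
    return plays_made / balls_in_zone


for pos, (biz, plays) in ZONE_TOTALS.items():
    print(f"{pos:<6}{zone_rating(plays, biz):.3f}")

total_biz = sum(biz for biz, _ in ZONE_TOTALS.values())
total_plays = sum(plays for _, plays in ZONE_TOTALS.values())
total_rating = zone_rating(total_plays, total_biz)
print(f"{'Total':<6}{total_rating:.3f}")
```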

The position that really sticks out is third base, where the average Zone Rating is much lower than that for other positions. I assume this means that the third base zone is relatively wider for third basemen, with several zones close to the 50% cutoff. But I don’t really know what’s behind the third base numbers.

When fielders make plays on balls outside their zones, they’re given credit for an “Out of Zone” play. In the STATS Zone Rating metric, still published by ESPN, among others, plays made out of the zone are included in a player’s numerator (the “Plays Made” total) and denominator (the “Balls in Zone”) in order to give the player credit for wider range. There are all sorts of problems with this approach, as identified by JC Bradbury and Chris Dial.

At Baseball Info Solutions, John decided to rectify his earlier mistake and simply list plays made outside of the zone separately from the Revised Zone Rating. For instance, here is a table of the total number of plays made outside each position’s zone (based on this year’s numbers):

POS      Balls in Zone  Plays Made  Plays Outside Zone
1B            2,779        2,083                 425
2B            5,113        4,292                 654
3B            4,518        3,093                 744
SS            5,281        4,349                 816
LF            3,825        3,264                 686
CF            5,100        4,570                 867
RF            3,941        3,428                 688
Total        30,557       25,079               4,880

Overall, out-of-zone plays amount to about 20% of the plays made in zone (4,880 versus 25,079), though there are some differences between positions. Remember, this isn’t a reflection of the fielders; it’s a reflection of how the zones are drawn for each position.
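To make the difference between the two approaches concrete, here is a Python sketch that computes both from the table above. The STATS-style formula follows the description given earlier (out-of-zone plays added to both numerator and denominator); applying it to BIS counts is my own shortcut for illustration, since the two companies may not draw their zones identically.

```python
# (balls in zone, plays made in zone, plays made out of zone), from the table above
POSITIONS = {
    "1B": (2_779, 2_083, 425),
    "2B": (5_113, 4_292, 654),
    "3B": (4_518, 3_093, 744),
    "SS": (5_281, 4_349, 816),
    "LF": (3_825, 3_264, 686),
    "CF": (5_100, 4_570, 867),
    "RF": (3_941, 3_428, 688),
}


def revised_zone_rating(biz: int, plays: int) -> float:
    """In-zone plays only; out-of-zone plays are reported separately."""
    return plays / biz


def stats_style_zone_rating(biz: int, plays: int, ooz: int) -> float:
    """Out-of-zone plays folded into both numerator and denominator."""
    return (plays + ooz) / (biz + ooz)


for pos, (biz, plays, ooz) in POSITIONS.items():
    print(f"{pos}  revised {revised_zone_rating(biz, plays):.3f}"
          f"  STATS-style {stats_style_zone_rating(biz, plays, ooz):.3f}"
          f"  out-of-zone vs. in-zone plays {ooz / plays:.1%}")

total_plays = sum(plays for _, plays, _ in POSITIONS.values())
total_ooz = sum(ooz for _, _, ooz in POSITIONS.values())
ooz_share = total_ooz / total_plays
print(f"Overall out-of-zone vs. in-zone plays: {ooz_share:.1%}")
```

Folding the out-of-zone plays into the ratio lifts every position's number and blurs two different skills together, which is one reason to keep the columns separate.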

This improved version of Zone Rating (we now call it Revised Zone Rating) is what The Hardball Times now carries on its website, updated daily. You can’t find this data anywhere else on the web.

The best fielding evaluation systems are those that don’t split the field into zones, but handle each zone as part of a broader continuum with position responsibility overlapping in some zones. UZR, PMR and BIS’s plus/minus system all do this; Revised Zone Rating doesn’t. So why aren’t we publishing those stats? Because we can’t afford them. (subliminal message: buy more THT stuff!) But we can run a little research to see how well Revised Zone Rating performs.

I calculated the correlation between each of the team fielding measures described at the beginning of this article with the Fielding Bible plus/minus system, and here’s what I found (2006 season only; an R-squared of one is a perfect correlation):

  • DER had an R-squared of about .3 with the plus/minus system. Not terrible, but not terribly good. When I looked at this a year ago, I found an R-squared of .5. Don’t know what happened in 2006.
  • THT’s plus/minus system achieved an R-squared of about .5. A big improvement, but a lot of room to improve further.
  • Then I took all the Revised Zone Rating variables (balls in zone, plays in zone and plays out of zone) for each team, compared them to the plus/minus system and found a correlation of .7. Better still. Not perfect, but pretty good.
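For readers who want to run this kind of comparison themselves, here is a minimal Python sketch of R-squared for the single-variable case (say, team DER against team plus/minus). It is a generic textbook calculation, not the exact procedure used for the figures above, and the sample numbers are invented purely to show the call.

```python
def r_squared(x: list, y: list) -> float:
    """Squared Pearson correlation between two equal-length series.

    For a one-variable linear regression this equals the share of the
    variance in y that the fitted line explains.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values for five teams, made up for illustration only
team_der = [0.675, 0.682, 0.690, 0.698, 0.705]
team_plus_minus = [-40, -5, -12, 20, 35]
print(f"{r_squared(team_der, team_plus_minus):.2f}")
```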

I also found that the regression equation for Revised Zone Rating stats makes a lot of sense. To approximate the plus/minus system, here’s what you do:

  • Multiply the number of balls in zone by a negative .88 (-.88). This means that an unfielded ball in a zone is worth almost an entire “minus.”
  • Multiply each fielded ball in a zone by .96. The impact of a fielded ball in zone, then, is a plus .08 (.96 minus .88). Balls fielded well in the zone help, but not a lot. Those are supposed to be fielded well.
  • Multiply each ball fielded outside the zone by .48. A ball fielded outside a zone has a much bigger impact on the plus/minus system, and it should. The impact is six times greater than that of a ball fielded in zone.

If you think about it, those weights make a lot of intuitive sense. The big difference in weights between in- and out-of-zone balls also reinforces the need to consider them separately, instead of combining them into one stat.
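The three weights can be written as one small function. This Python sketch uses only the weights listed above; no intercept is given here, so I leave one out, which means the level of the result isn't meaningful on its own and only differences are. The function name and the team totals are hypothetical.

```python
BALL_IN_ZONE_WEIGHT = -0.88      # every ball hit into a zone
PLAY_IN_ZONE_WEIGHT = 0.96       # every in-zone ball turned into an out
PLAY_OUT_OF_ZONE_WEIGHT = 0.48   # every play made outside the zone


def approx_plus_minus(balls_in_zone: int, plays_in_zone: int,
                      plays_out_of_zone: int) -> float:
    """Weighted sum of the three Revised Zone Rating counts (no intercept)."""
    return (BALL_IN_ZONE_WEIGHT * balls_in_zone
            + PLAY_IN_ZONE_WEIGHT * plays_in_zone
            + PLAY_OUT_OF_ZONE_WEIGHT * plays_out_of_zone)


# Hypothetical team totals, used only to show the marginal effect of one more ball
base = approx_plus_minus(2400, 1970, 380)
fielded_in_zone = approx_plus_minus(2401, 1971, 380) - base
missed_in_zone = approx_plus_minus(2401, 1970, 380) - base
fielded_out_of_zone = approx_plus_minus(2400, 1970, 381) - base

print(f"in-zone ball fielded:  {fielded_in_zone:+.2f}")      # +0.08
print(f"in-zone ball missed:   {missed_in_zone:+.2f}")       # -0.88
print(f"out-of-zone play made: {fielded_out_of_zone:+.2f}")  # +0.48
```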

The equation works better for infielders than outfielders, presumably because infield Revised Zone Rating is based on only ground balls, while outfielders are judged on fly balls, line drives and fliners. In fact, I don’t suggest that you apply the math to individual players, because the equation will differ considerably from position to position. And I personally like breaking out the stats, because that gives you more information.

So that’s the current story of THT’s fielding stats. Remember, Revised Zone Rating is only a measure of fielding range. There are other things to consider when judging fielders, such as the ability to turn the double play, handle bunts, throw out runners, back up plays, etc. We’re still a long way from the “ultimate” fielding statistic. But it’s a start.

References & Resources
John Dewan deserves a lot of credit for being open to constructive feedback. In this thread at the Book Blog, we discussed several things that John could do to improve Zone Rating. He took a couple of suggestions to heart and sent THT upgraded Zone Ratings for 2006. The 2007 Zone Ratings reflect those changes as well. If you use the historical Zone Rating stats, remember that the 2004 and 2005 Zone Ratings were calculated differently.
