Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.
Wednesday, July 28, 2010

A very important article on fielding

Posted by Mike Fast
Baseball Prospectus has published an article by Colin Wyers today that may be one of the most important pieces written about fielding measurement in the last decade. The full piece is available only to BP subscribers, but let me briefly recap some of the topics Colin covers.

Colin reiterates the point that uncertainty in fielding measurements can be tackled with bigger sample sizes, i.e., more seasons of data. Bias, on the other hand, is persistent: it does not decrease with larger samples of fielding data. He mentions two types of bias: one related to the park and scorer, and one related to the fielder's range.

He then outlines a clever method for using data like putouts and assists in order to develop a fielding metric for infielders that should be much less subject to those two sources of bias than our current advanced metrics like Ultimate Zone Rating (UZR), Plus/Minus, and TotalZone. His metric very likely has greater uncertainty than the advanced fielding metrics that use ball-in-play data to determine which fielder had the best chance to field a batted ball. However, at some point, larger sample sizes should decrease the effect of the uncertainty, such that the reduction in bias using Colin's method will actually produce more accurate measures of fielding. Is Colin's method better after two seasons? Three seasons? Five seasons? Because we don't yet know the size of the park-scorer bias or range bias, we don't know exactly at what point that occurs.

Colin gives some fielding numbers from his system for shortstops, and with them, margins of error! That in itself is a very important advancement. He also shows that the advanced fielding metrics appear to compress the measure of Ozzie Smith's fielding value by about 25% over his career.

As I mentioned in the comments to Colin's article:
Colin, as I mentioned on Twitter, can you use these numbers to estimate the magnitude of range bias for various advanced fielding systems (and at various positions)? Over a large sample of players, the park-scorer bias should become much less important.

If the ~70 run difference for Ozzie Smith is due to range bias, and 1 play = 0.8 runs, and Ozzie played about the equivalent of 17 seasons, then 70 / 0.8 / 17 = about 5 plays per season, or roughly 4 runs per season, due to range bias.

If we apply the same method to a large group of players, we ought to be able to estimate the range bias.
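The arithmetic in that comment can be spelled out in a few lines. This is only a sketch of the back-of-the-envelope estimate, using the rounded inputs quoted above (70 runs, 0.8 runs per play, 17 seasons); the variable names are illustrative. Note that dividing runs by runs per play gives plays, so the quotient is about 5 plays per season, which converts to roughly 4 runs per season.

```python
# Back-of-the-envelope range-bias estimate for Ozzie Smith.
# All inputs are the rounded figures quoted in the comment above.
career_gap_runs = 70.0   # approximate career difference cited for Ozzie Smith
runs_per_play = 0.8      # assumed run value of one play
seasons = 17.0           # approximate full-season equivalents played

gap_plays = career_gap_runs / runs_per_play    # about 87.5 plays over the career
plays_per_season = gap_plays / seasons         # about 5.1 plays per season
runs_per_season = career_gap_runs / seasons    # about 4.1 runs per season

print(f"{plays_per_season:.1f} plays per season")
print(f"{runs_per_season:.1f} runs per season")
```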

Colin showed that the margin of error in his system for a full season of fielding by a shortstop was around 20 runs. Since independent random errors add in quadrature, the margin of error for three seasons of shortstop data would be around 35 runs (20 times the square root of 3), or about 12 runs per season.
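The scaling in that paragraph follows from adding independent errors in quadrature. As a minimal sketch, assuming each season's error is independent and has the same 20-run margin (the function name is made up for illustration):

```python
import math

def margin_over_seasons(one_season_margin, n_seasons):
    """Return (total, per-season) margin of error after n_seasons,
    assuming independent, equal-sized errors that add in quadrature."""
    total = one_season_margin * math.sqrt(n_seasons)
    return total, total / n_seasons

total, per_season = margin_over_seasons(20.0, 3)
print(f"total: {total:.1f} runs, per season: {per_season:.1f} runs")
```

For three seasons this gives about 34.6 runs in total and 11.5 runs per season, which round to the 35 and 12 quoted above.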

If we guess that advanced metrics can cut this uncertainty in half, that puts them at six runs of uncertainty per season on three seasons' worth of data, plus whatever bias they may have. At what sample size does that bias become bigger than the improvement in the uncertainty from using subjective ball-in-play data? It varies according to the player, for one thing, but my crude guesses suggest that for samples larger than roughly two to four seasons, Colin's method could be superior.
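One way to make that crude comparison concrete: if the advanced metrics halve the random uncertainty, their advantage over Colin's method is half of Colin's per-season margin, and that advantage shrinks with the square root of the number of seasons while the bias stays fixed. The sketch below solves for the number of seasons at which the two are equal. It assumes a 20-run one-season margin for Colin's method, a halving for the advanced metrics, and a simple linear comparison of bias against the uncertainty saved. The function name and the bias values are purely illustrative, since the real size of the bias is not yet known.

```python
def break_even_seasons(one_season_margin, bias_per_season, reduction=0.5):
    """Seasons at which a fixed per-season bias equals the per-season
    uncertainty saved by the lower-variance metric.

    Saved uncertainty after n seasons: reduction * one_season_margin / sqrt(n).
    Setting that equal to bias_per_season and solving for n gives the result.
    """
    saved_one_season = reduction * one_season_margin
    return (saved_one_season / bias_per_season) ** 2

for bias in (5.0, 7.0):  # hypothetical per-season biases, in runs
    n = break_even_seasons(20.0, bias)
    print(f"bias {bias:.0f} runs/season -> break-even near {n:.1f} seasons")
```

With these illustrative biases of five and seven runs per season, the break-even falls at about four and two seasons respectively. A smaller bias pushes the break-even out quickly, because it scales with the inverse square of the bias.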

Not to go too far down that road, because there's a lot of work to be done yet, but hopefully that shows why I am excited about what Colin has published today.



Mike Fast is a Royals fan who enjoys investigating baseball questions using data of many sorts. He is a member of Complete Game Consulting. He welcomes comments via e-mail.


Comments

Mike Fast said...

Tom Tango has posted his thoughts about Colin’s article at the Book blog:
http://www.insidethebook.com/ee/index.php/site/comments/reducing_bias_in_fielding_metrics/#comments

Posted 07/28 at 12:56 PM