A Short Digression into Log5

by Dan Fox
November 23, 2005

"Nellie was the toughest out for me. In 12 years I struck him out once, and I think the umpire blew the call." - Whitey Ford, New York Yankees pitcher, on Nellie Fox

In my previous article exploring the significance of batter/pitcher matchups, I used the log5 method in order to calculate the expected average for each matchup. Additional discussion on the results from that article can be found on my blog.

For those who understandably lacked the patience to wade through that rather long tome, a quick recap: Bill James published the method in the 1981 Baseball Abstract in order to analyze how well one team should play against another. That use of the method is covered comprehensively in an article on team matchups by Tom Tippett at Diamond Mind.

When applied to batter/pitcher matchups, the formula takes the hitter's average (BAVG), the pitcher's average against (PAVG), and the league average (LgAVG), and derives what the hitter should hit against that pitcher (ExAvg). The entire formula is:

ExAvg = ((BAVG * PAVG) / LgAVG) / ((BAVG * PAVG) / LgAVG + ((1 - BAVG) * (1 - PAVG)) / (1 - LgAVG))
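
For readers who would rather see the formula as code, here is a minimal Python sketch. The function name and argument names are my own choices for illustration, not anything from James or Levitt, and the sample inputs are made up.

```python
def log5_expected_avg(bavg, pavg, lgavg):
    """Expected batting average for a batter/pitcher matchup via log5.

    bavg  -- hitter's overall batting average
    pavg  -- pitcher's overall batting average against
    lgavg -- league batting average
    All three are assumed to lie strictly between 0 and 1.
    """
    hit_term = (bavg * pavg) / lgavg
    out_term = ((1 - bavg) * (1 - pavg)) / (1 - lgavg)
    return hit_term / (hit_term + out_term)


# Hypothetical inputs: a .300 hitter facing a pitcher who allows .250
# in a league that hits .265.
print(round(log5_expected_avg(0.300, 0.250, 0.265), 3))
```

One useful sanity check on the formula: when the pitcher is exactly league average (PAVG equals LgAVG), the expected average reduces to the hitter's own average, and the same holds for the pitcher's average against when the hitter is league average.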

I also mentioned that Dan Levitt had written an excellent article back in 1999 for SABR's By The Numbers newsletter that was reprinted at Baseball Think Factory.

In his article, Levitt looks at data from 1995 to see how well the formula predicts actual matchups. He breaks the pitchers and hitters down into three groups (Good, Average, Poor), compares the actual results with the formula's predictions, and finds that the formula does a remarkably good job. For example, in the NL, average hitters facing average pitchers would have been expected to hit .247, and they actually hit .251.

Since that article covered just one year I thought I'd run a different test of log5, given that I had gone to the trouble of computing over 30,000 matchups for the period 2003-2005 for the previous article.

For my study I broke the results of the matchups into six categories by batting average and then calculated, for each range, the hitters' overall average over the three-year period, the expected average using log5, and the actual average. What I found confirms Levitt's conclusion: the log5 method works remarkably well. The results in table form follow:
 
    Range   Count Hit Avg   ExAvg  Actual    Diff   PctDiff
.000-.199     975   0.208   0.170   0.167   0.004      2%
.200-.249    7379   0.256   0.233   0.234  -0.001      1%
.250-.274    8502   0.268   0.262   0.267  -0.005      2%
.275-.299    7909   0.281   0.286   0.289  -0.003      1%
.300-.324    4175   0.292   0.310   0.312  -0.002      1%
.325-.454    1541   0.308   0.340   0.335   0.004      1%
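
The grouping behind a table like this one can be sketched in a few lines of Python. This is not the code used for the study. It assumes each matchup is a tuple of (at-bats, hits, log5 expected average) with at least one at-bat, that matchups are binned by their expected average, and that averages within a range are weighted by at-bats. The record layout, the names, and the three sample matchups are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Lower bounds of the second through sixth ranges in the table.
EDGES = [0.200, 0.250, 0.275, 0.300, 0.325]
LABELS = [".000-.199", ".200-.249", ".250-.274",
          ".275-.299", ".300-.324", ".325 and up"]


def summarize(matchups):
    """Group (at_bats, hits, expected_avg) tuples into ranges.

    Returns, per range, the matchup count, the at-bat-weighted expected
    average, and the actual average (total hits / total at-bats).
    """
    totals = defaultdict(lambda: [0, 0, 0.0, 0])  # AB, H, expected H, count
    for at_bats, hits, ex_avg in matchups:
        label = LABELS[bisect_right(EDGES, ex_avg)]
        bucket = totals[label]
        bucket[0] += at_bats
        bucket[1] += hits
        bucket[2] += ex_avg * at_bats
        bucket[3] += 1
    return {
        label: {"count": n, "ex_avg": ex_hits / ab, "actual": h / ab}
        for label, (ab, h, ex_hits, n) in totals.items()
    }


# Made-up matchups, for illustration only.
sample = [(10, 2, 0.180), (20, 6, 0.260), (10, 3, 0.270)]
for label, stats in sorted(summarize(sample).items()):
    print(label, stats)
```

Weighting by at-bats is a design choice: it keeps a handful of small-sample matchups from dominating a range, but a simple unweighted mean would be an equally defensible reading of the description above.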

In each category the difference amounted to no more than five points of batting average, or at most about 2%. The results can be shown graphically as well.
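
As a rough check, the differences can be recomputed from the ExAvg and Actual columns of the table. The sketch below uses the rounded three-decimal values as printed, so its output can differ from the published Diff and PctDiff columns by a point or so of rounding. The variable and function names are mine.

```python
# (range, ExAvg, Actual) copied from the table above.
rows = [
    (".000-.199", 0.170, 0.167),
    (".200-.249", 0.233, 0.234),
    (".250-.274", 0.262, 0.267),
    (".275-.299", 0.286, 0.289),
    (".300-.324", 0.310, 0.312),
    (".325-.454", 0.340, 0.335),
]


def differences(rows):
    """Return (range, ExAvg - Actual, |difference| / Actual) for each row."""
    results = []
    for label, ex_avg, actual in rows:
        diff = ex_avg - actual
        results.append((label, diff, abs(diff) / actual))
    return results


for label, diff, pct in differences(rows):
    print(f"{label}  {diff:+.3f}  {pct:.1%}")
```

A positive difference means log5 predicted a higher average than the hitters actually posted in that range, and a negative one means it predicted a lower average.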

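[Graph: expected (log5) and actual averages by range, plotted alongside the hitters' overall average for 2003-2005]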

From the graph you can see that when plotted alongside the hitters' overall average over the three years, the actual and expected lines match up very well. Where they differ, you'll notice that the log5 method overpredicts hitter performance a bit at the extremes and underpredicts it in the middle.

Dan is the author of the blog Dan Agonistes and welcomes your comments and suggestions via email.
