Ted Williams and perception
by Josh Weinstock, August 03, 2011
A first-ballot Hall of Fame player, Ted Williams was a generational hitter. Should you have the pleasure of looking at his player page on FanGraphs, you will notice an extensive list of absurd numbers. Consider his 1941 season, which is often noted as the last time a player hit .400 over a full season. He amassed a .406/.553/.735 slash line, good for a .565 wOBA and a 221 wRC+. He walked 147 times and struck out just 27 times. Combining his incredible offensive contributions with defense and position, he totaled 11.9 WAR in that season alone.
And that wasn't even his best year.
According to wRC+, his 1957 season was better, and according to WAR, his 1942 and 1946 seasons were better. Had he not served in World War II, he might have increased his career WAR total by as much as 33(!) wins, assuming an 11-win true talent level (probably generous). He finished with only 139.8 WAR, and if we add in his estimated missed value, that figure increases to 172.8, which would be second all time. He was a pure hitter in every sense of the word, and was both a student and teacher of the game. An expert on hitting if there ever was one, he wrote a book on the subject, The Science of Hitting.
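The missed-value arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope check: the three missed seasons (1943 through 1945) are inferred from the 33-win figure at 11 wins per season, and the function name is made up for illustration.

```python
# Rough check of the "what if he hadn't missed the war years" estimate.
CAREER_WAR = 139.8       # career total quoted in the article
MISSED_SEASONS = 3       # assumption: 1943-45, inferred from 33 wins / 11 per season
TRUE_TALENT_WAR = 11.0   # the article's (probably generous) per-season estimate


def adjusted_career_war(career_war, missed_seasons, war_per_season):
    """Add back a flat per-season estimate for each missed season."""
    return career_war + missed_seasons * war_per_season


missed_value = MISSED_SEASONS * TRUE_TALENT_WAR
adjusted_total = adjusted_career_war(CAREER_WAR, MISSED_SEASONS, TRUE_TALENT_WAR)

print(f"Estimated missed value: {missed_value:.1f} WAR")
print(f"Adjusted career total:  {adjusted_total:.1f} WAR")
```

Lowering `TRUE_TALENT_WAR` to something less generous shows how sensitive the adjusted total is to that single assumption.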
In his book he has an interesting graphic of his estimated batting average in each area of the strike zone, which you can see below:

*From page 39 of The Science of Hitting.
Ted Williams, maker of the first heat map. As much of an expert as he was, this is not a realistic graphic, as you will see later. The numbers are also pretty hard to read, so I have recreated the graphic, which you can see below. I have drawn it from the catcher's perspective, as that is the perspective used today in PITCHf/x analysis. Black indicates areas where Williams hit the best, and white indicates areas where he hit the worst.

As you can see, he was very confident about his abilities on pitches in the middle of the zone and on pitches up in the zone, and was very bearish on his ability to hit pitches low and away. The differential is extreme. In his best zones he hits .400, and in his worst zone he hits just .230, for a difference of .170. Here is the graphic for the average lefty in 2011:

Note that this is for batting average, not batting average on balls in play. The numbers are high because strikeouts are not included.
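To make the distinction concrete, here is a minimal sketch of three related rates. It assumes that "strikeouts are not included" means strikeouts are removed from the denominator; the article does not spell out the exact calculation, and the stat line below is hypothetical.

```python
def batting_average(hits, at_bats):
    """Ordinary batting average: H / AB."""
    return hits / at_bats


def average_on_contact(hits, at_bats, strikeouts):
    """Batting average with strikeouts removed: H / (AB - K). Home runs still count."""
    return hits / (at_bats - strikeouts)


def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Standard BABIP: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


# Hypothetical season line, for illustration only.
AB, H, HR, K, SF = 500, 140, 20, 100, 5

print(f"BA:            {batting_average(H, AB):.3f}")        # 0.280
print(f"BA on contact: {average_on_contact(H, AB, K):.3f}")  # 0.350
print(f"BABIP:         {babip(H, HR, AB, K, SF):.3f}")       # 0.312
```

Dropping strikeouts shrinks the denominator without touching the hits, which is why the per-zone averages run well above ordinary batting averages.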
Just as in Williams' graphic, batters do perform best on pitches that are in the middle of the plate. However, the differential between the best area (.340) and the worst area (.250) is .090, much less than Williams' estimation for his own strike zone. Of course it's possible that Williams' graphic is 100 percent accurate; there are no data to refute his claims. However, given how lefties perform today, it seems that his perception of his performance is not accurate. He is likely overestimating his ability in the middle of the zone and underestimating his ability at the edges of the zone, especially down and away.
To demonstrate this point, here is a graphic that shows the difference between Williams' estimated batting average and the average 2011 lefty's batting average.

Blue indicates locations where Williams performed better than average, red indicates areas where he performed worse, and white indicates areas where he performed the same as the average 2011 lefty.
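The comparison behind a difference map like this can be sketched as follows. The 3x3 grids are placeholders: only the extremes (.400 and .230 for Williams, .340 and .250 for the average lefty) come from the text above, every other cell is a made-up value, and the cell layout is illustrative. Averages are stored as integer "points" (thousandths) to keep the subtraction exact.

```python
# Hypothetical 3x3 zone grids, in batting-average points (400 == .400).
# Only the max/min of each grid are taken from the article.
williams = [
    [380, 400, 380],
    [330, 400, 340],
    [230, 300, 270],
]
average_lefty = [
    [280, 300, 290],
    [300, 340, 310],
    [250, 290, 270],
]


def spread(grid):
    """Gap between the best and worst cell of a grid."""
    cells = [value for row in grid for value in row]
    return max(cells) - min(cells)


def difference(grid_a, grid_b):
    """Cell-by-cell difference grid_a - grid_b."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(grid_a, grid_b)]


def color(diff):
    """Blue where the first grid is better, red where worse, white where equal."""
    if diff > 0:
        return "blue"
    if diff < 0:
        return "red"
    return "white"


diff_grid = difference(williams, average_lefty)
color_grid = [[color(d) for d in row] for row in diff_grid]

print("Williams spread:", spread(williams))            # 170 points
print("Average lefty spread:", spread(average_lefty))  # 90 points
for row in color_grid:
    print(row)
```

With real per-zone data in place of the placeholders, the same three functions would reproduce both the spread comparison and the color coding.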
I have made no adjustments to account for differing run environments. But that's kind of the point. No matter the run environment, Ted Williams should be above average in all areas of the strike zone. The fact that this is not the case is a failure of our perception of baseball performance. Announcers love to harp on about how pitchers "can't leave the ball up" or "can't throw that down the middle of the plate." While there is some truth to these statements, in reality performance is much closer to the sobering doctrine of DIPS. The effect of pitch location on batting average, and especially batting average on balls in play, is really quite small.
It may be hard to grasp the randomness in baseball, but it is present in all areas of the sport, and typically in quantities that we underestimate; Williams' graphic just serves to underscore this point. Despite this ever-apparent truth, we often strive to find trends, something easy to hold onto that explains away the mystery that randomness brings. I know that I am guilty of this trend-searching when there often is none to be found.
References and Resources
*The Science of Hitting
*PITCHf/x data from MLBAM via Darrel Zimmerman's pbp2 database and scripts by Joseph Adler/Mike Fast/Darrel Zimmerman.
*FanGraphs
You can read more of Josh's work at FanGraphs, Beyond the Box Score, and itsaboutthemoney.net. Josh welcomes discussion through email and Twitter. You can reach him at josh82093 at gmail dot com and on Twitter @J__Stock (two underscores).







 
Good stuff.
The reason that Williams is showing a wider difference than the “average” is that the average well… averages out the differences. If you have one guy who prefers high, and another guy who prefers low, then the average would cancel those out!
What you should do is look at individual hitters, and see what their ranges are. And report those.
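The averaging effect described in this comment can be illustrated with a toy example. The two hitters and their zone averages are entirely hypothetical, chosen so that one prefers the ball up and the other prefers it down; values are in batting-average points.

```python
# Two made-up hitters with opposite preferences (points: 380 == .380).
high_ball_hitter = {"high": 380, "middle": 330, "low": 250}
low_ball_hitter = {"high": 250, "middle": 330, "low": 380}


def spread(zones):
    """Gap between a hitter's best and worst zone."""
    return max(zones.values()) - min(zones.values())


def average_profile(*hitters):
    """Zone-by-zone mean across hitters (unweighted, for simplicity)."""
    return {zone: sum(h[zone] for h in hitters) / len(hitters)
            for zone in hitters[0]}


league = average_profile(high_ball_hitter, low_ball_hitter)

print("High-ball hitter spread:", spread(high_ball_hitter))  # 130
print("Low-ball hitter spread: ", spread(low_ball_hitter))   # 130
print("Averaged profile spread:", spread(league))            # 15.0
```

Each hitter has a 130-point gap between his best and worst zone, yet the averaged profile shows only 15, so a league-wide map can look much flatter than any individual's.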