Monday, June 07, 2010
BABIP by hit location
Posted by Ricky Zanker
This is a very simple question: Where is the best location on the field to get a base hit?
It is very difficult for a hitter to put the ball in play exactly where he wants it, but he still has a major influence on where the ball goes. He will either pull it, go up the middle, or go to the opposite field ("going oppo," as the cool kids say). The Gameday data provides the hit location of a batted ball (actually the location where the fielder fielded it), and using Peter Jensen's translations, I can find the angle relative to home plate. And since I want the likelihood of a base hit on a ball in play (home runs are always base hits), BABIP fits in nicely.
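As a rough illustration of the angle calculation, here is a minimal Python sketch. It is not Peter Jensen's translation, which adjusts the coordinates park by park. The home plate position `HOME_X, HOME_Y`, the assumption that the Gameday y coordinate grows toward home plate, and the function name `hit_angle` are all placeholders chosen for the example. The angle is measured so that 0 degrees points to dead center, negative angles point toward left field, and positive angles point toward right field.

```python
import math

# Hypothetical home plate position in Gameday pixel coordinates.
# This is an assumption for illustration, not a park-specific translation.
HOME_X, HOME_Y = 125.0, 199.0


def hit_angle(x, y, home_x=HOME_X, home_y=HOME_Y):
    """Return the batted ball angle in degrees relative to home plate.

    0 is straight up the middle, negative is toward left field,
    positive is toward right field.
    """
    dx = x - home_x   # positive toward the right field side
    dy = home_y - y   # assumed: pixel y shrinks as the ball travels away from home
    return math.degrees(math.atan2(dx, dy))
```

With these placeholder values, a ball fielded at `(125, 99)` comes out at 0 degrees, and one fielded at `(225, 99)` comes out at 45 degrees, on the right field line.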
Instead of calling them pull, middle, and opposite, I will stick with left field, center field, and right field, even though I include balls fielded in the infield. So here is a table with the BABIP to the three fields since 2008, split by batter handedness. I also took out what Gameday labels as "pop outs" since they are almost always outs.
So a ball fielded in the direction of center field is the most likely to be a base hit. Interestingly, a left-handed hitter pulling the ball to right field does considerably worse than he does going to center or the opposite field. This might make sense, since most teams now shift towards right field against a notable left-handed pull hitter, so some base hits may be taken away there. However, the same does not hold for right-handed hitters, who hit well when pulling the ball to left field.
But something is not right. Baseball Reference also has hit location splits for the entire Majors by season. Looking at the past two seasons' data, it seems that a batted ball hit up the middle is the least likely to be a base hit for batters of either handedness. This is the exact opposite of what I found earlier using the Gameday data.
How can this be? I divided the field evenly into three, with 30 degrees to each. It is likely that Baseball Reference's data, which comes from Retrosheet, is divided up differently. The Retrosheet website has a visual chart showing the codes that Retrosheet stringers use to enter hit location data. There are about seven different zones relative to home plate, which means that up the middle can be either the lone center zone or the center zone plus the two adjoining ones. Either way, the three fields are not the same size.
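To make the even split concrete, here is a small Python sketch of the bucketing and the per-field BABIP. It assumes angles run from -45 degrees (left field line) to +45 degrees (right field line) with 0 at dead center, so the 30-degree cutoffs land at -15 and +15. The function names and the sample data in the usage note are made up for illustration and are not real results.

```python
def field_for_angle(angle):
    """Map a batted ball angle in degrees to one of three 30-degree fields."""
    if angle < -15:
        return "left"
    if angle > 15:
        return "right"
    return "center"


def babip_by_field(balls_in_play):
    """Compute hits divided by balls in play for each field.

    balls_in_play is an iterable of (angle, is_hit) pairs. Home runs and
    pop outs are assumed to have been filtered out already.
    """
    hits = {"left": 0, "center": 0, "right": 0}
    total = {"left": 0, "center": 0, "right": 0}
    for angle, is_hit in balls_in_play:
        field = field_for_angle(angle)
        total[field] += 1
        if is_hit:
            hits[field] += 1
    # Fields with no balls in play get None rather than a misleading 0.
    return {f: (hits[f] / total[f] if total[f] else None) for f in total}
```

For example, `babip_by_field([(-30, True), (-20, False), (0, True), (40, False)])` gives .500 to left, 1.000 to center, and .000 to right. Widening or narrowing the center bucket, as a zone-based scheme like Retrosheet's might, only requires changing the two cutoffs.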
It is possible that there is an error on my part, since Peter Jensen's translations only cover 2005-2008. But that shouldn't mean my angles are 10 degrees off of what they should be. And my data comes out almost exactly the same for 2008 and 2009, both for BABIP and for the number of balls hit to each field, so there shouldn't have been any drastic changes in the factors. I also took out pop outs, but that just gives me a slightly higher BABIP. This will have to be investigated further in the future.
Sticking with this season, I can plot a local regression of BABIP against batted ball angle for the top and bottom BABIP hitters in the Majors. First up, the right-handed hitters.
Don't pay too much attention to the MLB average line. Austin Jackson does get most of his hits to center and right, which FanGraphs' BIS data also shows. The Retrosheet data, however, doesn't agree. But Jackson is expected to see some serious regression soon. Aaron Hill does have a higher BABIP down the right field line, but that is because he has three base hits out of ten balls in play at 20 degrees and greater.
Now for the left-handers.
Carlos Pena should not be hurting that much from the shift. And Justin Morneau follows the average line, except that he is almost 20 percentage points higher to right field.
All four players have unique curves in the regression. Jackson does well to center and right field. Expect all four hitters to regress, with Jackson regressing the most, although I wouldn't be surprised if his BABIP stays close to .400 since he has a history of high BABIPs in the minors.
This is only a rough idea, as I ignored many aspects, including batted ball type and the frequency of each batted ball angle, among others. I also have to do a little more research into how Baseball Reference splits their hit location data and whether or not I made any errors in my method of producing batted ball angles.
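For readers who want to try the local regression of BABIP on batted ball angle themselves, here is a simplified Python sketch. It is a stand-in rather than the exact procedure behind the plots above: it fits a tricube-weighted straight line around each angle using a fixed `bandwidth` in degrees, whereas a full loess routine typically picks its window from a fraction of the nearest points. The function names and the default bandwidth are assumptions made for the example.

```python
def tricube(u):
    """Tricube kernel weight: 1 at u = 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_babip(angles, outcomes, x0, bandwidth=15.0):
    """Estimate BABIP at angle x0 with a weighted local linear fit.

    angles are batted ball angles in degrees, outcomes are 1 for a hit
    and 0 for an out. Returns None if no ball in play is within the window.
    """
    sw = swd = swy = swdd = swdy = 0.0
    for x, y in zip(angles, outcomes):
        d = x - x0                    # center on x0 so the intercept is the estimate
        w = tricube(d / bandwidth)
        sw += w
        swd += w * d
        swy += w * y
        swdd += w * d * d
        swdy += w * d * y
    if sw == 0:
        return None
    denom = sw * swdd - swd * swd
    if abs(denom) < 1e-12:
        return swy / sw               # fall back to a weighted mean
    slope = (sw * swdy - swd * swy) / denom
    return (swy - slope * swd) / sw   # fitted value at x0
```

Evaluating `local_babip` over a grid of angles from -45 to 45 traces out one hitter's curve. With only a handful of balls in play at the extreme angles, as in the Aaron Hill example, the ends of the curve will be very noisy.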