November 21, 2009
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.
Park effects and batted ball types
by Harry Pavlidis
September 01, 2009

One of the best things about Gameday is the ability to download the data and do your own research and analysis. The possibilities are almost limitless, and Hardball Times readers will be familiar with the many uses, including pitching, hitting and fielding analysis.

One of my favorite uses is to create a fielding independent measure of pitch quality. By now, you've probably seen Graham MacAree's tRA at Fangraphs or StatCorner. It's based on batted ball types, from ground ball to fly ball. It's a great tool to use, among many other quality measures of pitching, including xFIP.

I like to use something similar to tRA, which I've been calling rv100E. That refers to "expected" run value per 100 pitches thrown, as in runs allowed, or saved, above average. Linear weights are used, based on hit type (single to home run) or out, as well as balls and strikes. The run expectancies are count adjusted, allowing for measurement of a particular pitch, or even a particular location, or both.

Wait a second. I said rv100E is based on batted ball type, but the linear weights are based on hit type. Hit types are probabilistic, and are distributed based on batted ball type. A table will explain it better.

Each batted ball type has a particular range of outcomes, each with its own frequency. Zero, one or more outs can occur, too, even when a batter reaches safely. Those also are counted. Here is the 2009 mapping of batted ball types to event types, along with the linear weight (LW) associated with each hit type. Values are for 2009 only.
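To make the idea concrete, here is a minimal sketch of how a batted ball type could be turned into an expected run value: weight each possible outcome's linear weight by how often that outcome follows the batted ball type. Every number below is a made-up placeholder, not one of the 2009 values, and the names `LINEAR_WEIGHTS`, `OUTCOME_PROBS` and `expected_run_value` are hypothetical. Balls, strikes and the count adjustment are left out.

```python
# Placeholder linear weights per event, in runs (positive = good for the batter).
# These are illustrative values only, NOT the article's 2009 numbers.
LINEAR_WEIGHTS = {
    "out": -0.28,
    "single": 0.47,
    "double": 0.77,
    "triple": 1.05,
    "home_run": 1.40,
}

# Placeholder outcome frequencies for each batted ball type.
# Each row sums to 1.0; again, illustrative values only.
OUTCOME_PROBS = {
    "ground_ball": {"out": 0.76, "single": 0.22, "double": 0.02, "triple": 0.00, "home_run": 0.00},
    "line_drive": {"out": 0.28, "single": 0.50, "double": 0.18, "triple": 0.02, "home_run": 0.02},
    "fly_ball": {"out": 0.78, "single": 0.04, "double": 0.06, "triple": 0.01, "home_run": 0.11},
}


def expected_run_value(batted_ball_type):
    """Probability-weighted average of the linear weights for one batted ball type."""
    probs = OUTCOME_PROBS[batted_ball_type]
    return sum(p * LINEAR_WEIGHTS[outcome] for outcome, p in probs.items())


if __name__ == "__main__":
    for bb_type in OUTCOME_PROBS:
        print(f"{bb_type}: {expected_run_value(bb_type):+.3f} runs")
```

With these placeholder inputs a line drive comes out well above a fly ball, which in turn comes out above a ground ball. The ordering depends entirely on the frequencies fed in, which is why the classification of each batted ball matters so much.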
This is a nice start, but the batted ball types in the Gameday data are entered by the stringers in the press box (free-lancers hired for this purpose). And not all of them classify hits the same way.

Dealing with park effects in layers

Consider this problem. A line drive in Petco Park may be worth more than one in Wrigley Field. Or, more likely, it may have a different range of outcomes. I can envision more line drive home runs in Wrigley, but fewer triples. At the same time, I'd expect a different range of outcomes with fly balls. More outs and doubles in Petco, more home runs in Wrigley. In that event, I need to apply different weights based on the park.

Or do I? Do I really care if a pitcher gave up a line drive in Citi Field rather than Safeco? Either that pitch was hit hard, or it wasn't, and I want a park neutral value assigned. So, super duper, I don't need to worry about park effects. I treat all fly balls as equals, assign the league average home run rate (indirectly shown above) and move on.

Or do I? Don't forget about the stringers. There's a "park effect" I care about. Let me show you what I mean. Here are the ratios for fly balls to line drives, by park, in 2009.
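For anyone who wants to reproduce this kind of ratio from their own Gameday download, a minimal sketch follows. It assumes the batted balls have already been reduced to `(park, batted_ball_type)` pairs; that layout, the type labels and the function name `fly_to_liner_ratio` are assumptions for illustration, not the Gameday format itself.

```python
from collections import Counter


def fly_to_liner_ratio(batted_balls):
    """Return {park: fly balls / line drives} from (park, batted_ball_type) pairs.

    Parks with no recorded line drives map to None instead of dividing by zero.
    """
    counts = Counter(batted_balls)
    parks = {park for park, _ in counts}
    ratios = {}
    for park in parks:
        flies = counts[(park, "fly_ball")]
        liners = counts[(park, "line_drive")]
        ratios[park] = flies / liners if liners else None
    return ratios


if __name__ == "__main__":
    # Tiny made-up sample; real input would be a full season of batted balls.
    sample = (
        [("Park A", "fly_ball")] * 6
        + [("Park A", "line_drive")] * 3
        + [("Park B", "fly_ball")] * 2
        + [("Park B", "line_drive")] * 4
    )
    for park, ratio in sorted(fly_to_liner_ratio(sample).items()):
        print(park, ratio)
```

A raw ratio like this mixes together the hitters, the pitchers, the park and the stringer, so on its own it cannot say which of them is responsible.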
So, do the Mariners hit a lot of line drives, or do the stringers like to tag hits as line drives? Or should we blame their pitchers? One way to tease out the team itself from the stringers is to apply the park correction methodology and find the "park effect" on batted ball classification.

Stringer effect

In reality, it isn't just line drives and flies we have to worry about. On a base hit, when does a liner become a grounder? How likely is a home run to be a line drive or a fly ball? This table keeps line drives and fly balls separate from their home run counterparts, while the above did not.
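One common way to build a park factor, sketched here under stated assumptions, is to compare how often a batted ball type is recorded in a team's home games (both teams batting) with how often it is recorded in that same team's road games. Because the same hitters and pitchers appear in both samples, a ratio away from one points at the park or its stringer. This is a generic home/road form and may differ from the exact correction used for the table; the function name and the counts are hypothetical.

```python
def classification_factor(home_counts, road_counts, batted_ball_type):
    """Share of batted balls given this label at home, divided by the share on the road.

    home_counts / road_counts: {batted_ball_type: count} covering both teams
    in the club's home games and road games respectively.
    """
    home_rate = home_counts[batted_ball_type] / sum(home_counts.values())
    road_rate = road_counts[batted_ball_type] / sum(road_counts.values())
    return home_rate / road_rate


if __name__ == "__main__":
    # Made-up counts for a single club, for illustration only.
    home = {"ground_ball": 450, "line_drive": 250, "fly_ball": 300}
    road = {"ground_ball": 450, "line_drive": 200, "fly_ball": 350}
    for bb_type in home:
        print(bb_type, round(classification_factor(home, road, bb_type), 3))
```

In this made-up example the line drive factor is above one and the fly ball factor is below one, the pattern you would expect if the home stringer tags borderline flies as liners.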
Dizzy yet? I am. If a number above is less than one, it appears the stringer is less likely than average to classify a batted ball as such. A caveat applies to the home runs, where there's a real park effect mixed in.

Next steps

Now that I've crunched some numbers, there are a few things left to do. First, expose this to public scrutiny to flush out issues with my methodology. At the same time, crowdsource the application of this information. Based on a given park, and a stringer's classification, how would you distribute the linear weights for hits and outs? In other words, how should rv100E work?

References and Resources

Gameday data from MLBAM
Linear weights calculated using Tom Tango's tool
All math errors and other brain cramps by the author, but he'll blame the editor

Harry Pavlidis admits he has a baseball problem. He also writes for Beyond the Boxscore, Out of the Ivy and his own blog, Cubs f/x. Feedback, questions and comments are appreciated - harrypav@gmail.com
Dave said...
Here is an easy way to find out…watch the game live. Of course none of this could ever be settled until you standardize what a line drive is, what a fly ball is, what a pop-up is and what a grounder is. And is that possible? There is always going to be subjectivity in that. Posted 09/02 at 04:00 PM
What we’re really after here is whether the ball is being hit differently in these parks or if it’s simply being scored differently. I think this could be answered with Hit F/X pretty easily, although I don’t know if the data we have from the conference is large enough to draw any definite conclusions for out-of-sample data.