November 21, 2009

Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

Park effects and batted ball types

by Harry Pavlidis
September 01, 2009

One of the best things about Gameday is the ability to download the data and do your own research and analysis. The possibilities are almost limitless, and Hardball Times readers will be familiar with the many uses, including pitching, hitting and fielding analysis. One of my favorite uses is to create a fielding independent measure of pitch quality.

By now, you've probably seen Graham MacAree's tRA at Fangraphs or StatCorner. It's based on batted ball types, from ground ball to fly ball. It's a great tool to use, among many other quality measures of pitching, including xFIP.

I like to use something similar to tRA, which I've been calling rv100E. That refers to "expected" run value per 100 pitches thrown, as in runs allowed, or saved, above average. Linear weights are used, based on hit type (single to home run) or out, as well as balls and strikes. The run expectancies are count adjusted, allowing for measurement of a particular pitch, or even a particular location—or both.
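As a rough illustration of the "per 100 pitches" scaling, here is a minimal sketch. It assumes each pitch has already been assigned a count-adjusted run value; the function name `rv_per_100` and the sample values are hypothetical and are not taken from Gameday or from the actual rv100E calculation.

```python
def rv_per_100(pitch_run_values):
    """Scale the average per-pitch run value to a per-100-pitches rate."""
    if not pitch_run_values:
        raise ValueError("need at least one pitch")
    return 100.0 * sum(pitch_run_values) / len(pitch_run_values)


# Hypothetical per-pitch run values (illustrative only, not real data).
# Assumed sign convention: negative values favor the pitcher.
sample = [-0.04, 0.03, -0.04, 0.03, -0.289, 0.489, -0.04, -0.289]

print(round(rv_per_100(sample), 2))
```

Because the input is one value per pitch, the same function could be applied to any subset of pitches, such as a single pitch type or a single location.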

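Wait a second. I said rv100E is based on batted ball type, but the linear weights are based on hit type. Both are true: hit types are probabilistic, and their distribution depends on the batted ball type, so each batted ball is credited with the probability-weighted average of the hit-type weights.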

A table will explain it better. Each batted ball type has its own range of outcomes, each with its own frequency. Zero, one or more outs can also occur on a play, even when the batter reaches safely, and those are counted too. Here are the 2009 rates at which each batted ball type turns into each event type, along with the linear weight (LW) associated with each event. Values are for 2009 only.









              Home Run   Single   Double   Triple      Out
Line Drive        .022     .524     .174     .015     .224
Ground Ball       .000     .219     .018     .001     .693
Fly Ball          .119     .057     .082     .013     .597
Pop Up            .000     .013     .014     .000     .975
LW               1.468     .489     .768    1.052    -.289
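To make the arithmetic concrete, the sketch below multiplies each rate in the table by its linear weight and sums across events, giving an expected run value per batted ball type. The numbers are copied from the table as printed (the rows do not sum to exactly one); the dictionary layout and the function name `expected_run_value` are my own choices for illustration.

```python
# Linear weights (LW) per event, from the last row of the table.
LW = {"HR": 1.468, "1B": 0.489, "2B": 0.768, "3B": 1.052, "OUT": -0.289}

# 2009 event rates per batted ball type, as printed in the table.
OUTCOMES = {
    "Line Drive":  {"HR": 0.022, "1B": 0.524, "2B": 0.174, "3B": 0.015, "OUT": 0.224},
    "Ground Ball": {"HR": 0.000, "1B": 0.219, "2B": 0.018, "3B": 0.001, "OUT": 0.693},
    "Fly Ball":    {"HR": 0.119, "1B": 0.057, "2B": 0.082, "3B": 0.013, "OUT": 0.597},
    "Pop Up":      {"HR": 0.000, "1B": 0.013, "2B": 0.014, "3B": 0.000, "OUT": 0.975},
}


def expected_run_value(batted_ball_type):
    """Probability-weighted sum of linear weights for one batted ball type."""
    rates = OUTCOMES[batted_ball_type]
    return sum(rates[event] * LW[event] for event in LW)


for bb_type in OUTCOMES:
    print(f"{bb_type:<12}{expected_run_value(bb_type):+.3f}")
```

With these inputs a line drive comes out as the most damaging batted ball and a pop up as the least, with fly balls slightly positive and ground balls slightly negative.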



This is a nice start, but the data provided by Gameday, as far as batted ball type, are entered by the stringers in the press box—free-lancers hired for this purpose. And not all of them classify hits the same way.

Dealing with park effects in layers


Consider this problem. A line drive in Petco Park may be worth more than one in Wrigley Field. Or, more likely, it may have a different range of outcomes. I can envision more line drive home runs in Wrigley, but fewer triples. At the same time, I'd expect a different range of outcomes with fly balls: more outs and doubles in Petco, more home runs in Wrigley. In that event, I need to apply different weights based on the park.

Or do I? Do I really care if a pitcher gave up a line drive in Citi Field rather than Safeco? Either that pitch was hit hard, or it wasn't, and I want a park neutral value assigned. So, super duper, I don't need to worry about park effects. I treat all fly balls as equals, assign the league average home run rate (indirectly shown above) and move on.

Or do I? Don't forget about the stringers. There's a "park effect" I care about. Let me show you what I mean. Here are the ratios for fly balls to line drives, by park, in 2009.


































home   FB:LD
ana     2.37
ari     1.78
atl     2.03
bal     1.62
bos     2.08
cha     1.81
chn     1.90
cin     1.31
cle     1.47
col     1.25
det     1.20
flo     1.28
hou     1.99
kca     1.65
lan     1.63
mil     1.70
min     2.10
nya     1.31
nyn     1.29
oak     1.36
phi     1.52
pit     1.68
sdn     1.52
sea     1.25
sfn     1.50
sln     1.14
tba     1.16
tex     1.28
tor     1.51
was     1.24
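To see how wide the spread is, a short sketch can pull the extremes out of the table. The ratios are copied from the table; the variable names are only illustrative.

```python
# 2009 fly ball to line drive ratios by home park, from the table.
FB_LD = {
    "ana": 2.37, "ari": 1.78, "atl": 2.03, "bal": 1.62, "bos": 2.08,
    "cha": 1.81, "chn": 1.90, "cin": 1.31, "cle": 1.47, "col": 1.25,
    "det": 1.20, "flo": 1.28, "hou": 1.99, "kca": 1.65, "lan": 1.63,
    "mil": 1.70, "min": 2.10, "nya": 1.31, "nyn": 1.29, "oak": 1.36,
    "phi": 1.52, "pit": 1.68, "sdn": 1.52, "sea": 1.25, "sfn": 1.50,
    "sln": 1.14, "tba": 1.16, "tex": 1.28, "tor": 1.51, "was": 1.24,
}

highest = max(FB_LD, key=FB_LD.get)  # park with the most flies per liner
lowest = min(FB_LD, key=FB_LD.get)   # park with the fewest flies per liner
spread = FB_LD[highest] / FB_LD[lowest]

print(highest, FB_LD[highest])
print(lowest, FB_LD[lowest])
print(round(spread, 2))
```

The highest park records roughly twice as many fly balls per line drive as the lowest, which is a large gap to explain with hitters and pitchers alone.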



So, do the Mariners hit a lot of line drives, or do the stringers like to tag hits as line drives? Or should we blame their pitchers?

One way to tease out the team itself from the stringers is to apply the park correction methodology and find the "park effect" on batted ball classification.
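The exact calculation is not spelled out here, so the following is only a sketch of one common form of park factor, applied to classification rates: compare how often a batted ball type is recorded in a team's home games (both teams batting) with how often it is recorded in that same team's road games. The function name, the home/road framing and the counts are all assumptions for illustration, not real data.

```python
def classification_park_factor(home_count, home_total, road_count, road_total):
    """Ratio of a batted ball type's recorded rate at home to its rate on the road.

    Counts are assumed to include both teams in each game, so the same
    hitters and pitchers appear on both sides of the comparison.
    """
    home_rate = home_count / home_total
    road_rate = road_count / road_total
    return home_rate / road_rate


# Hypothetical counts: line drives recorded out of all batted balls.
factor = classification_park_factor(
    home_count=450, home_total=2200,
    road_count=400, road_total=2200,
)
print(round(factor, 3))
```

A factor above one would suggest the home park's stringer tags that type more often than stringers elsewhere do for the same players, though real differences in how the ball is hit in that park are mixed into the same number.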

Stringer effect


In reality, it isn't just line drives and flies we have to worry about. On a base hit, when does a liner become a grounder? How likely is a home run to be a line drive or a fly ball?

This table keeps line drives and fly balls separate from their home run counterparts, while the above did not.


































park     GB     LD   LDHR   OFFB   FBHR   IFFB
ana    0.98   0.75   0.23   1.21   1.20   1.02
ari    1.01   0.88   0.42   1.09   1.12   0.93
atl    0.98   0.83   0.25   1.19   0.87   1.00
bal    1.00   1.00   0.47   1.03   1.23   0.87
bos    0.97   0.86   0.00   1.12   1.07   1.10
cha    0.96   0.91   0.56   1.09   1.34   1.07
chn    0.99   0.92   0.43   1.06   1.22   0.99
cin    0.98   1.16   1.00   0.93   1.14   0.93
cle    1.07   0.97   0.52   0.97   0.69   0.99
col    1.01   1.18   0.96   0.92   0.93   0.80
det    0.98   1.12   4.02   0.92   0.74   1.10
flo    1.01   1.14   2.21   0.90   0.89   0.99
hou    1.01   0.85   0.42   1.07   1.10   1.05
kca    1.04   0.97   0.22   1.06   0.84   0.76
lan    1.04   0.86   0.66   1.06   0.87   1.03
mil    0.99   0.92   1.52   1.06   1.00   1.04
min    1.00   0.81   0.30   1.12   1.13   1.08
nya    1.04   1.05   2.06   0.87   1.23   0.96
nyn    0.97   1.00   0.99   0.98   1.23   1.18
oak    1.00   1.10   1.24   0.92   0.98   1.03
phi    1.02   1.06   0.78   0.93   1.17   0.94
pit    1.01   0.95   0.48   0.99   1.15   1.04
sdn    1.00   1.05   0.47   1.00   0.79   1.00
sea    0.99   1.20   1.68   0.89   0.84   1.03
sfn    0.99   0.99   1.13   1.00   0.92   1.14
sln    1.08   1.13   1.76   0.84   0.66   0.87
tba    0.95   1.19   2.44   0.88   0.78   1.26
tex    0.96   1.26   1.45   0.95   1.02   0.78
tor    1.02   0.93   1.24   0.98   1.19   1.03
was    0.96   1.08   1.11   1.00   0.82   1.13


Dizzy yet? I am. If a number above is less than one, it appears the stringer in that park is less likely than average to classify a batted ball as that type; if it is greater than one, more likely. The caveat is the home run columns, where a real park effect is mixed in with the stringer effect.
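One way to read a column of the table is to flag the parks that sit well away from one. The sketch below does that for the LD column only, using the values from the table; the 15 percent cutoffs are an arbitrary choice for illustration, not a tested threshold.

```python
# LD column of the stringer table: classification factor for line drives.
LD_FACTOR = {
    "ana": 0.75, "ari": 0.88, "atl": 0.83, "bal": 1.00, "bos": 0.86,
    "cha": 0.91, "chn": 0.92, "cin": 1.16, "cle": 0.97, "col": 1.18,
    "det": 1.12, "flo": 1.14, "hou": 0.85, "kca": 0.97, "lan": 0.86,
    "mil": 0.92, "min": 0.81, "nya": 1.05, "nyn": 1.00, "oak": 1.10,
    "phi": 1.06, "pit": 0.95, "sdn": 1.05, "sea": 1.20, "sfn": 0.99,
    "sln": 1.13, "tba": 1.19, "tex": 1.26, "tor": 0.93, "was": 1.08,
}

# Arbitrary illustrative cutoffs: more than 15 percent from average.
HIGH, LOW = 1.15, 0.85

liner_happy = [park for park, f in LD_FACTOR.items() if f > HIGH]
liner_shy = [park for park, f in LD_FACTOR.items() if f < LOW]

print("more line drives than average:", liner_happy)
print("fewer line drives than average:", liner_shy)
```

The same filter could be run on any other column, keeping in mind that the home run columns carry a real park effect on top of any stringer tendency.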

Next steps


Now that I've crunched some numbers, there are a few things left to do. First, expose this to public scrutiny to flush out issues with my methodology. At the same time, crowdsource the application of this information. Based on a given park, and a stringer's classification, how would you distribute the linear weights for hits and outs? In other words, how should rv100E work?

References and Resources
Gameday data from MLBAM
Linear weights calculated using Tom Tango's tool
All math errors and other brain cramps by the author, but he'll blame the editor

Harry Pavlidis admits he has a baseball problem. He also writes for Beyond the Boxscore, Out of the Ivy and his own blog, Cubs f/x. Feedback, questions and comments are appreciated - harrypav@gmail.com


Colin Wyers said...

What we’re really after here is whether or not the ball is being hit differently in these parks or if it’s simply being scored differently. I think this could be answered with Hit F/X pretty easily, although I don’t know if the data we have from the conference is large enough to make any definite conclusions for out-of-sample data.

Posted 09/01  at  02:31 PM
Dave said...

Here is an easy way to find out…watch the game live.

Of course none of this could ever be settled until you standardize what a line drive is, what a fly ball is, what a pop-up is and what a grounder is.  And is that possible?  There is always going to be subjectivity in that.

Posted 09/02  at  04:00 PM