February 10, 2012

Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

Part of the USA Today Sports Media Group

Batted balls and cheese

by Harry Pavlidis
September 15, 2009

Stringers are in every major league park, and at most levels of minor league ball, too. They manually record various aspects of the game as it progresses. If you're watching an MLB.com Gameday feed, you're seeing a combination of PITCHf/x data (speed, location, pitch type, etc.) and stringer observation (where the ball went, how it got there, who fielded it, etc.). There's a level of detail that's not often discussed, or even present, in the Gameday information that could help in evaluating pitches and the pitchers who throw them.

Two weeks ago, I looked at the variation, or bias, shown by stringers when classifying batted balls. I take interest in such things since I'd like to calculate pitch-by-pitch run values based on the type of batted ball, not the outcome. In other words, I'm interested in line drives and fly balls, not outs and hits directly. While Gameday consistently provides classifications for line drives, fly balls, pop-ups and grounders, it often provides a nice little descriptor as well: soft or sharp. If you look closely, you'll find that it is possible, according to some stringers, to bunt the ball sharply, or even hit a sharp pop-up.

Refresher on batted balls


Since clicking the link to an old article (two full weeks!) is taxing, here's a breakdown of batted ball types, with the likelihood and value of the various outcomes. (Outcomes being hits and outs, values being Linear Weights.)


             Home Run  Single  Double  Triple  Out
Line Drive   .022      .524    .174    .015    .224
Ground Ball  .000      .219    .018    .001    .693
Fly Ball     .119      .057    .082    .013    .597
Pop Up       .000      .013    .014    .000    .975
LW           1.468     .489    .768    1.052   -.289
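To make the table concrete, here is a minimal sketch of how the outcome probabilities and Linear Weights could be combined into a single expected run value per batted ball type. The numbers are copied from the table above; the dictionary and function names are illustrative assumptions, not anything from Gameday. Note that the probability rows do not sum to 1 (errors and other plays are left out), so treat the results as rough.

```python
# Linear Weights (runs) per outcome, copied from the LW row of the table.
LINEAR_WEIGHTS = {
    "home_run": 1.468, "single": 0.489, "double": 0.768,
    "triple": 1.052, "out": -0.289,
}

# Outcome probabilities per batted ball type, copied from the table.
# Rows do not sum to 1 because some plays (e.g., errors) are not listed.
OUTCOME_PROBS = {
    "line_drive":  {"home_run": 0.022, "single": 0.524, "double": 0.174,
                    "triple": 0.015, "out": 0.224},
    "ground_ball": {"home_run": 0.000, "single": 0.219, "double": 0.018,
                    "triple": 0.001, "out": 0.693},
    "fly_ball":    {"home_run": 0.119, "single": 0.057, "double": 0.082,
                    "triple": 0.013, "out": 0.597},
    "pop_up":      {"home_run": 0.000, "single": 0.013, "double": 0.014,
                    "triple": 0.000, "out": 0.975},
}


def expected_run_value(probs, weights=LINEAR_WEIGHTS):
    """Sum of P(outcome) * run value over the listed outcomes."""
    return sum(p * weights[outcome] for outcome, p in probs.items())


if __name__ == "__main__":
    for ball_type, probs in OUTCOME_PROBS.items():
        print(f"{ball_type:12s} {expected_run_value(probs):+.3f}")
```

Valuing the batted ball type this way, rather than the hit or out that actually followed, is the point of the exercise: the pitch gets credit for what kind of contact it allowed.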



Contact types


Now let's layer on the cheddar cheese. Soft or sharp, or none of the above. Home runs are never sharp nor soft, and any play that has an error results from a normal batted ball. Allegedly. I give home runs their own contact tag; errors (and bunts) I usually ignore.























event                contact  #
Bunt                 soft     6
Bunt                 normal   2,550
Bunt                 sharp    1
Error                normal   1,203
Fly ball             soft     1,109
Fly ball             normal   27,867
Fly ball             sharp    165
Fly ball Home run    homer    3,924
Ground ball          soft     752
Ground ball          normal   46,718
Ground ball          sharp    1,209
Line drive           soft     1,644
Line drive           normal   18,787
Line drive           sharp    812
Line drive Home run  homer    471
Pop-up               soft     147
Pop-up               normal   8,475
Pop-up               sharp    1
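As a quick check on how often each batted ball type picks up a soft or sharp tag, the counts above can be turned into tag rates. This is a small sketch using only the non-homer rows for the four main types; the names are illustrative assumptions, and bunts and errors are left out.

```python
# Counts by batted ball type and contact tag, copied from the table above
# (home runs, bunts and errors excluded).
CONTACT_COUNTS = {
    "fly_ball":    {"soft": 1109, "normal": 27867, "sharp": 165},
    "ground_ball": {"soft": 752,  "normal": 46718, "sharp": 1209},
    "line_drive":  {"soft": 1644, "normal": 18787, "sharp": 812},
    "pop_up":      {"soft": 147,  "normal": 8475,  "sharp": 1},
}


def tag_rate(counts):
    """Share of batted balls tagged either soft or sharp."""
    tagged = counts["soft"] + counts["sharp"]
    return tagged / sum(counts.values())


if __name__ == "__main__":
    for ball_type, counts in CONTACT_COUNTS.items():
        print(f"{ball_type:12s} {tag_rate(counts):.1%}")
```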



Not every park (i.e., stringer) tags batted balls at the same frequency. The deeper we go, the more we need HITf/x.





































Park  Tag Freq.
sln   .0957
bal   .0815
flo   .0766
nya   .0741
was   .0732
nyn   .0684
bos   .0671
phi   .0670
kca   .0635
cin   .0629
atl   .0625
mil   .0576
chn   .0574
tor   .0542
sfn   .0520
tba   .0514
cha   .0448
cle   .0442
col   .0395
oak   .0391
sea   .0388
tex   .0371
min   .0364
ana   .0324
det   .0293
pit   .0274
ari   .0206
hou   .0198
sdn   .0196
lan   .0159



AT&T is smack on the average (.0524). The difference between Busch III and Chavez Ravine is six-fold. That's a problem, but I'll forge ahead.
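The spread between parks is easy to verify from the table. Here is a minimal sketch that finds the most and least tag-happy parks and the ratio between them; the park codes and frequencies are copied from the table above, and the function name is an illustrative assumption.

```python
# Tag frequency by park, copied from the table above.
TAG_FREQ = {
    "sln": 0.0957, "bal": 0.0815, "flo": 0.0766, "nya": 0.0741, "was": 0.0732,
    "nyn": 0.0684, "bos": 0.0671, "phi": 0.0670, "kca": 0.0635, "cin": 0.0629,
    "atl": 0.0625, "mil": 0.0576, "chn": 0.0574, "tor": 0.0542, "sfn": 0.0520,
    "tba": 0.0514, "cha": 0.0448, "cle": 0.0442, "col": 0.0395, "oak": 0.0391,
    "sea": 0.0388, "tex": 0.0371, "min": 0.0364, "ana": 0.0324, "det": 0.0293,
    "pit": 0.0274, "ari": 0.0206, "hou": 0.0198, "sdn": 0.0196, "lan": 0.0159,
}


def tag_spread(freqs):
    """Return (highest park, lowest park, ratio of their tag frequencies)."""
    high = max(freqs, key=freqs.get)
    low = min(freqs, key=freqs.get)
    return high, low, freqs[high] / freqs[low]


if __name__ == "__main__":
    high, low, ratio = tag_spread(TAG_FREQ)
    print(f"{high} tags {ratio:.1f} times as often as {low}")
```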

What's a sharp liner worth to ya?


Breaking down the batted balls by contact type (and ignoring home runs), here are the event probabilities by batted ball type.








Line Drive  #       Single  Double  Triple  Out(s)
all         21,243  .537    .178    .015    .211
normal      18,787  .527    .188    .016    .213
sharp       812     .413    .209    .018    .249
soft        1,644   .707    .052    .001    .167


Line drives are the most likely to be tagged: nearly 12 percent. The sharp line drive yields more outs than the other types. It also gets fewer singles and more extra base hits. The soft line drive is turned into fewer outs, more singles and far fewer extra base hits. All of this is intuitive, except for the sharp liners being turned into more outs. I can speculate about the human factors involved, but I'll leave that for the comments.








Pop Up  #      Single  Double  Triple  Out(s)
all     8,623  .013    .014    .000    .975
normal  8,475  .011    .014    .000    .977
sharp   1      .000    .000    .000    1.000
soft    147    .122    .034    .000    .844


I suppose the soft pops are the bloops over the infield. Less than 2 percent of pop-ups are tagged, so not much to see here.








Ground Ball  #       Single  Double  Triple  Out(s)
all          48,679  .220    .018    .001    .644
normal       46,718  .198    .016    .001    .666
sharp        1,209   .775    .123    .008    .065
soft         752     .697    .003    .000    .198


About 4 percent of grounders are tagged (1,961 of 48,679). Not surprisingly, the sharp grounders have good outcomes; so good, they're the best of the lot (home runs not included, of course). A soft grounder is a good thing, too. This is the only type of the four that has more sharps than softs.








Fly Ball  #       Single  Double  Triple  Out(s)
all       29,141  .064    .092    .015    .622
normal    27,867  .038    .092    .015    .642
sharp     165     .042    .333    .103    .352
soft      1,109   .707    .069    .002    .162


Fly balls are tagged about as often as grounders, but lean heavily toward soft over sharp when tagged. Ground balls are tagged more on the sharp side, but the majority isn't overwhelming. Sharp fly balls are, essentially, extra base hits. Soft fly ball outcomes are very similar to the outcomes for soft line drives and soft ground balls. I wonder if a soft fly ball and a soft line drive are actually the same thing.
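To put a rough number on what a sharp liner is worth, the event probabilities from the four tables above can be weighted by the Linear Weights from the refresher. This is only a sketch under stated assumptions: home runs are excluded (as in the tables), the listed probabilities do not sum to 1, and the names are illustrative.

```python
# Linear Weights for the non-homer outcomes, from the refresher table.
WEIGHTS = {"single": 0.489, "double": 0.768, "triple": 1.052, "out": -0.289}

# (single, double, triple, out) probabilities by type and contact tag,
# copied from the four tables above.
EVENT_PROBS = {
    ("line_drive", "normal"):  (0.527, 0.188, 0.016, 0.213),
    ("line_drive", "sharp"):   (0.413, 0.209, 0.018, 0.249),
    ("line_drive", "soft"):    (0.707, 0.052, 0.001, 0.167),
    ("pop_up", "normal"):      (0.011, 0.014, 0.000, 0.977),
    ("pop_up", "sharp"):       (0.000, 0.000, 0.000, 1.000),
    ("pop_up", "soft"):        (0.122, 0.034, 0.000, 0.844),
    ("ground_ball", "normal"): (0.198, 0.016, 0.001, 0.666),
    ("ground_ball", "sharp"):  (0.775, 0.123, 0.008, 0.065),
    ("ground_ball", "soft"):   (0.697, 0.003, 0.000, 0.198),
    ("fly_ball", "normal"):    (0.038, 0.092, 0.015, 0.642),
    ("fly_ball", "sharp"):     (0.042, 0.333, 0.103, 0.352),
    ("fly_ball", "soft"):      (0.707, 0.069, 0.002, 0.162),
}


def run_value(probs):
    """Weight each outcome probability by its Linear Weights run value."""
    outcomes = ("single", "double", "triple", "out")
    return sum(p * WEIGHTS[o] for o, p in zip(outcomes, probs))


def ranked_run_values(event_probs=EVENT_PROBS):
    """Return ((type, contact), run value) pairs, best for the hitter first."""
    values = {key: run_value(p) for key, p in event_probs.items()}
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (ball_type, contact), value in ranked_run_values():
        print(f"{ball_type:12s} {contact:7s} {value:+.3f}")
```

With only one sharp pop-up in the sample, that row says nothing useful; the small counts behind several of these rows are worth keeping in mind before reading much into the ranking.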

Conclusion


We really need HITf/x. Well, we do have some data: 15,000 batted balls from April. Next week I'll wrap up this series by comparing HITf/x data to various stringer tags—batted ball type and contact.



References and Resources
Batted ball classifications from MLBAM's Gameday, data from 2009 MLB regular season through Sept. 13

Harry Pavlidis admits he has a baseball problem. He is a member of Complete Game Consulting and has his own blog, Cubs f/x. Feedback, questions and comments are appreciated - Email harrypav@gmail.com and Twitter @harrypav

Comments

Brian Cartwright said...

...and the problem is even more extreme in the minor leagues. My Gameday data is 2006-2009, but shows the same patterns at the major league level.

Posted 09/15  at  11:41 AM
Stevenell said...

Wow, these are even more biased than the normal classifications.  Obviously, a stringer wants to point out that a guy hit a “sharp” line drive when he makes an out, but doesn’t worry about it as much if he gets a hit.

Same thing with “soft” ground balls.  They want to make sure it is known that it was a swinging bunt, so they make sure to tag it.  If the person was thrown out, they might not think to tag it as such.

At least those are my theories.  Looking forward to the hitf/x.

Posted 09/15  at  12:24 PM
Colin Wyers said...

I think that over time the data quality issues with Hit F/X will prove more malleable than the data quality issues with human stringer data, although I could be wrong. And once you actually track the ball along the entire flight path (which is part of the overall DRE they were showing us in SF) the issues with Hit F/X become a moot point.

But yes, distance, vector and hang time will tell you pretty much everything you want to know. I know you at BIS are starting to track that, and I believe MGL is working on a project along those lines, but then you have to go from someone recording the data to actual analysis of the data.

Posted 09/15  at  04:55 PM
Harry Pavlidis said...

I agree with Colin. The human factor is far less tractable than the problems with HITf/x.

Posted 09/15  at  08:00 PM
Alan Nathan said...

The problems with hitf/x that various of you refer to are indeed tractable.  A couple of months ago, I started work on a technique to correct the hitf/x data for the fact that the ball is tracked over a region that does not include the contact point, then extrapolated to the contact point assuming constant velocity.  The velocity is not constant, but the change in velocity (i.e., the acceleration) can be estimated and the data corrected.  When this happens, then the reported data will more accurately reflect the velocity of the ball (magnitude and direction) at the impact point.

Unfortunately, as with many projects I begin, I have been sidetracked with other things so I have not finished up this project yet, but will do so in the next month or so.  Too many projects…too little time!

Posted 09/15  at  11:51 PM
Harry Pavlidis said...

jedlovec3 - no, I did not, but I should. Maybe that will go into the next follow-up.

Alan - you’re retired, allegedly, so get to work on this! You have the autonomy.

Posted 09/16  at  02:18 PM
fjm(anuel) said...

I echo Stevenell’s sentiments.  Namely, that there may be an inherent bias in the classification of balls that are turned into outs, so that many “sharp” line drives that are hits are unreported as sharp.

Posted 09/16  at  07:20 PM