Batted balls and home runs

Consider two home runs

On June 18, 2010, Torii Hunter nailed a home run off Carlos Silva at Wrigley Field. (You can watch the video at MLB’s site). I mean he nailed it. The ball screamed off his bat and cleared the left-centerfield fence at Wrigley at about the third row. According to Hit Tracker, the ball would have traveled 414 feet if it had hit flat ground and its apex was just 47 feet above the ground for a height/distance ratio of 0.11. The elevation angle of the ball off the bat was just 16.7 degrees, the lowest of all home runs tracked by Hit Tracker last year (not including inside-the-park jobs). I timed it, and the ball took just 3.6 seconds to land.

According to Baseball Info Solutions, it was a fly ball.

On May 15, 2010, Bill Hall smashed a home run off Eddie Bonine (here’s the video at MLB). I don’t know if “smashed” implies “not quite as hard as nailed,” but it’s supposed to. If you compare the two videos, you can tell that Hall’s homer had a little more loft than Hunter’s. Hit Tracker estimates that it would have traveled 398 feet and hit a max elevation of 64 feet for a ratio of 0.16. The elevation angle off the bat was estimated at 22.8 degrees. I timed it at 4.1 seconds before hitting the wall in the back of the left field bullpen.

According to Baseball Info Solutions, it was a line drive.
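For anyone who wants to check the height/distance ratios quoted for these two homers, here is a minimal Python sketch. The function name `loft_ratio` is my own shorthand, not something Hit Tracker or BIS uses; the apex and distance figures are the ones quoted above.

```python
# Hit Tracker figures quoted above: apex and true distance, both in feet.
homers = {
    "Hunter": {"apex": 47, "distance": 414},
    "Hall": {"apex": 64, "distance": 398},
}

def loft_ratio(apex_ft, distance_ft):
    """Peak height divided by true distance (hypothetical helper name)."""
    return apex_ft / distance_ft

for name, hr in homers.items():
    # Round to two places, the precision used in the article.
    print(name, round(loft_ratio(hr["apex"], hr["distance"]), 2))
```

The ball with the lower ratio (Hunter's) is the one BIS called a fly ball, which is the puzzle the rest of the article chews on.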

The issue

I’ve played with batted ball statistics for a while now, just about as long as the Hardball Times has been around. Batted ball stats, as compiled by Baseball Info Solutions, are just plain cool. Knowing how often batters hit line drives, or how often pitchers force infield flies, adds a new understanding to the game and results in metrics like xFIP and xBABIP and xWHIP—the “x stats”—as well as advanced fielding stats.

But lately, batted ball stats have been taking some hits (get it?). Colin Wyers (formerly of THT and now researching everything at Baseball Prospectus) has identified several reasons why data recorders in different parks might interpret the angle of a batted ball differently. Colin has only gotten more skeptical over time; I think he recently referred to line drives as “lie drives” somewhere (Twitter?).

It’s not that we don’t generally know what a line drive is. It’s that the boundary between different batted ball types is gray, and when you start looking at small samples of batted ball stats (for individual batters, say, or pitchers), there may be some significant differences in how specific balls are classified. You may not know what you think you know.

It is hard to define a line drive. I looked in some old Project Scoresheet documentation and found this definition:

A popup rises higher than it travels. At the other extreme, a line drive travels farther horizontally than its peak altitude. And a fly ball is midway between the two. Nobody expects you to actually measure the things; just use your judgment.

I don’t think anyone follows those confusing definitions, but they’re interesting for historical purposes. A few years ago, I interviewed the original director of Project Scoresheet (and now the primary owner of Baseball Info Solutions) John Dewan. He talked about the introduction of a new type of batted ball, the fliner:

A fliner is pretty obvious once I describe it. There are line drives and there are fly balls. When you watch a baseball game, you know a line drive when you see it; same thing for a fly ball. But some balls are hard to characterize. Was that a fly ball or was that hit hard enough to be a liner? Did that have too much of an arc to be a line drive? Or was that a fly?

So those balls that you want to question are going to be called “fliners” this year. The bottom line is that line drives get to the fielders more quickly than fliners, which get to the fielders more quickly than fly balls. That will be an important distinction to make from an evaluation standpoint.

You won’t find them on Fangraphs or Baseball Reference, but BIS reports the number of fliners in addition to fly balls, grounders and line drives. As an extra service, it subdivides fliners into fliner/line drives and fliner/fly balls, so Fangraphs can still report just line drives and flies for your virtual perusal. Fliners are kept in the background.

I guess the thinking behind a fliner is obvious, but it still raises the question for me: How do we identify a line drive? A fly ball? A fliner? Is it up to the judgment of the watcher? If so, can we tighten the definition by looking at some data?

The idea

I was thinking about this the other day when I also started thinking about Greg Rybarczyk’s fabulous Hit Tracker site, where Greg and friends watch every single home run on video and use advanced engineering algorithms to measure each one’s physical characteristics. And I wondered: Can we combine these two sources of information to find out what makes a batted ball a specific type of batted ball?

Hit Tracker tracks three key physical characteristics that are important for our purposes (definitions are from Hit Tracker):
Elev. Angle – the angle above horizontal at which the ball left the bat, in degrees. Typically between 25 and 45 degrees for home runs.
Apex – the highest point reached by the ball in flight above field level, in feet.
True Dist. (True Distance, a.k.a. Actual Distance) – If the home run flew uninterrupted all the way back to field level, the actual distance the ball traveled from home plate, in feet. If the ball’s flight was interrupted before returning all the way down to field level (as is usually the case), the estimated distance the ball would have traveled if its flight had continued uninterrupted all the way down to field level.

So I downloaded all 2010 home runs from the Hit Tracker website and combined them with the BIS batted ball data based on the date, batter, pitcher and inning combination. I excluded all inside-the-park home runs and a few other oddities, corrected about 20-30 obvious data errors and wound up with a database of 4,562 home runs, their physical characteristics and the batted ball types BIS had assigned to them.
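For the curious, the matching step can be sketched in a few lines of Python. This is only an illustration of joining two lists on the date, batter, pitcher and inning combination; the field names, the `combine` helper and the inning values are placeholders I made up, and the real Hit Tracker and BIS files almost certainly look different.

```python
# Placeholder rows: the angle, apex and distance figures are quoted in the
# article, but the field names and inning values are invented for illustration.
hit_tracker_rows = [
    {"date": "2010-05-15", "batter": "Bill Hall", "pitcher": "Eddie Bonine",
     "inning": 1, "elev_angle": 22.8, "apex": 64, "true_dist": 398},
    {"date": "2010-06-18", "batter": "Torii Hunter", "pitcher": "Carlos Silva",
     "inning": 1, "elev_angle": 16.7, "apex": 47, "true_dist": 414},
]
bis_rows = [
    {"date": "2010-05-15", "batter": "Bill Hall", "pitcher": "Eddie Bonine",
     "inning": 1, "bis_type": "fliner/line drive"},
]

def match_key(row):
    # The four fields used to line up the two sources.
    return (row["date"], row["batter"], row["pitcher"], row["inning"])

def combine(ht_rows, bis):
    """Attach the BIS batted ball type to each Hit Tracker row that matches."""
    bis_by_key = {match_key(r): r for r in bis}
    matched, unmatched = [], []
    for ht in ht_rows:
        hit = bis_by_key.get(match_key(ht))
        if hit is None:
            unmatched.append(ht)  # candidates for a manual look
        else:
            matched.append({**ht, "bis_type": hit["bis_type"]})
    return matched, unmatched

matched, unmatched = combine(hit_tracker_rows, bis_rows)
print(len(matched), "matched;", len(unmatched), "left to check by hand")
```

Keeping the unmatched rows in their own pile is what makes it practical to spot and correct data errors by hand.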


Let’s remember, however, that the Hit Tracker estimates are just that … estimates. Most of us have a tendency to trust pure numbers more than batted ball categories because the specificity of the numbers seems so compelling. But we should avoid that trap. Hit Tracker is based on people watching video, using stopwatches and then applying some advanced math. There’s some subjectivity and room for error there. Plus, just because something uses math I don’t understand doesn’t make it right.

So we’re comparing subjective batted ball categories (that almost certainly include some data errors, just because errors happen) with somewhat subjective estimates (ditto about the errors). It’s not exactly HITf/x.

But heck, somewhat subjective data has never stopped me before! Let’s see if we can quantify what a line drive is.

A deductive stroll through the data

First of all, let’s lay out how often each type of batted ball was hit last year and how often it was hit for a home run (all BIS data):

Type                         Tot     HR    Pct
Line drives                9,213      0     0%
Fliner/line drives        14,560     87     1%
Fliner/outfield flies     15,195  1,341     9%
Outfield flies            29,319  3,182    11%

Pure line drives were never home runs, and only about one percent of fliner/line drives were. The home run percentage was around 10 percent for outfield flies, whether of the fliner variety (nine percent) or not (11 percent). So we won’t find any specific info about line drives in our home run database. Note that I didn’t include infield flies in the data. Infield flies have specific parameters and aren’t hit for home runs.

So there is one rule we’ve already uncovered: Home runs are never classified as pure line drives and hardly ever classified as fliner/line drives. Line drives are batted balls that don’t leave the ballpark.
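The home run percentages in the first table are rounded to whole numbers. A short Python sketch recomputes them from the totals in that table with one more decimal place; `hr_rate` is just an illustrative helper name.

```python
# (type, total batted balls, home runs), copied from the BIS table above.
batted_balls = [
    ("Line drives", 9213, 0),
    ("Fliner/line drives", 14560, 87),
    ("Fliner/outfield flies", 15195, 1341),
    ("Outfield flies", 29319, 3182),
]

def hr_rate(total, home_runs):
    """Share of batted balls of one type that left the park."""
    return home_runs / total

for name, total, hr in batted_balls:
    print(f"{name:<22} {100 * hr_rate(total, hr):5.1f}%")
```

At this precision, the "one percent" for fliner/line drives turns out to be closer to 0.6 percent, which only strengthens the rule.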

Now let’s look at the database of 4,562 home runs, including the Hit Tracker data. The following table groups them by batted ball type, then also includes the apex of the ball (its “peak altitude”), its true distance (as estimated by Hit Tracker) and the ratio of its apex to its distance.

Type               Tot    Apex    Dist   Ratio
Fliner liner        83      57     374    0.15
Fliner fly        1328      72     392    0.18
Fly               3151      95     400    0.24
Grand total       4562      87     397    0.22

First, you can see that the old Project Scoresheet guidelines don’t apply at all. Every home run traveled much farther horizontally than its peak altitude, yet none of them were pure line drives and only a few were fliner/line drives.

Secondly, you can see the general logic behind batted ball categories. The higher a ball is hit in the air, the more likely it is to be classified as a fly ball. And the higher the ratio between a ball’s apex and its distance (in other words, the more “loft” it has), the more likely it is to be a fly ball. Makes a lot of sense.

Here’s a different look at the data. I grouped all home runs by the angle they took off the bat, from 16.7 degrees to 44.8 degrees. Here’s how they were grouped into batted ball types:

  Angle     FlinerLD  FlinerF    Fly     Total
   <20            8      43       11       62
  20-22          24     152       63      239
  22-24          26     290      181      497
  24-26          14     322      399      735
  26-28           7     261      585      853
  28-30           2     132      609      743
  30-32           2      75      541      618
  32-34                  43      357      400
  34-36                   6      214      220
   >36                    4      191      195
Grand Tot        83   1,328    3,151    4,562

Once again the overall pattern makes sense—the higher the angle, the more likely the ball was categorized a pure fly. In fact, it looks as though we can generalize a second rule: Any ball with an angle off the bat of about 30 degrees or more will probably be labeled a fly ball.
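To put a number on "probably," here is a rough Python sketch that turns the angle table into the share of home runs in each bin that BIS labeled a pure fly. The counts are copied from the table; blank cells are entered as zero, and `fly_share` is an illustrative name.

```python
# (angle bin, fliner/line drives, fliner/flies, pure flies) from the table above.
angle_bins = [
    ("<20", 8, 43, 11),
    ("20-22", 24, 152, 63),
    ("22-24", 26, 290, 181),
    ("24-26", 14, 322, 399),
    ("26-28", 7, 261, 585),
    ("28-30", 2, 132, 609),
    ("30-32", 2, 75, 541),
    ("32-34", 0, 43, 357),
    ("34-36", 0, 6, 214),
    (">36", 0, 4, 191),
]

def fly_share(fliner_ld, fliner_fly, fly):
    """Fraction of home runs in a bin that were labeled pure fly balls."""
    return fly / (fliner_ld + fliner_fly + fly)

for label, ld, ff, fly in angle_bins:
    print(f"{label:>6}  {100 * fly_share(ld, ff, fly):5.1f}% pure fly")

# Everything at 30 degrees or more, pooled (the last four bins).
high = angle_bins[6:]
pooled = sum(b[3] for b in high) / sum(b[1] + b[2] + b[3] for b in high)
print(f"30 and up: {100 * pooled:.1f}% pure fly")
```

The share climbs steadily with angle, from under a fifth of the lowest homers to nearly all of the highest ones, and the pooled figure for 30 degrees and up works out to about 91 percent.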

But there is a lot of overlap of batted ball types between 20 and 30 degrees. For instance, of all the balls that came off the bat at a 22-to-24 degree angle, 26 were fliner/liners, 290 were fliner/flies and 181 were pure flies. What gives?

Let’s dig further into the data. Here is a table of all the balls that came off the bat at a 22-to-24 degree angle, broken out by the ratio of the ball’s apex vs. its distance. Remember, the higher the ratio, the more likely the ball should be a fly ball.

Ratio       FlinerLD  FlinerF   Fly   Total
0.13-0.14                4       1       5
0.14-0.15       15      34       5      54
0.15-0.16        9      97      19     125
0.16-0.17        1      90      50     141
0.17-0.18               41      64     105
0.18-0.19        1      19      36      56
0.19-0.20                5       5      10
0.20-0.21                        1       1
Tot             26     290     181     497

The pattern still holds that balls with more loft are more likely to be fly balls, but there is still a lot of overlap within ratios. So let’s dig one step further, looking just at one particular ratio and breaking out by distance. Here are the 141 home runs that came off the bat at a 22-to-24 degree angle AND had a height/distance ratio between 0.16 and 0.17:

Distance   FlinerLD   FlinerF     Fly   Total
356-365                    1               1
366-375                    6               6
376-385                    9       4      13
386-395                   14       8      22
396-405          1        20      14      35
406-415                   24       5      29
416-425                    7      12      19
426-435                    6       3       9
436-445                            3       3
446-455                    2               2
456-465                            1       1
476-485                    1               1
Tot              1        90      50     141

Well, the trend holds, but just barely. For instance, there were 35 home runs that came off the bat at a 22-to-24 degree angle, had a height/distance ratio between 0.16 and 0.17 AND traveled between 396 and 405 feet. That’s a pretty narrow definition (as estimated by Hit Tracker). Yet one of those homers was labeled a fliner/line drive, 20 were fliner/flies and 14 were pure outfield flies.

Let’s move to our eyeballs and look at the video of some of the 35 home runs mentioned above. We’ve already seen the fliner/liner. It was Bill Hall’s smash on May 15. I timed it at 4.14 seconds from bat to the fence in left field. Hit Tracker puts the true distance at 398 feet.

A useful comparison is Prince Fielder’s home run off Ervin Santana on June 15. If you watch it, you can see that it took longer to land (I have it at 4.73 seconds) and it was hit to left center a little beyond the 387 sign (Hit Tracker puts the true distance at 396 feet). BIS classified it as a fly.

I score this one in BIS’ favor and I question Hit Tracker’s figures. The fact that Hall’s home run hit the fence cut short its air time a bit, but Fielder’s ball was still in the air longer. Plus Hall’s ball just looked like a line drive off the bat. The video may have been deceiving, but that’s what my eyes tell me.

In fact, I looked at many of the home runs in this sample, and I could see a legitimate difference between many of these home runs, particularly in terms of hang time. A few comparisons were instructive.

Evan Longoria hit a home run to left center off Ryan Perry on July 28. I watched the video and timed the ball’s hang time at 4.2 seconds. Hit Tracker estimated its true distance at 396 feet, its elevation angle at 23.4 degrees, just slightly higher than Hall’s. BIS labeled it a pure fly. To my eye, the two homers were about equivalent.

If you watch the video, however, you can see that Hall’s home run was off a pitch that was a bit above the belt. Longoria’s was a bit below the belt. I don’t believe Hit Tracker takes pitch location into account when estimating angle off the bat, but I’m going to guess that BIS factors it into its batted ball classification (intentionally or not). So this is another factor to take into account: the pitch location and the level of the swing at impact.

Here’s one last one. Shane Victorino hit a dramatic grand slam off Johan Santana on May 2. If you watch it on video, you can see that the pitch location and swing are similar to Hall’s as well as the direction of the ball and hang time. This one was labeled a fliner/fly. I guess you have to draw the line somewhere, but dang if I can tell whether the line should have been drawn here.

The conclusion

So now that I’ve dragged you through all that, what can we say? Well, we can say that line drives don’t leave the ballpark, and that most balls that leave the bat at an elevation angle over 30 degrees will probably be labeled fly balls. I’m going to go out on a limb here and suggest that anything with an elevation angle under, say, 15 degrees will be a line drive (or ground ball, obviously). Between 15 and 30 degrees is gray and requires extra info.

Some of that extra info appears to be the distance traveled by the ball (the farther it goes, the more likely it will be a fly), the location of the pitch and the level of the swing. Location of the batted ball (as in left, left/center, center, etc.) may even be a factor. Perhaps Torii Hunter’s home run (back in the beginning of the article) was labeled a fliner/fly because it was hit to center field?
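The angle rules of thumb above can be written down as a tiny Python function. This is only a sketch of my guess, reading the cutoffs as degrees of elevation angle; it is not how BIS actually classifies anything, and `rough_label` is a made-up name.

```python
def rough_label(elev_angle_deg):
    """Rule-of-thumb label from elevation angle alone (an assumption, not BIS's method)."""
    if elev_angle_deg < 15:
        return "line drive or grounder"
    if elev_angle_deg > 30:
        return "fly ball"
    # Between 15 and 30 degrees the angle alone doesn't settle it; distance,
    # pitch location and swing level would have to break the tie.
    return "gray area"

# Elevation angles quoted in the article for Hunter's and Hall's home runs.
for batter, angle in [("Hunter", 16.7), ("Hall", 22.8)]:
    print(batter, angle, rough_label(angle))
```

Both of the home runs that opened the article land in the gray area, which is exactly why they were worth arguing about.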

And that’s about all I can say about line drives. Don’t be disappointed. It’s more than I used to know, and I certainly didn’t expect something definitive. I didn’t expect to assuage any of Colin’s concerns about batted ball classifications.

All of this analysis is really a precursor to the day that HITf/x changes the way we record and analyze batted balls. HITf/x will use calibrated cameras to specifically record the trajectory of each batted ball, where it lands and how long it takes to get there. I don’t know what language we will use to describe the data, but it will certainly include hang time and landing parameters. It might include things like apex and elevation angle. You can read more about it in the Hardball Times Annual.

Eventually, our “x” stats and fielding stats may not include any batted ball information at all.

References & Resources
Extra special fun fact: Home runs in the late innings were a couple of feet shorter, on average, than those in the early innings.

 Inning     Feet
   1       398.9
   2       397.2
   3       395.2
   4       398.0
   5       396.8
   6       398.1
   7       395.7
   8       395.6
   9       396.0


Dave Studeman was called a "national treasure" by Rob Neyer. Seriously. Follow his sporadic tweets @dastudes.
12 Comments
Mike
12 years ago

Re: extra special fun fact: it’s colder in later innings, right?  Could that explain the shorter HRs?

Peter Jensen
12 years ago

Dave – Retrosheet/Gameday agrees with you on the Hunter and Hall HRs, classifying Hunter’s as a LD and Hall’s as a FB.  And just to clarify it is Field Fx that will track the fly ball’s full trajectory using 4 extra video cameras.  Hit Fx measures the initial trajectory and hit ball speed off the bat from the few extra frames of video captured by the two Pitch Fx cameras.  Nice article about the current uncertainty of the classifications. 

The initial angle doesn’t provide as much useful information as one might think, since the differing amounts of backspin will cause much different trajectories.  The ratio of distance divided by height is probably the best determination for classification.  Of course, if we had accurate data on distance, height and hang time there would be no reason to assign further arbitrary classifications.

Dave Studeman
12 years ago

Thanks Peter. I get all my Fx’s mixed up.

Good point about the angle off the bat, but that isn’t really germane to the Hit Tracker calculations.  They’re not really measuring angle off the bat, just a theoretical angle from home plate based on the distance and speed of the home run.

I look forward to Fieldf/x data, but I have a feeling that we’re going to have to use some sort of classification system to make the data digestible to baseball fans.  I wonder what that might look like?

Dan Novick
12 years ago

Great article, Dave. I thoroughly enjoyed it.

slideric
12 years ago

It will take a while but I will get most of it.  Hope you get the audience you deserve because this is great material.  Should get you a phbaseball..

Dave Studeman
12 years ago

A few comments:

@James M.: I honestly don’t know where to draw the line between liners and flies. Saying that virtually all home runs are fly balls doesn’t bother me too much, as long as whatever rule is applied is applied consistently.

I don’t think that the ratio of height/distance can be the sole determinant here. Consider the ground ball that just barely lands within the infield.  That will have a ratio of 3/90 or something like that.  A very low ratio, but not a line drive. A ground ball.

Conversely, a really long home run might have a lower ratio than a line drive that stays in the park.  Should that be a line drive too?  It’s not clear to me.  Distance and/or power are factors.

That’s why I like “angle off the bat,” as a starting point, though, as Peter points out, that’s not really what HomeRun Tracker is measuring.

@John DiFool: My understanding is that the BIS video scouts do use stopwatches, at least some of the time.

@Brian: Yes, but Gameday doesn’t have fliners, either. Not quite apples to apples.

John DiFool
12 years ago

For a number of years I’ve never understood why scorers don’t have a stopwatch to measure the time of flight of balls hit in the air.  Doing that would eliminate most of the issues surrounding the liner/fly issue, starting with the ambiguity.

Brian Cartwright
12 years ago

Interesting that BIS has no line drive home runs, as they do exist in Gameday. This is their MLB data from 2005-2011

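+-------+--------+------+-------+--------+-------+-------+
| level | ld     | ldhr | hr/ld | fb     | fbhr  | hr/fb |
+-------+--------+------+-------+--------+-------+-------+
| mlb   | 153160 | 3168 | 0.021 | 230815 | 26501 | 0.115 |
+-------+--------+------+-------+--------+-------+-------+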

Brian Cartwright
12 years ago

Either distance divided by hang time (feet per second) or speed off bat times cosine of elevation angle (miles per hour) will give horizontal velocity, a measure of how much time a fielder has to get to the ball. Not applicable to over the fence HRs, but can be useful in analyzing fielding or batting (or maybe even pitching).

Nathaniel Dawson
12 years ago

(the farther it goes, the more likely it will be a fly)

I’m going to go out on a limb here and say that that’s because the longer a ball stays in the air, the longer gravity has a chance to work on it and push its trajectory downward. These balls aren’t getting classified just based on their first, say, 60 feet of flight in the air. The stringer is observing them the whole distance. A ball that has more hang time is going to have a more downward trajectory as it reaches the ground, making it appear at that point as more of a fly ball than a line drive, influencing how the stringer perceives it and decides how to label it.

James M.
12 years ago

Am I the only one who is troubled by the fact that HR’s are never classified as liners?  I’ve seen plenty of them that pass the eye test. 

Can we simplify this?  Distance / time = average velocity.  What do we get if we classify that way?

Alan Nathan
12 years ago

Just wanted to point out that FIELDf/x has yet to demonstrate the measurement of a fly ball trajectory.  It is not very easy to do.  TrackMan can measure the full trajectory. 

I also would like to emphasize a point made by Peter Jensen over at Tango’s blog.  Namely, Greg does not measure the apex or even the initial velocity (speed, angles) of the batted ball.  He infers them from his measurement of the landing point and hang time, with the aid of a model for the drag and Magnus forces as well as a model for the ball-bat collision that relates the spin of the batted ball to the initial velocity vector.  A more reliable way to determine the full trajectory is to combine Greg’s observations (landing point and hang time) with the HITf/x determination of the initial velocity vector.  That information alone essentially determines the full trajectory.  How do I know this?  Well, I did an experiment several years ago using the TrackMan system to track trajectories of baseballs projected from a pitching machine.  Then I used the initial velocity, landing point, and hang time (and no other information), along with my fitting technique, to determine the full trajectory, which I then compared with the actual TrackMan measurement.  The two trajectories agreed very well.  So, the point I would make is that the full FIELDf/x tracking of batted ball trajectories (a very difficult job) is not actually necessary, as long as we have the initial velocity, landing point, and hang time.