# Exploring Batted Ball Run Values and Spray

Mike Napoli is good at hitting fly balls (via Keith Allison).

If you have read my articles on batted balls and the like, this will come as no surprise: I don’t care for line drive percentage (LD%). As such, I’ve spent a significant amount of time thinking of better ways to measure a player’s performance on batted balls. If you want a recap, in a previous article I made the argument that H/Batted Ball Type was a better measure of a player’s batted ball distribution and performance. But to bring you up to speed, consult the following section:

### Why no two Batted Balls look the same or should be treated as such

It’s no secret Line Drive, Ground Ball (GB), and Fly Ball (FB) classifications are faulty and have some major measurement errors. Until there is a standardization of the stringer data based on hit angle and velocity, we should be cautious with taking batted ball classifications at face value. The fine line between a FB and a LD is in the eye of the beholder, with many different eyes determining the classification among MLB’s 30 ballparks. So, the fact that LD%, GB%, and FB% don’t take into account classification biases by adjusting for park factors is concerning, being that a FB at one stadium can very well look different at another — despite looking the same on our spreadsheet.

Second, by using batted ball percentages we are assuming that all LDs, FBs, and GBs are created equal. Binning each batted ball type into a percentage makes no distinction between a batted ball hit with authority and one blooped over an infielder’s head. Yes, the public does not have HITf/x, and yes, we cannot differentiate batted balls by velocity, but we can use proxies to help us separate batted balls into more accurate groups. Using Gameday data, it is possible to approximate the distance a ball was hit, the angle to the field it was hit, and (less accurately) which type of batted ball it was.

By using the proxies of distance and spray (pull, opposite, center), we see major differences in the sub-groups of each batted ball type. For instance, a pulled LD falls for a hit 8%-10% more often than an opposite-field LD, and there is a similar gap between pulled and opposite-field FBs. Take a look at the following tables:

The tables below are pulled from Gameday data, which is known to have its own classification errors. For that reason, I removed all unreasonable batted balls from the data set (for instance, balls recorded as hit over 500 ft). When comparing the cleaned data set to the raw one, the numbers below were very similar. Angles range from negative (left field) to 0 (dead center) to positive (right field). Fields vary, but on average the foul lines sit somewhere between ±45 and ±50 degrees, so Pull was defined as any ball hit at less than -20 degrees and Opposite as any ball hit at more than 20 degrees. Of course, that is from a righty’s perspective: for a righty, a ball hit at -25 degrees (left field) is pulled and one at 25 degrees (right field) is an opposite-field batted ball, whereas a lefty hitting the ball at -25 degrees would be hitting the ball the opposite way.
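The spray definition above can be sketched in a few lines. This is only an illustration of the ±20 degree rule as described, not the code used to build the tables; the function names, the `'R'`/`'L'` handedness encoding, and treating exactly ±20 degrees as Center are assumptions.

```python
def classify_spray(angle_deg, bats, threshold=20.0):
    """Label a batted ball as 'Pull', 'Center', or 'Opposite'.

    angle_deg: negative = left field, 0 = dead center, positive = right field.
    bats: 'R' or 'L' (hypothetical encoding of the side the batter hit from).
    threshold: degrees from dead center that still count as 'Center'.
    """
    if bats not in ("R", "L"):
        raise ValueError("bats must be 'R' or 'L'")
    if abs(angle_deg) <= threshold:
        return "Center"
    hit_to_left = angle_deg < 0
    # A righty pulls to left field; a lefty pulls to right field.
    pulled = hit_to_left if bats == "R" else not hit_to_left
    return "Pull" if pulled else "Opposite"


def is_plausible(distance_ft, max_distance_ft=500.0):
    """Screen out obviously bad records, e.g. balls recorded beyond 500 ft."""
    return 0.0 <= distance_ft <= max_distance_ft
```

For switch hitters, `bats` would need to be the side used in that plate appearance rather than a single label per player.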

Line drive to outfield, not equaling home run, 2008-2014

| Spray | BABIP | runvalue | n | distance | std(distance) | std(angle) |
|---|---|---|---|---|---|---|
| Center | 0.796 | 0.395 | 72,097 | 255.43 | 46.00 | 10.62 |
| Opposite | 0.737 | 0.341 | 27,043 | 231.29 | 36.78 | 33.61 |
| Pull | 0.847 | 0.463 | 39,235 | 245.44 | 40.98 | 35.39 |

All balls hit to outfield, not equaling home run, 2008-2014

| Spray | BABIP | runvalue | n | distance | std(distance) | std(angle) |
|---|---|---|---|---|---|---|
| Center | 0.394 | 0.068 | 201,586 | 261.18 | 63.78 | 10.67 |
| Opposite | 0.331 | 0.003 | 82,846 | 235.12 | 49.97 | 33.65 |
| Pull | 0.563 | 0.245 | 72,828 | 242.99 | 64.46 | 34.52 |

All balls hit to an infielder, on ground, not bunts, 2008-2014

| Spray | BABIP | runvalue | n | distance | std(distance) | std(angle) |
|---|---|---|---|---|---|---|
| Center | 0.224 | -0.110 | 189,274 | 116.00 | 59.94 | 12.45 |
| Opposite | 0.374 | 0.014 | 35,015 | 113.95 | 66.66 | 33.03 |
| Pull | 0.251 | -0.085 | 137,950 | 110.11 | 56.98 | 35.12 |

You can see that spray has an obvious effect on the rate at which certain batted ball types fall. For that reason, the field to which a ball was hit should not be ignored when analyzing a player’s batted ball profile. So when analyzing any aggregate of batted ball type, we should first stratify by pulled, opposite, and up-the-middle hits. Whether or not spray is a proxy for batted ball velocity, my guess is that it may have just as much to do with defensive positioning as quality contact. So in addition to spray, we have to look at individual run values of the events spawned from each batted ball type. As shown below, the variance in run value differs from one batted ball type to the next:

Batted ball type and run values

| Batted Ball Type | runvalue | std(runvalue) | count |
|---|---|---|---|
| Line Drive | 0.34 | 0.44 | 162,971 |
| Fly Ball | 0.05 | 0.61 | 219,819 |
| Ground Ball | -0.09 | 0.35 | 358,722 |
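A summary like the one above boils down to a grouped mean, standard deviation, and count. The sketch below shows one way to compute it with the standard library. The sample rows are made up purely to show the input shape, and the choice of population standard deviation (`pstdev`) is an assumption, since the article does not say which flavor was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_run_values(batted_balls):
    """Return {group: (mean run value, std of run value, count)}.

    batted_balls: iterable of (group, run_value) pairs, where group can be
    a batted ball type or any other hashable key.
    """
    by_group = defaultdict(list)
    for group, run_value in batted_balls:
        by_group[group].append(run_value)
    return {
        group: (mean(values), pstdev(values), len(values))
        for group, values in by_group.items()
    }


# Made-up rows, only to illustrate the call; not real Gameday data.
sample = [
    ("Line Drive", 0.50), ("Line Drive", -0.28),
    ("Fly Ball", 1.41), ("Fly Ball", -0.28), ("Fly Ball", -0.28),
    ("Ground Ball", -0.28), ("Ground Ball", 0.50),
]
summary = summarize_run_values(sample)
```

Passing a `(type, spray)` tuple as the group key would produce the finer breakdown by batted ball type and spray direction.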

Imagine a line drive, then imagine a fly ball. The picture is simple: a fly ball is more likely to leave the park and more susceptible to park conditions (wind, heat, positioning). But a fly ball can also look like a pop-up or “room service”. Hence a fly ball has the most variance in its run values of any of the three batted ball types.

While a line drive is also the most likely to fall for a hit, it is mostly affected by luck (where it was hit, its speed off the bat), but its high run value supports the common knowledge that it is the most desirable (though not sustainable) outcome of the three batted ball types. Meanwhile, ground balls are mostly dependent on defensive positioning (shifts), surface, and batted ball speed. But if speed off the bat is a factor for all three, which it most likely is, and if in fact spray serves as a proxy for hit velocity, then the chart below should show that a pulled batted ball has the highest run value for its specific bin.

Batted ball type and spray, run values

| Batted Ball Type | Spray | runvalue | std(runvalue) | count |
|---|---|---|---|---|
| Fly ball | Pull | 0.44 | 0.79 | 38,640 |
| Line Drive | Pull | 0.42 | 0.46 | 47,879 |
| Line Drive | Center | 0.32 | 0.43 | 84,540 |
| Line Drive | Opposite | 0.29 | 0.43 | 30,552 |
| Fly ball | Center | 0.01 | 0.56 | 127,122 |
| Ground ball | Opposite | 0.01 | 0.39 | 34,283 |
| Ground ball | Pull | -0.09 | 0.35 | 137,222 |
| Fly ball | Opposite | -0.11 | 0.43 | 54,057 |
| Ground ball | Center | -0.12 | 0.33 | 187,217 |

As expected, a pulled ball leads the way in run value within its batted ball bin. The exception is ground balls: an opposite-field ground ball is worth a bit more than a pulled one. My guess is that this has to do with shifts and/or the fact that most pulled ground balls are rolled over. Fly balls still have the most variance in run values, ahead of line drives and ground balls.

So with obvious differences between batted balls inside their own bins, there has to be a better way of representing how no two batted balls look the same, and how they should not be treated as such. Intuitively, a home run is a home run, context neutral. Meanwhile, a LD is not a LD in any context of the word. A LD can be a single, a double, a triple, a home run, or an out, all possibilities with different run values. For this reason, expecting some regression in how often a player hits line drives and expecting subsequent regression of BABIP and/or wOBA is not wrong, it’s just not quite right. Instead, we should use the empirical data we have on the actual outcome value of a player’s batted balls to supplement the simple rate at which that outcome occurs. With run values in the picture, I care less about how often a player is hitting a line drive and more about what he is doing with a line drive.

Like I’ve said before, I’ll take a 15% LD rate from Giancarlo Stanton before I dare take a 45% LD rate from Juan Pierre. Expecting any regression in Ichiro Suzuki’s line drive rate to negatively affect his overall performance is assuming he has a marginally better line drive run value, per line drive, than the average player (after adjusting for park). But can we make that assumption solely based on the rate at which he hits line drives? The answer is clearly no, and it calls for a more transparent measure of batted ball performance based on the actual run values resulting from a player’s batted ball distributions.

### Enter Weighted Batted Ball Runs Above Average

We want to create a measure of how valuable a player’s batted ball type is to their production. We can do this using run values for the events created by that batted ball type.

Dave Studeman has tirelessly researched and produced articles revolving around the topic of batted ball production. His work in the area shows exactly how batted ball rates tend to diverge from their usefulness when we want to describe the actual run production of a player’s batted balls. So what I introduce today is not news; it’s merely my interpretation of some of the great work from those who came before me.

As an introduction, let’s say a player has 50 line drives, and on those 50 line drives he has created about 21 runs from 45 singles and 5 outs. We want to compare this player’s performance to what the average player would do in as many LDs, and then adjust that measure for park factors, or how much more frequently a LD was recorded in the parks where the line drives were hit compared to the relative frequency in all other parks. Since I am using the same data set as Bill Petti’s spray tool, I’ll use the same run values, where: “-.28 – outs, .5 – singles, .79 – doubles, 1.07 – triples, 1.41 – home runs”.

The first step is to find a player’s run value per ball in play (BIP). Let’s use fly balls as an example: 2013 Chris Davis had a run value of 0.46 per FB. That is basically the value of a single for each fly ball hit, but it needs to be contextualized by the league average run value per FB, which hovers around 0.05-0.06 runs per fly ball. So Chris Davis’ RVfbaa (Run Value per Fly Ball Above Average) was around 0.41. Great. Now we need to account for park so that we can feel better about the possible classification errors and park effects on run values per batted ball type. This will be the last step, so using the following formula will yield wRVfb (Weighted Run Value of Fly Ball):

wRVfb = (RVfbaa – (PF_FB/100 – 1) * AverageRV/FB * Player FB) / (PF_FB/100)

Or for 2013, Davis’ wRVfb was 0.41 given Camden Yards had a FB park factor of 100 (league average). Follow the same process for line drives and ground balls and you’re all set with values that will assess a player’s batted ball performance in terms of run values. Now we have measures of how valuable a player’s batted ball outcomes are, in addition to the raw probability they occur. Let’s take a look at some of the leaders and losers.
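To make the steps concrete, here is a minimal sketch of the calculation using the run values quoted above. It transcribes the park-adjustment formula term by term rather than interpreting it; the function and parameter names are illustrative, and because the text does not spell out how the “Player FB” term is scaled, it is passed through as an explicit argument (`player_fb_term`) instead of being guessed at.

```python
# Run values quoted in the article (same weights as Bill Petti's spray tool).
RUN_VALUES = {
    "out": -0.28,
    "single": 0.50,
    "double": 0.79,
    "triple": 1.07,
    "home_run": 1.41,
}


def run_value_per_bip(event_counts):
    """Average run value per ball in play from an {event: count} mapping."""
    total_bip = sum(event_counts.values())
    if total_bip == 0:
        raise ValueError("no balls in play")
    total_runs = sum(RUN_VALUES[event] * n for event, n in event_counts.items())
    return total_runs / total_bip


def rv_above_average(player_rv_per_bip, league_rv_per_bip):
    """Run value per batted ball above the league average (e.g. RVfbaa)."""
    return player_rv_per_bip - league_rv_per_bip


def weighted_rv(rv_aa, park_factor, league_rv_per_bip, player_fb_term):
    """Park-adjusted run value, following the formula as written.

    park_factor is on a 100 = league-average scale. player_fb_term stands in
    for the "Player FB" term; its exact definition is an open assumption.
    """
    pf = park_factor / 100.0
    return (rv_aa - (pf - 1.0) * league_rv_per_bip * player_fb_term) / pf


# 2013 Chris Davis on fly balls: 0.46 per FB against a league mark near 0.05.
davis_rvfbaa = rv_above_average(0.46, 0.05)
davis_wrvfb = weighted_rv(davis_rvfbaa, 100, 0.05, player_fb_term=1.0)
```

With a park factor of 100 the adjustment term drops out, so `davis_wrvfb` stays at roughly 0.41 regardless of what is passed for `player_fb_term`. As a side check on the line drive example, `run_value_per_bip({"single": 45, "out": 5})` with these rounded weights works out to about 0.42 runs per line drive, or about 21.1 runs in total.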

Best and Worst Fly Ball producers, 2008-2013, min 500 BIP

| Best | wRVfb | BIP |
|---|---|---|
| Jeremy Hermida | 0.34 | 643 |
| Jim Thome | 0.33 | 533 |
| Mark Reynolds | 0.25 | 1,451 |
| Mike Napoli | 0.24 | 1,484 |
| Pedro Alvarez | 0.22 | 1,034 |

| Worst | wRVfb | BIP |
|---|---|---|
| Omar Vizquel | -0.20 | 873 |
| Alberto Gonzalez | -0.21 | 625 |
| Jemile Weeks | -0.22 | 662 |
| Cesar Izturis | -0.25 | 1,181 |
| Emmanuel Burriss | -0.25 | 573 |

Best and Worst Line Drive Producers, 2008-2013, min 500 BIP

| Best | wRVld | BIP |
|---|---|---|
| Matt Carpenter | 0.15 | 791 |
| Mark Trumbo | 0.14 | 1,120 |
| Alejandro De Aza | 0.13 | 965 |
| Chris Johnson | 0.12 | 901 |
| Derrek Lee | 0.12 | 886 |

| Worst | wRVld | BIP |
|---|---|---|
| Ben Francisco | -0.14 | 677 |
| Justin Turner | -0.15 | 639 |
| Matt Diaz | -0.15 | 641 |
| Kevin Frandsen | -0.17 | 530 |
| Eugenio Velez | -0.17 | 533 |

Best and Worst Ground Ball producers, 2008-2013, min 500 BIP

| Best | wRVgb | BIP |
|---|---|---|
| Lorenzo Cain | 0.10 | 562 |
| Lastings Milledge | 0.09 | 680 |
| Andrew McCutchen | 0.09 | 2,067 |
| Austin Jackson | 0.09 | 1,693 |
| Mike Trout | 0.08 | 905 |

| Worst | wRVgb | BIP |
|---|---|---|
| Craig Counsell | -0.07 | 798 |
| Eugenio Velez | -0.07 | 533 |
| J.P. Arencibia | -0.07 | 784 |
| Blake DeWitt | -0.08 | 536 |
| Jim Thome | -0.08 | 533 |

Finally, some facts about the wRV metrics:

1. There is very little correlation between each wRV metric and its corresponding rate stat (LD%, GB%, and FB%). In fact, the correlation between LD% and wRVld was 0. In other words, batted ball run values are pretty much independent of the rate at which those balls are hit.
2. ISO explains nearly 70% of the variation in wRVfb, so it is a pretty great proxy for the value of a player’s fly balls.
3. A simple regression on wRVfb, wRVld, and wRVgb explains nearly 70% of the variation in wOBA, while a model that also includes BB% and K% accounts for around 35% of the variation in wOBA in year two.
4. Below are the year-to-year correlations, and the data can be found here:
wRV metrics, year-to-year correlations

| Metric | R |
|---|---|
| wRVfb | 0.573 |
| wRVgb | 0.286 |
| wRVld | 0.262 |
| wRVoverall | 0.547 |
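For readers who want to reproduce a year-to-year correlation like those above, the sketch below pairs each player-season with the same player’s following season and computes a plain Pearson R. It assumes an unweighted Pearson correlation and a `{(player, season): value}` layout; the article does not state how its R values were computed or what playing-time minimums applied, so treat those choices as placeholders.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain (unweighted) Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def year_to_year_pairs(values):
    """Pair each player-season with that player's next season.

    values: {(player, season): metric}. Returns (year_n, year_n_plus_1) lists.
    """
    this_year, next_year = [], []
    for (player, season), value in values.items():
        following = values.get((player, season + 1))
        if following is not None:
            this_year.append(value)
            next_year.append(following)
    return this_year, next_year


# Made-up wRVfb values for hypothetical players, only to show the call shape.
wrvfb = {
    ("player_a", 2012): 0.20, ("player_a", 2013): 0.24,
    ("player_b", 2012): -0.05, ("player_b", 2013): 0.02,
    ("player_c", 2012): 0.10, ("player_c", 2013): 0.04,
}
year_n, year_n1 = year_to_year_pairs(wrvfb)
r = pearson_r(year_n, year_n1)
```

Players without a qualifying following season simply drop out of the pairing, which is one reason a minimum BIP cutoff matters for these numbers.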

So they behave much like batted ball rate metrics in that they are subject to a lot of year-to-year variation: line drive and ground ball production are the hardest to maintain, while fly ball production remains relatively consistent. So what’s the culprit here? My guess is defensive positioning and shifts. For pull hitters and home run hitters, these numbers are probably pretty consistent once a shift is found to limit their effectiveness, while more balanced hitters are subject to more random variation.

There is a lot more analysis to employ here, however. In my piece tomorrow, I create a shift breakeven point and a metric that isolates players who should be shifted based on their batted ball run values relative to their spray tendencies. In the future, I expect to regress run values based on distance hit from the fielder, so that we can isolate players who have overperformed due to faulty fielding position and could expect regression once better alignment is in place.

### References and Resources

• Thanks to Jeff Zimmerman for the Gameday data and distance/angle code. Also to Major League Baseball Advanced Media for publicly providing the Gameday data.
• Studeman, Dave. “Pictures of Batted Balls.” The Hardball Times. Jan. 5, 2006.
• “WRAA For Position Player WAR Explained.” Baseball-Reference.com. June 2, 2014.

1. The Stranger said...

This is rather counterintuitive, since the ability to drive the ball to all fields is generally considered to be a good thing. I get that opposite-field balls aren’t as well-hit on average, and that you’re looking at spray tendencies as a proxy for quality of contact. But I’m not sure that’s a good proxy to use, since it seems like it would have an inherent bias towards pull hitters at the expense of guys who can use all fields effectively.

• tz said...

I think that’s one of the key issues Max is addressing in the last paragraph. Players who can use all fields are more shift-proof and would be less likely to suffer from an optimized defensive arrangement.

• Matthew Yaspan said...

I don’t think Max is attempting to evaluate hitters by whether they hit more balls to the pull-side or not, but to characterize what the average is to better understand the myriad of different values a batted ball can take on, rather than treating them all equally.

• Max Weinstein said...

The Stranger, you are absolutely right. Spray is beneficial to a player’s batted ball effectiveness and potential to do damage with gb’s/ld’s/fb’s, mostly because it’s harder to find an optimal defensive positioning. However, Weighted Batted Ball Runs contextualizes the batted ball performance overall, instead of stratifying by spray direction, although I have subset data on pull/opp/center Batted Ball Runs Above Average. But say a player usually hits 45% pull line drives, and he sees a decrease in the amount of pull line drives; we will see a decrease in his line drive production. So now it is more effective to generalize: “Hey, this player has seen a decrease in pulled line drives, at 30% pull versus 70% opposite, so he will likely see an increase in line drive production down the road.” At that point you know a player is losing a good portion of the almost *certain* hits on his line drives, and we would expect some sort of positive regression, more so than if we observed an increase in opposite hit line drives. In the end, the overall usefulness of these statistics is their power in a regression of wOBA or in determining when and who to shift.

2. tz said...

Max, I’m wondering if there’s any way to look at the average “UZR-against” for each of the bins. I’d love to know if there’s any significant correlation between the “UZR-against” and these weighted run values for any given player.

Presumably, since UZR doesn’t capture the “quality” of the balls hit into a particular zone, we might be able to adjust UZR on any given BIP to account for the hitter’s average weighted run value for that type of BIP (assuming, of course, that a meaningful correlation exists)

• Max Weinstein said...

I haven’t played with bins; I think Bill’s app does that nicely. I like what you’re saying, though, and I’d have to assume there probably is a pretty good correlation there. I might do something with RF, just to keep things simple, to start.

• tz said...

If you can look into that, I’ll look into your Dominican birth certificate to see if you’re really a teenager

Cause if you are, you just might be the Mike Trout of the saber-verse.

Excellent stuff, as always!

3. Andy said...

Always welcome another analysis from you, Max.

I’m glad you mentioned the difficulty in classifying LD vs. FB, because I have long wondered about that. It’s obviously a convenient classification good for generally describing what happened, but for serious analysis needs to give way to more precise definitions.

I was at first surprised to see the big difference in BABIP and RV for FB (and to a lesser extent, LD) pull vs. center and opposite. I understand that pulled balls are likely to be hit harder, with more authority, but against that I would have thought the OF would be positioned expecting the pulled ball more likely. That is either not the case or doesn’t contribute much.

I suppose in the case of FBs that’s because most of the run value is from HRs (and technically they are not BIP)? Depending on just how FB vs. LD classification is made, I would think FBs to any field that don’t leave the park would have about an equal chance of being caught? So that most of the greater value of pulled FBs is from being much more likely to be HRs?

But since there is also a substantial spray difference for LDs, it does raise the question: if shifts work for the infield, as the greater value of opposite field GBs suggest, could they/should they be used more in the outfield? That is, should the OF positioning take into account that a pulled LD will be hit harder, and thus give the OFer less time to make a play on it?

The other surprise for me (being somewhat new to this subject) is the enormous variation in RV of FBs for different players. Some players obviously make their living off them, while for others a FB is clearly the one thing they don’t want.

• Max Weinstein said...

Andy, thanks for reading. I was surprised about the FB’s too, but I made sure to exclude home runs, meaning the discrepancy has to do with Gameday classifications, lack of efficient outfield alignments, or both. This is something to explore, and I have work under way to do so. Yes, you are right about the FB variation: imagine the difference between a Stanton fly ball in the gap, which could be a line drive, and a Dee Gordon fly ball softly hit to the left fielder. Which leads me to think there could be more error in the classification of players who hit the ball in the air the most, and with the most authority.

• Andy said...

So you did exclude HRs. I wasn’t clear on this. Then it does make the difference between pull and the other two directions remarkable. I agree with your last sentence. I’m thinking now that whether a ball is caught or not may influence the classification, i.e., if two balls are hit with the same trajectory, and in the same direction, but one is soft enough to give the OFer a chance to run it down, whereas the other is not, the former may be classified as a FB while the latter is a LD.

4. Matthew Yaspan said...

Great work, Max, looking forward to more.

5. reillocity said...

I really enjoyed this as I’ve been using league-specific tables very similar to your “BATTED BALL TYPE AND SPRAY, RUN VALUES” table to evaluate pitcher performance at all levels of affiliated domestic baseball for going on 2 years now.

When you switch over to evaluating batter performance via that sort of approach, the average run values of outfield flies to the 3 zones vary widely (unsurprisingly) with the batter’s power. The run values of line drives also vary with the batter’s power though not as dramatically. The run values of groundballs do not vary with the batter’s power.

Here is a bit of 2013 MLB data for AL non-pitcher batters. I split the batter sample up into 4 power quartiles using an ISO on batted balls stat that is park-adjusted. PQ1 is the highest batter power quartile (the most powerful 25% of them), while PQ4 is the lowest one (the least powerful 25% of them).

Event- Average run value for Event by Batter Power Quartile
OF Fly Pull- PQ1=+0.60 runs, PQ2=+0.47 runs, PQ3=+0.38 runs, PQ4=+0.24 runs
OF Fly Center- PQ1=+0.08 runs, PQ2=-0.04 runs, PQ3=-0.06 runs, PQ4=-0.10 runs
OF Fly Oppo- PQ1=+0.05 runs, PQ2=-0.08 runs, PQ3=-0.08 runs, PQ4=-0.15 runs
LD Pull- PQ1=+0.42 runs, PQ2=+0.39 runs, PQ3=+0.37 runs, PQ4=+0.30 runs
LD Center- PQ1=+0.33 runs, PQ2=+0.29 runs, PQ3=+0.30 runs, PQ4=+0.25 runs
LD Oppo- PQ1=+0.34 runs, PQ2=+0.27 runs, PQ3=+0.24 runs, PQ4=+0.18 runs
GB Pull- PQ1=-0.11 runs, PQ2=-0.11 runs, PQ3=-0.11 runs, PQ4=-0.10 runs
GB Center- PQ1=-0.02 runs, PQ2=+0.00 runs, PQ3=+0.01 runs, PQ4=-0.02 runs
GB Oppo- PQ1=-0.05 runs, PQ2=-0.03 runs, PQ3=-0.05 runs, PQ4=-0.07 runs

I imagine that segregating the batters by speed might yield some advantage for the faster over the slower on groundballs, though I’d expect the run value increases to be on the small side given that most of the beneficial events would stand to be singles.

• tz said...

Out of curiosity, how did you split the hitters into power quartiles?

This is very interesting. The speed factor would be great to incorporate for ground balls, though the meaningful split might be 3B side vs. up the middle vs. 1B side (length of throw being more critical than power of ball being hit for grounders). It might also be interesting to see how the OF and even LD values split by speed given the run value of stretching a single to a double etc.

The speed factor would probably require something like average of top 10% times running from home to first, since indicators based on SB rates, etc. might be driven by skills not necessarily correlated to speed.

• reillocity said...

As a first attempt at applying this approach to hitters, I simply used same-season ISO on nonbunt, nonfoulout batted balls in road parks to define each hitter’s power (in truth I’d prefer to use multiple prior seasons data rather than current season data). I then computed the mean and standard deviation on that stat and used those parameters to split the pool into 4 quartiles.

I share similar thoughts on the impact of speed on these event run values.