Slot machines are pure luck: you put your coin in, you pull the lever, and you take your chances. Repeat as often as you like or until you get a free drink. The longer you play, the more likely you are to end up with about the average outcome (which for slots is a negative amount—the house always wins). This is a version of the law of large numbers.

Now that the season is more than a quarter over, lots of batters have been playing their version of a slot machine for a while. Every time a batter puts a ball in play, he pulls a lever on the fielding slot machine. Sometimes he gets lucky and it is a hit, and sometimes he gets unlucky and it is an out. The well-known statistic Batting Average on Balls in Play (BABIP) tracks the share of balls in play that go for hits.

Equally well known is that players’ skills have little impact on their BABIP; once the batter puts the ball in play (home runs don’t count), whether or not the ball goes for a hit has little to do with the name on the back of the batter’s jersey.

In this article, I’m going to do three things. First, I’ll equate the luck on balls in play to a version of a coin flip. Second, I’ll simulate some of these coin flips and show that the results look a lot like the outcomes batters have had thus far in the season. Third, we will see that players can still be pretty lucky after only a quarter of a season. What are the practical implications? After a quarter of a season, you should still be skeptical (though not necessarily incredulous) of players’ performances.

As we’ll see, the slot machine that a player plays when he puts the ball in play doesn’t have to be complex. In fact, let’s just suppose that this machine is a simple weighted coin flip. Instead of a 50-50 chance of heads and tails, let’s suppose the coin is 30-70. So 30 percent of the time the coin comes up heads and the player gets a hit, 70 percent of the time it goes for an out.

A player’s total number of hits and his BABIP after, say, 200 balls in play are random (just like the total number of times heads comes up after 200 coin flips is random). In fact, the distribution of hits and BABIPs (a distribution is, roughly, how often we can expect to observe each outcome, say, 78 hits on 200 balls in play) is given by the binomial distribution. It is pretty easy to use a computer to simulate outcomes from a binomial distribution and compare them to the data we have so far from the season.
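As a minimal sketch of the weighted coin described above (not the code used for this article), the Python below flips a 30-70 coin once per ball in play and counts the heads. The function name `simulate_hits`, the seed, and the choice of the standard `random` module are my own assumptions for illustration.

```python
import random

def simulate_hits(balls_in_play, hit_prob=0.30, seed=None):
    """Count hits from `balls_in_play` independent weighted coin flips.

    Each flip comes up "heads" (a hit) with probability `hit_prob`,
    so the returned total is one draw from a binomial distribution.
    """
    rng = random.Random(seed)
    return sum(rng.random() < hit_prob for _ in range(balls_in_play))

# One hypothetical batter with 200 balls in play and a 30-70 coin.
hits = simulate_hits(200, hit_prob=0.30, seed=1)
babip = hits / 200
print(f"{hits} hits on 200 balls in play, BABIP = {babip:.3f}")
```

Running it with different seeds gives different hit totals for the same coin, which is the whole point: the spread across runs is luck, not skill.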

What I’ve done: I’ve taken each batter with at least 100 at-bats (243 batters). I’ve computed the number of hits in play (hits – home runs) for these batters and their BABIP. For each batter, I’ve then calculated what his BABIP could look like if each ball in play were simulated, with the outcome determined by a binomial random variable with the same average success rate (30.3 percent).
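The procedure can be sketched roughly as follows. This is not the original analysis: the real balls-in-play counts for the 243 batters are not reproduced here, so the counts below are made-up placeholders, and `simulate_league` is a hypothetical helper name.

```python
import random

def simulate_league(bip_counts, hit_prob=0.303, seed=None):
    """Simulate one binomial outcome per batter.

    `bip_counts` holds each batter's number of balls in play. Returns a
    list of (balls_in_play, hits, babip) tuples, one per batter.
    """
    rng = random.Random(seed)
    results = []
    for n in bip_counts:
        hits = sum(rng.random() < hit_prob for _ in range(n))
        results.append((n, hits, hits / n))
    return results

# ASSUMPTION: placeholder balls-in-play counts, not the real 2009 data.
count_rng = random.Random(0)
bip_counts = [count_rng.randint(80, 175) for _ in range(243)]

league = simulate_league(bip_counts, hit_prob=0.303, seed=42)
share_low = sum(1 for _, _, b in league if b < 0.250) / len(league)
print(f"Simulated batters with BABIP below .250: {share_low:.1%}")
```

With real data, `bip_counts` would be replaced by each batter's actual balls in play, and the simulated hits and BABIPs compared against the observed ones.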

The graph below shows the number of hits we get from the data (blue) and from the simulation (red). Not bad (if you’re really curious, the two distributions are considered statistically identical according to a Kolmogorov-Smirnov test). We can smooth things out and compute a distribution for each; that’s the next figure. The third graph is the same smoothed distribution, only this time for actual and simulated BABIPs. On this one, the match is even better.
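For readers curious about the test mentioned above, here is a rough sketch of the two-sample statistic it is built on: the largest gap between the two empirical cumulative distributions. It is an illustration only, with a hypothetical function name and toy inputs; it computes the statistic but not the p-value, and it is not the code behind the result quoted above.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so check each one.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

# Toy hit totals, purely illustrative.
actual_hits = [31, 35, 38, 40, 44]
simulated_hits = [30, 36, 37, 41, 45]
print(ks_statistic(actual_hits, simulated_hits))
```

A statistic near zero means the two samples trace nearly the same distribution; a value near one means they barely overlap.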

What can we see from these graphs? The average number of balls in play for each batter is fairly high: 127. As far as statistics is concerned, 127 is a lot of coin flips. You might have read or heard other fantasy commentators say something like “Now that we’re in June, we don’t have to worry as much about small sample sizes.” While that is still literally true, the third graph shows that there is still a lot of variation left in the data. In fact, if you look at the CDF (cumulative distribution function), you can see that as of June 1, fifteen percent of players still have a BABIP below .250 even though their expected BABIP is .303. That is, even though the coin they are flipping should come up heads 30.3 percent of the time, they’ve been unlucky enough to get heads less than 25 percent of the time.
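The same tail probability can be computed exactly for a single hypothetical batter, rather than read off a graph. The sketch below assumes a batter with exactly 127 balls in play and a .303 coin; the function name `binom_cdf` is my own. Because the real sample mixes batters with different numbers of balls in play, its answer need not match the fifteen percent seen in the data.

```python
from math import ceil, comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 127, 0.303
# A BABIP strictly below .250 means hits < 0.250 * n,
# i.e. at most ceil(0.250 * n) - 1 hits.
max_hits = ceil(0.250 * n) - 1
prob_low = binom_cdf(max_hits, n, p)
print(f"P(BABIP < .250 after {n} balls in play) = {prob_low:.3f}")
```

Lowering `n` in this sketch fattens the tail, which is one reason batters with fewer balls in play are more likely to show an extreme BABIP.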

My final graph shows what happens if we simulate 500 balls in play per batter, or roughly four times the current number of balls in play. The blue line is the same simulation from before, with on average 127 balls in play per batter. The green line simulates 243 batters with 500 balls in play each, using the binomial distribution. As we can see, the more balls in play we have, the more likely we are to get an outcome near the median and the less likely we are to get extreme outcomes.
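The narrowing follows directly from the binomial model: the standard deviation of BABIP after n balls in play is sqrt(p(1 − p)/n). A small sketch, with a helper name of my own choosing, makes the comparison concrete.

```python
from math import sqrt

def babip_std(balls_in_play, hit_prob=0.303):
    """Standard deviation of BABIP under the binomial coin-flip model."""
    return sqrt(hit_prob * (1 - hit_prob) / balls_in_play)

for n in (127, 500):
    print(f"{n} balls in play: BABIP standard deviation = {babip_std(n):.3f}")

# Quadrupling the balls in play roughly halves the spread.
shrink = babip_std(127) / babip_std(500)
```

Under these assumptions the spread falls from roughly 41 points of BABIP to roughly 21, so a .250 BABIP that is about one standard deviation of bad luck at 127 balls in play is well over two at 500.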

In other words, in June, after about 125 balls in play, a batter can still be lucky and have a high BABIP. By September, it should be far harder to have had a whole season of luck. So in June you must still be aware of the small sample.

Mark said...

What are your thoughts (or data) on

A. Hard Hit Balls

B. Ichiro’s (and others’) ability to place the ball

Ichiro seems to hit the ball like a softball player. He tries to “hit it where they aren’t.”

J.R. said...

Maybe Ichiro’s just been really, really lucky for a really, really long time. Or maybe the statement that “Equally well known is that players’ skills have little impact on their BABIP” is simply not true. I think I’ll go with the latter.

Toffer Peak said...

“Equally well known is that players’ skills have little impact on their BABIP; once the batter puts the ball in play (home runs don’t count), whether or not the ball goes for a hit has little to do with the name on the back of the batter’s jersey.”

That is well known as true for pitchers. However that is well known as false for batters.

John Burnson said...

Mike: A batter’s BABIP tends to regress toward his “true” BABIP (which is probably higher or lower than .300). A batter’s recent historic BABIP sheds light on his true BABIP, but because of noise, we can’t pin down the rate perfectly. Certainly, for a batter with Ichiro’s career AB, we have a better idea of his “true” BABIP than we do for less prolific batters.

Like you, I would not expect Ichiro’s BABIP to regress to .300, and no sound projection system would, either.

Mike said...

Thanks for the response, John.

What would be a reasonable AB cutoff for a player to approximate his true BABIP? 1,000 ABs? 2,000 ABs?

John Burnson said...

Mike- Well, you can always obtain a “best guess”; what more AB gives is less dilution. With Marcels, for example, you add 1200 PA of league-average stats to the player’s weighted 3-year history. Even for someone with Ichiro’s massive playing time (2,200 raw PA from 2006-08; 8,800 weighted PA), league-average stats represent about 12% of his expected numbers. For someone with 200 PA/year, the proportion is 33%.

If you wanted no more than 20% dilution, then you’d need 400 PA/year (more or less). Etc.

Mike said...

I thought it was the case that all batters tend to regress to their own historic BABIPs as the season goes along.

For example, we wouldn’t expect Ichiro’s BABIP to regress from .366 to the league average (.300), because it is in line with his career BABIP of .357 (over 5600 career ABs).

Is that right?

Jeff said...

If you interpret this exercise as a way of showing that a batter’s BABIP does not tend to stabilize at its expected value very quickly, then I guess it has merit (assuming a binomial distribution is most appropriate, which is doubtful). A batter’s expected BABIP is likely best modeled using the player’s historical BABIP and deviations from his historical LD%, GB/FB, etc . . .