Was Neifi Perez a better hitter than Barry Bonds? It seems an unlikely proposition, but is it impossible? That has been the topic of discussion in an interesting (and very long) recent thread at Baseball Think Factory. In the thread, Tom Tango, a respected statistical researcher and a consultant to the Seattle Mariners, argues that “[Barry] Bonds 100 percent performed better than Neifi [Perez]. We are 99.9999 percent sure that he was in fact better than Neifi.” In other words, Tango is arguing that even knowing the paths of Bonds’ and Perez’s careers, we cannot say with total certainty that Bonds’ “true talent” was greater than Perez’s.

Most of the posters on the thread disagree. This post is fairly representative: “I don’t think you can go around talking about a player’s ‘true talent’ as if it really exists…Consider how absurd this sounds when applied to other areas of human endeavor. Would you say that there is only 99.99(…)% chance that Mozart had more ‘true musical talent’ than, say, William Hung?” So where does the truth lie? To answer that, we must first understand what we mean by “true talent.”

True talent is an ephemeral term, but it can perhaps be best defined as our probabilistic expectations of a player’s output at a given point in time, **given** that we know everything there is to know about that player, which we never do. In other words, true talent is only something we can estimate. With a lot of data we can do it fairly precisely: after all, if a player has hit a home run in five percent of his plate appearances each of the past few years, he will likely continue to hit a home run in roughly five percent of his plate appearances if nothing changes. But without knowing **everything** (how the player’s muscles are feeling, what he had for breakfast, whether he has some song stuck in his head), we cannot **precisely** guess at his true talent.

Moreover, based on all those factors—and of course many others—a player’s true talent level changes from moment-to-moment. Ichiro may have a 30 percent chance of getting a hit in one at-bat, but if his jock strap starts to itch, perhaps that goes down to 29 percent the next. On the other hand, if someone in the dugout makes a funny joke that puts Ichiro in a good mood, his true talent could go up to 31 percent so long as that good mood lasts.

Because we can never know all these minutiae, we can only attempt to estimate a player’s true talent. Now with Ichiro, that happens to be a relatively easy task: As of this writing, Ichiro has had 6,358 plate appearances in the major leagues and gotten a hit in 1,953, or 30.7 percent. Suppose Ichiro’s average true talent over his career is equal to his true talent today. That assumption cannot safely be made for most players, but I am fairly comfortable making it for Ichiro, at least in the interest of making our lives a bit simpler for the time being. It would mean that our best estimate of Ichiro’s odds of getting a hit the next time he comes to the plate would be 30.7 percent.

Actually, that is not exactly true, and this is where the second point of contention and confusion comes into play. The issue is that even if we are sure that Ichiro’s true talent today is equal to his average true talent throughout his major league career, even 6,358 plate appearances is still just a **sample** of Ichiro’s ability. True, it is a very large sample, but a sample nonetheless.

Let’s say, for example, that I asked you to predict the odds that a batter will get a hit in his next plate appearance, but the only information you had was about his last plate appearance, in which he got a hit. What odds would you estimate? Well, knowing nothing about the guy other than that he is a major league hitter and that he got a hit his last time at-bat, you would be best-off hedging and guessing that he was around average. The average player gets a hit in 23.5 percent of his plate appearances, so your best bet would be to guess that his odds are 23.5 percent. Actually, though one plate appearance is very little information, it is some, so the best guess you could give me would be closer to 23.7 percent.

Okay, now what if I told you that the player had gathered 40 hits in his past 100 plate appearances—what would you guess his odds of getting a hit were now? You certainly wouldn’t say 23.5 or 23.7 percent—after all, a guy who can gather 40 hits in 100 times at-bat is probably a pretty good hitter. Sure, he could be a poor hitter who got lucky—every year, some hapless batter goes on a huge hot streak before returning back to earth—or he could be an average hitter who’s had a little luck, but more likely than not, he handles a bat better than most major leaguers. I don’t know exactly what you would guess, but statistically you’d be best-off guessing something like 27.2 percent. Sounds reasonable, right?

So how is 6,358 different from one plate appearance or 100 plate appearances? Well, it’s obviously a much bigger number, but still it is just a number, and any number, no matter how big, is just a sample. Now, if all I told you about a given hitter was that he had gathered 1,953 hits in his past 6,358 plate appearances, what would be your best guess of our expectations for his next time at-bat? Well, if you guessed 30.7 percent, you certainly wouldn’t be far off, but you would be incorrect. The right answer is more like 30.3 percent.

How can that be? The important thing to remember is that statistics are just a sampling of an athlete’s true ability; actually, they’re less than that since that true ability constantly varies. But even if we forget about that variation, no number of plate appearances will tell us exactly how good that player is. At a trillion plate appearances, we might have to go out many, many decimal places before the player’s sample numbers and our best estimate of his true talent diverge, but eventually they would.
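To put rough numbers on how slowly that gap closes, here is a minimal sketch. It assumes the shortcut weighting PA/(PA + 340) that the author gives in the comments below, the 23.5 percent league average from above, and a hitter whose observed rate stays fixed at 30.7 percent no matter how large the sample gets. The function name is made up for illustration.

```python
LEAGUE_AVG = 0.235   # league-average hits per plate appearance (from the article)
CONSTANT = 340       # regression constant for Hits/PA (from the author's comment)


def gap_to_estimate(observed_rate, pa):
    """Distance between the observed rate and the regressed estimate.

    The estimate is w * observed + (1 - w) * league, with w = pa / (pa + CONSTANT),
    so observed - estimate = (observed - league) * CONSTANT / (pa + CONSTANT).
    """
    return (observed_rate - LEAGUE_AVG) * CONSTANT / (pa + CONSTANT)


# Assumption: the observed rate is held at .307 for every sample size.
for pa in (1, 100, 6_358, 1_000_000, 10**12):
    print(f"{pa:>17,} PA: gap = {gap_to_estimate(0.307, pa):.2e}")
```

Under these assumptions the gap keeps shrinking as the sample grows, but it never reaches exactly zero.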

The reason for this is simple. Again, remember that all statistics know is what they show. If all we know about a player is that he (1) plays in the major leagues, and (2) got a hit in his last plate appearance, we have very little to distinguish him from every other major leaguer. The odds are roughly equal of his being above average or below, though it is ever-so-slightly more likely that he is above. That is why our best estimate of his true talent is 23.7 percent, versus an average of 23.5—that is essentially the weighted average of all his potential talent levels. What are the odds he’s Matt Stairs? What are the odds he’s a high school scrub? What are the odds he’s Ted Williams? Mario Mendoza? Take those odds, multiply them by the corresponding talent level, and add them all up, and you’ll get 23.7.
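That weighted average can be sketched in a few lines of Python. This is only an illustration: it assumes the spread of major league talent follows a Beta-shaped curve centered on 23.5 percent, with a prior weight of 340 plate appearances borrowed from the regression constant the author gives in the comments below. The article itself does not say what distribution was used, and `best_guess` is a made-up name.

```python
import math

LEAGUE_AVG = 0.235    # league-average hits per plate appearance (from the article)
PRIOR_WEIGHT = 340    # assumption: reuse the regression constant from the comments


def best_guess(hits, pa, steps=10_000):
    """Weighted average of candidate talent levels, given hits in pa.

    Each candidate talent p gets a weight equal to
    (assumed prior odds of p) x (odds of seeing this record if p were true).
    """
    a = LEAGUE_AVG * PRIOR_WEIGHT          # assumed Beta prior, "hits" side
    b = (1 - LEAGUE_AVG) * PRIOR_WEIGHT    # assumed Beta prior, "outs" side
    talents = [i / steps for i in range(1, steps)]
    # Work in logs so large samples do not underflow.
    log_w = [
        (a - 1 + hits) * math.log(p) + (b - 1 + pa - hits) * math.log(1 - p)
        for p in talents
    ]
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    return sum(p * w for p, w in zip(talents, weights)) / sum(weights)


for hits, pa in [(1, 1), (40, 100), (1953, 6358)]:
    print(f"{hits} for {pa}: {best_guess(hits, pa):.4f}")
```

Under those assumptions, the three results land close to the 23.7, 27.2 and 30.3 percent figures quoted in the article.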

If all we know about the hitter is that he (1) plays in the major leagues, and (2) has gathered 40 hits in his last 100 plate appearances, then we still can’t be too sure about his talent level. Sure, most players who go on such tears are well above-average, but not all by any stretch. The odds that the player in question is Ted Williams (or a player of equivalent talent) are definitely higher than the odds that he is Mario Mendoza, but we can’t know for sure. Even Mendoza had some hot stretches of hitting in his career. So again, if we weight the odds of that player being at each given true talent level, we come to the conclusion that our best guess at his true odds of getting a hit is 27.2 percent.

So now we come to the player who has 1,953 hits in 6,358 plate appearances. It is true that even if all we know is this one piece of information (and of course that he is a hitter in the major leagues), we already know a lot about him. But we do not know all. We know that his true talent is almost certainly somewhere near 30.7 percent, because the sample is too large for it not to be. His talent might be 30.6 percent or 30.8 percent, but it’s somewhere around there. However, there does exist a small possibility that even such a large sample has not given us the proper impression of the player’s talent. Now, this happens to be a very small possibility, but it could be that this is the most talented hitter of all time and has been unlucky, or it could be that he is of merely average talent and has simply gotten exceptionally lucky. The odds may be 1-in-a-million, but they are not zero.

Statistically, the next question to ask is, which is more likely? Well, we know that there are many average hitters out there (thousands in the history of MLB), but by definition, only one man can be the greatest hitter of all-time (or, for that matter, the worst). A simpler way of putting this is that a major league player is much more likely to be about average than he is to be at an extreme, whether that extreme is greatness or mediocrity. (Note that technically this is not quite true, since there are obviously many more mediocre players than there are great hitters or even average ones. However, if you weight these things by playing time, playing time is distributed fairly normally, with about average players getting the most in aggregate while extreme players get much less: extremely bad players because major league teams try to avoid playing them, and extremely good ones because there are so few.)

Since our player has demonstrated well above-average performance, this means that he is slightly more likely to be a worse hitter than his performance has thus far shown him to be than he is to be a better one, simply because there are more such hitters. If he had shown a below-average track record, conversely, our best guess would be that he was slightly better than he had played, even over 6,358 plate appearances.

Now, what this does not mean is that Ichiro’s 1,953 major league hits somehow do not all count, just because our best guess, knowing only his numbers, is that if Ichiro replayed all the games of his career, he’d end up with 1,930 hits. The 1,953 hits are real, and what we estimate his true talent to be does not affect that. Moreover, our estimate of his true talent has thus far been confined to only two facts: (1) He is a major league baseball player, and (2) In his career, Ichiro has gathered 1,953 hits in 6,358 plate appearances. If we knew how fast Ichiro was, how good his bat control was, and any other pertinent facts, we could get a better estimate of his true talent. If the average player with Ichiro’s speed and bat control gets a hit in 30.7 percent of his plate appearances, our best guess about Ichiro’s talent would also be 30.7. Statistics know only what you tell them.

Now, statistically we still have to quantify Ichiro’s pertinent traits, and we still have to calculate what hit probability those traits correspond to. To say that you’ve observed Ichiro being a good hitter so you know his true talent is 30.7 percent is not enough—the logic is circular and the thinking minimally rigorous. But I do want to make this point clear: The more information we have, the better and more exactly we can estimate how good a player is. Even that one player with one hit in one at-bat—if the scouts tell us he’s a superstar, his odds of getting a hit in the next at-bat are much better than 23.7 percent.

But in the end, Tango’s point is correct. We are 99.9 percent sure that Barry Bonds was a more talented hitter than Neifi Perez. We could, in fact, carry the nines out many more decimal places than that and the statement would still be accurate. But, if all we know is that Barry Bonds hit 762 home runs in his career whereas Perez hit 64, we cannot state with 100 percent certainty that Bonds was the more talented hitter. We know more than that: we know that Bonds had great pitch recognition while Perez did not; we know that Bonds was a big, strong guy while Perez was not; we know that Bonds could generate incredible power while Perez could not. But the numbers know only themselves. So unless we can quantify all of these intangibles, and moreover show that a Perez-type player could **never** (i.e., not 1-in-a-million, not even 1-in-a-trillion, but the odds have to be exactly zero) be better than a Bonds-type, we cannot claim with 100 percent certainty that Bonds is the more talented hitter.

Now, because we can be 99.9 percent sure, this specific argument is really just pedantic, but it does shed useful light on the science of estimating a player’s talent, which is what every fan, general manager, and fantasy baseball player is always working hard to do.

Mike said...

David, great piece, very well worded and explained. My only question is: how come you chose plate appearances instead of at-bats for these percentages?

David Gassko said...

Hey Mike,

Glad you enjoyed it. Plate appearances are a bit easier in that they let us discuss all possible outcomes, instead of ignoring the probability of getting a walk or hit-by-pitch or sacrifice. It doesn’t really matter in terms of this explanation.

Moe said...

David,

Very interesting article, but I believe you’re right and wrong at the same time:

The way you approach the problem seems to be with Bayesian statistics. You draw a player from the distribution of all major league players and start with the ‘belief’ that he is average (which of course is totally reasonable). Then you use the information you gather over time to update your belief about the player.

However, if your initial belief were different (for example, that Ichiro is not like anything that has ever played in the majors and hence you have no initial prior, or that he is the second coming of Ted Williams), you would get a different conclusion. The importance of the initial belief decreases over time, as you nicely describe above, but it never goes away.

Hence, the answer to the question of how likely it is that Ichiro will get a hit depends on what your prior belief is. Saying the probability of a hit is 30.7% would be the correct answer if I had no prior, or if the only information I was given was his past success and nothing about major league players in general. And if I believe that Ichiro is not actually human but a robot designed to get a hit every time (did you see his walk-off hit against Downs on Tuesday?), the correct answer would be above 30.7% and not below, because of the prior.

David Gassko said...

Moe, that is correct. I considered going into the fact that what I am talking about here is Bayes but decided that would be too much for an already long article.

OrDidIJustBlowYourMind said...

Yep. It’s both fairly mind-blowing and obvious at the same time. Although I think you could factor in traits like pitch recognition and strength as qualitative factors if you weren’t sure how to quantify them. You’d have to watch out for multicollinearity, though.

Top it off with the idea that all knowledge is only as good as our perception of it – that is, we don’t really know if what we see and think is real – and you’ll be kind of confused for a few hours.

Daniel said...

Thanks David; this essentially lays out my own understanding of true talent (arrived at with the help of Colin’s article on the related regression to the mean), but it puts it in an easily understandable way, which is enormously helpful (both for assuring myself of the understanding and for possibly explaining it to interested others).

I did have one question: I assume that there is a “formula”/“equation” you are using to determine exactly what our best guess of “true talent” is based on sample size and successful trials within that sample (your three examples, all given a population mean success rate of .235: 1 for 1 = .237, 40 for 100 = .272, Ichiro’s 1953 for 6358 = .303). What is the process for figuring out these “best guesses”? I assume it is easily adaptable to any average success rate I want to use?

My statistics knowledge is relatively weak (I was never interested until recent baseball geekdom) and I could just be forgetting a relatively standard type of question/problem from, say, Stats 101, but if you could lay it out, it would be much appreciated.

David Gassko said...

Hey Daniel,

The formula I used to regress is, n/(n + x) where n is the sample size and x is a constant that fits the data. For Hits/PA, x is around 340, so the formula is PA/(PA + 340).

For the guy that is 1/1, that means we weight his Hits/PA, 1/(1 + 340) = 0.3% and the league average 99.7%. For the guy that is 40/100, his stats are weighted 100/(100 + 340)= 22.7%, and the league average is weighted 77.3%. Since his Hits/PA is .400 and the league average is .235, our best guess as to his true talent is, .227*.400 + .773*.235 = .272.
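As a minimal sketch, the weighting described above could be written in Python like this. The function and parameter names are illustrative, and the zero-trials fallback is an added assumption rather than part of the formula as given.

```python
def regress_to_mean(successes, trials, league_rate, constant):
    """Blend an observed rate with the league rate, weighting by sample size.

    weight = trials / (trials + constant) goes on the player's own rate,
    and the remainder goes on the league average.
    """
    if trials == 0:
        # Assumption: with no data at all, fall back to the league average.
        return league_rate
    weight = trials / (trials + constant)
    observed = successes / trials
    return weight * observed + (1 - weight) * league_rate


# The three examples from the article: Hits/PA, league average .235, constant 340.
for hits, pa in [(1, 1), (40, 100), (1953, 6358)]:
    estimate = regress_to_mean(hits, pa, league_rate=0.235, constant=340)
    print(f"{hits}/{pa}: {estimate:.4f}")
```

The same function works for any other rate statistic once a league average and a constant for that statistic are supplied.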

Technically, this method is a shortcut to the correct “Bayesian” way of doing it, but there is practically no difference between the two and this method is much simpler.
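One way to see why the shortcut tracks a Bayesian answer so closely: if the prior over talent is assumed to be a Beta distribution with mean .235 and a total weight of 340 (an assumption made here for illustration, not something stated above), the Bayesian posterior mean reduces algebraically to the same weighted average. The sketch below compares the two; both function names are made up.

```python
def shortcut(hits, pa, league=0.235, constant=340):
    """Regression shortcut: weight pa / (pa + constant) on the observed rate."""
    w = pa / (pa + constant)
    return w * (hits / pa) + (1 - w) * league


def beta_posterior_mean(hits, pa, league=0.235, constant=340):
    """Posterior mean under an ASSUMED Beta prior with mean `league`.

    The prior acts like `constant` extra plate appearances at the league rate.
    """
    alpha = league * constant           # prior "hits"
    beta = (1 - league) * constant      # prior "outs"
    return (alpha + hits) / (alpha + beta + pa)


for hits, pa in [(1, 1), (40, 100), (1953, 6358)]:
    diff = abs(shortcut(hits, pa) - beta_posterior_mean(hits, pa))
    print(f"{hits}/{pa}: difference = {diff:.1e}")
```

With a Beta prior the two agree up to floating-point error. Any real gap between the shortcut and a full Bayesian calculation would come from a prior that is not exactly Beta-shaped.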

Daniel said...

Great, thanks very much. I’ll look into Bayesian statistics soon enough to see what you’re talking about with the “correct” method.

Is there a database/list somewhere of appropriate constants for different statistics (AVG, SLG, OBP, etc etc)? Relatedly, has anyone come up with semi-reliable means to use for different populations? (for instance, and keeping with Hits/PA: the population of hitting prospects who are projected/scouted across the board to hit for ++ average would certainly require a different mean than, say, guys unanimously thought of as all-glove-no-bat, or at least I would think so anyways….)

Mike said...

Colin,

You provided a link from Pizza Cutter about intra-class correlation and about how many PA are necessary for R to stabilize at .7 for each hitting stat. Have you – or Pizza Cutter – ever looked into pitching stats as well? If I’ve run across them before, I cannot find them now.

Colin Wyers said...

Daniel –

What you take is the correlation and number of PAs, and do:

R = PA/(PA+C)

Where R is the correlation, PA is the PAs and C is the constant.

A little algebra gives us:

C = PA*(1-R) / R
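Solving R = PA/(PA + C) for C gives C = PA × (1 − R)/R, which is positive whenever R is between 0 and 1. A short sketch of that algebra follows; the function names are hypothetical, and the 340 in the example is the Hits/PA constant quoted earlier in the thread, used only to check the round trip.

```python
def correlation_from_constant(pa, c):
    """R = PA / (PA + C)."""
    return pa / (pa + c)


def constant_from_correlation(pa, r):
    """Solve R = PA / (PA + C) for C, giving C = PA * (1 - R) / R."""
    if not 0 < r <= 1:
        raise ValueError("r must be in (0, 1]")
    return pa * (1 - r) / r


def pa_for_correlation(r, c):
    """Solve the same formula for PA, giving PA = C * R / (1 - R)."""
    if not 0 <= r < 1:
        raise ValueError("r must be in [0, 1)")
    return c * r / (1 - r)


r_at_300 = correlation_from_constant(300, 340)    # implied R at 300 PA
print(r_at_300)
print(constant_from_correlation(300, r_at_300))   # recovers the constant
print(pa_for_correlation(0.7, 340))               # PA at which R would reach .7
```

These numbers are implied by the formula alone. They are not measured correlations, which would have to come from the data Colin refers to.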

I’d use the correlations at 300 PA from here, myself.

Mitchel Lichtman said...

“Saying the probability of a hit is 30.7% would be the correct answer if I had no prior, or if the only information I was given was his past success and nothing about major league players in general.”

I used to think that as well, but I discovered that that is not true. Even with no prior, there is still a regression when dealing with a binomial probability. Read the appendix by Andy Dolphin (an extremely smart pure statistician) in The Book.

David, nice articulation of exactly what I said on The Book blog. Of course, had you printed this on BTF or some such site like that, you would be lambasted for being wrong (which you are not of course) and you would be lambasted for arguments that you are not making, like, “Yeah, but it makes no difference so stop being an idiot…” And then you would get the obligatory metaphysical (and completely irrelevant to your thesis) arguments from people who think they are a lot smarter than they actually are.

The signal/noise ratio in the comments section of THT is quite high…

Colin Wyers said...

Mike – Yes.

Daniel said...

Great, thanks for the link, Colin. Some really useful-looking stuff in there. Actually, I’m getting great links all over the place over these past couple of days of discussion of “true talent” and “RttM”.

Matthew Cornwell said...

Colin – how many PA does it take for R to stabilize at .7 with regard to HR/FB? 6,000? 8,000? 10,000? Has anybody done this?