Man vs. computer
by David Gassko
October 06, 2009
A few days before the start of the 2009 season, I wrote a column here titled, “29 players I think the THT projections got wrong.” The title is pretty self-explanatory, but let me quote the introduction to that column so that you know where I was going with it:
Each of the past three years, we’ve released projections for thousands of players, and each year, I have received tons of e-mails relating to specific players readers think we have over- or underrated. Frankly, I’m with the readers—our system is very good, but it is not perfect. Sometimes, I think I know more than it does, and today, I’ve decided to test that thought.
What follows, then, is a list of 15 hitters and 14 pitchers who I think will either over- or underperform their projections, with my reasoning explained. I formed this list without looking at other projection systems, since the idea here is to figure out if human intuition can beat a computer-based system, rather than trying to find areas where some other projection system outperforms THT. At the end of the season, I will check in to see if my hunches were correct, or if the computer knows best.
To be clear, I selected only hitters projected to have at least 500 major league plate appearances and pitchers projected to have at least 100 major league innings pitched; I wanted to avoid, as best as possible, players who won’t play much in the major leagues in 2009.
The comments I got on that column were mostly skeptical; Mitchel Lichtman, a former senior advisor to the Cardinals, for example, put it bluntly: “I am always skeptical of these, ‘I can beat a good forecast system just by looking at the forecasts,’” he wrote. Fair enough.
A commenter on Baseball Think Factory was of the same opinion: “I expect the outcome will be that DSG (those are my initials) can’t beat the computer.” Frankly, I felt the same way. Still, intuitively, the 29 projections I challenged looked wrong to me, and I figured it was worth it to find out if my gut actually could see something that a computer cannot.
Now that the season is over, we can answer that question, so let’s get to the results.
First up are the hitters. Let’s start with those I thought would do better than their projections. Those were Justin Upton, Alex Gordon, Delmon Young, Robinson Cano, Ichiro Suzuki, Evan Longoria and B.J. Upton. Right away, it’s easy to see that some of these hitters indeed beat expectations while others actually went the opposite way.
Overall, though, we projected this group to have a .779 OPS (weighted by the number of plate appearances they had this season). In actuality, they posted an .819 OPS, which amounts to a 40-point difference! (Actually, 41 after rounding.) So far, so good: I thought these hitters would beat their projections, and in sum, they sure did.
So what about the hitters I thought would do worse than we projected? That list consisted of Chipper Jones, David Ortiz, Miguel Cabrera, Mike Napoli, Carlos Delgado, Jack Cust, Ryan Howard and Chris Davis. Again, we have a fun mix of guys, and overall the THT projections had them posting a combined OPS of .934 this season. Instead, they’ve posted an .843 OPS, or 91 points below expectation. That’s another big win for me—I’m two-for-two!
Let’s move on to the pitchers. I thought that Dan Haren, Clay Buchholz, Rich Harden, Mark Buehrle, Edinson Volquez, Zack Greinke and Francisco Liriano all would beat their projections. Perhaps Greinke’s name tips you off as to how I did with this group: overall, our projections had them posting a 4.33 ERA, but they blew that out of the water, combining for a 3.63 ERA instead. That’s a huge difference, and I have to say, my predictions are looking good thus far.
We still have one more group to look at, though, and that’s the pitchers I thought our projections overrated. They were Derek Lowe, Fausto Carmona, Jeremy Bonderman, CC Sabathia, Justin Duchscherer, Dana Eveland and Joe Blanton. Our projections thought these guys would be good for a 3.69 ERA this season; instead, they came in at 4.58, a whole 89 points worse than expected! Clearly, I’m a genius.
Or am I? After I wrote my column, some suggested that my predictions were indeed going to be right, but that rather than being a feature of my extraordinary brilliance, it was merely an indication that the THT projection system was not very good. That’s a double whammy for me: not only does it call into question my intelligence, but I also designed the guts of the THT projection system. I think it’s fair to ask whether there is some bug in the design of our projections that allowed me to beat them.
To answer that question I looked at what another projection system said about the four groups of players we just examined. Essentially, since I did not consult any other projection systems when making my predictions, the other system can be used as an independent control: If my predictions turned out to be right simply because I was taking advantage of some defect in the THT system, another system would have the players correctly projected. If, on the other hand, my gut was able to see something a computer could not, any computer-based system would have been off for these players.
I turned to CHONE, which has been shown to be one of the best projection systems over the past few years. CHONE is also a completely computer-based system, making it ideal for this test. Because I hand-picked the THT projections I disagreed with most, CHONE should do a better job projecting these players than THT did no matter what; had I instead chosen the 29 CHONE projections I hated most at the beginning of the season, the THT projections would have been the ones closer to the truth.
Let’s start with the hitters I thought would beat their projections. CHONE predicted they would post a .796 OPS, a number they bested by 23 points, OPS’ing .819. As for the hitters I pegged to underperform their projections, CHONE thought they would combine for an .898 OPS as a group; they actually finished with an .843 OPS, which is 55 points worse.
The pitchers I thought would outperform expectations got a 3.97 ERA projection from CHONE; they managed to beat that by 0.34 runs, with a 3.63 ERA. CHONE gave an overall projection of 3.75 to the pitchers I thought would underperform; instead, they had a collective 4.58 ERA, a whole 83 points worse.
To recap:
Hitters    THT     CHONE   Actual
Better     0.779   0.796   0.819
Worse      0.934   0.898   0.843

Pitchers   THT     CHONE   Actual
Better     4.33    3.97    3.63
Worse      3.69    3.75    4.58
Overall, the CHONE results confirm that my predictions were spot-on! Though the CHONE projections were closer to the ultimate truth than THT’s, they still were too low for the players I thought would beat their projections and too high for those I saw faltering. In other words, I do appear to be some sort of genius.
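For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the misses from the recap figures. The group numbers come straight from the recap; the names `RECAP`, `pick_was_right` and `summarize` are made up for this illustration and are not part of either projection system.

```python
# Group-level figures copied from the recap in this column.
# Each entry: (THT projection, CHONE projection, actual).
RECAP = {
    ("hitters", "better"): (0.779, 0.796, 0.819),
    ("hitters", "worse"): (0.934, 0.898, 0.843),
    ("pitchers", "better"): (4.33, 3.97, 3.63),
    ("pitchers", "worse"): (3.69, 3.75, 4.58),
}


def pick_was_right(kind, call, projection, actual):
    """Return True if the group moved in the direction of the call.

    Hitters are measured by OPS (higher is better),
    pitchers by ERA (lower is better).
    """
    if kind == "hitters":
        outperformed = actual > projection
    else:
        outperformed = actual < projection
    return outperformed if call == "better" else not outperformed


def summarize(recap):
    """Build one summary row per group: misses, direction, and which system was closer."""
    rows = []
    for (kind, call), (tht, chone, actual) in recap.items():
        digits = 3 if kind == "hitters" else 2
        rows.append({
            "group": f"{kind}/{call}",
            "tht_miss": round(actual - tht, digits),
            "chone_miss": round(actual - chone, digits),
            "right_vs_tht": pick_was_right(kind, call, tht, actual),
            "right_vs_chone": pick_was_right(kind, call, chone, actual),
            "chone_closer": abs(actual - chone) < abs(actual - tht),
        })
    return rows


for row in summarize(RECAP):
    print(row)
```

Each row shows how far each system missed, whether my call pointed the right way against that system, and whether CHONE landed closer to the actual figure than THT did.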
Well, not really. For one, I have no idea why I was able to beat two very good projection systems at their own game. My expectation was that a computer would be much better at assimilating a lot of statistical information into one final prediction than the human brain, and while I still do believe that to be the case, it does appear that we humans can see something computers do not.
Looking at the hitters I thought would beat their projections, I saw a lot of hitters with special skills, most of them young, but all very talented. Not all have capitalized on their abilities (*cough* Delmon Young), but overall, I think this is the kind of situation in which a scouting eye can tell you something that cold, hard numbers cannot (not that I have a scouting eye, but even I can see insane talent like the Upton brothers).
The hitters I thought our projections overrated were mostly some combination of old, fat and strikeout-prone. That’s never been a good combination, and though the statistics should bear that out, perhaps the computers aren’t quite as quick to catch on to when a player is going to falter due to those factors as a human can be.
The pitchers are a little more difficult to classify. The only thing that really jumps out at me is that I liked a lot of high-strikeout guys, while a lot of the pitchers I didn’t like are below-average at whiffing hitters. I think it’s very possible that pitchers with big arms often can break out in a way their past statistics would not predict, while those with low strikeout rates walk a very fine line between successful major leaguer and batting practice tosser. Perhaps we humans are a little better at seeing that line than computers.
But maybe not. Honestly, though I am fairly convinced that computer projections are not perfect, and that a baseball-crazed human being can pick out some numbers that just aren’t right, regardless of their statistical validity, I can’t say at this point that I know why. The human mind is a complex machine, more complex than any supercomputer yet built, and so it is not simple to decipher what processes exactly allow us to better a computer projection with our gut.
The important lesson here, however, is that human analysis does indeed have something to add in understanding a player’s abilities and talents beyond what a computer projection will tell you. Computer projections are very good, and 99 percent of the time, they’re as good as or better than what we can do, but that other 1 percent? Well, that’s where we analysts come in handy.
David Gassko is a former consultant to a major league team. He welcomes comments via e-mail.







 
What happens if instead of downweighting low PA players like Alex Gordon, we add in a league average or replacement level player for the rest of the PAs?
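One way to try this is sketched below in Python. It is only an illustration: the function names, the 600 PA target, the .750 filler OPS and the two sample players are all hypothetical, not real 2009 figures. It also treats OPS as something that can be averaged by plate appearances, which is an approximation (OBP and SLG have different denominators), though it matches the PA weighting used in the column.

```python
def pa_weighted_ops(players):
    """PA-weighted group OPS, the weighting the column describes.

    `players` is a list of (ops, plate_appearances) pairs.
    """
    total_pa = sum(pa for _, pa in players)
    return sum(ops * pa for ops, pa in players) / total_pa


def padded_ops(player_ops, player_pa, target_pa, filler_ops):
    """Blend a player's OPS with a filler OPS for the PAs he did not get."""
    if player_pa >= target_pa:
        return player_ops
    missing_pa = target_pa - player_pa
    return (player_ops * player_pa + filler_ops * missing_pa) / target_pa


def padded_group_ops(players, target_pa, filler_ops):
    """Equal-weight group OPS after padding every player up to target_pa."""
    padded = [padded_ops(ops, pa, target_pa, filler_ops) for ops, pa in players]
    return sum(padded) / len(padded)


# Hypothetical inputs, for illustration only (not actual player stats).
sample = [(0.700, 200), (0.850, 600)]
TARGET_PA = 600      # assumed full-season workload
FILLER_OPS = 0.750   # assumed league-average (or replacement-level) OPS

print(round(pa_weighted_ops(sample), 4))
print(round(padded_group_ops(sample, TARGET_PA, FILLER_OPS), 4))
```

With padding, a low-PA player counts as much as a full-time one, but his line is pulled toward the filler value. For a fair comparison, the same padding would have to be applied to the actual results being compared against each projection.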