There was an article yesterday by Chris Mulligan of the Fantasy Baseball Generals that dealt with an interesting topic. I started to type up a comment to it, but I realized that there is a lot to say on the matter and that I’d be better able to organize my thoughts as a full post.
In the article, Chris contemplates the concept of “pitcher tiering.” He notes that guys like Johan Santana, Tim Lincecum, and Jake Peavy (among a few others) are head-and-shoulders above the rest of the pitching crowd and deserve early round treatment.
He then wonders, though, why “second-tier” starters like Roy Oswalt, James Shields, and Jon Lester are taken in the early-middle rounds (rounds seven through ten) when they often post stats similar to guys like Zack Greinke, Derek Lowe, and John Danks — guys who will be taken after round 12. I’m not big on the cherry-picking of stats he uses (I could just as easily pick out a “second-tier” and “third-tier” pitcher whose stats are significantly different), but this leads into an interesting line of thought that I haven’t discussed here very much.
My answer to the question
The simplest answer I can give to Chris’s question is “opinion.” Opinions about pitchers vary wildly (yes, the same is true of batters, but this is true to a greater extent for pitchers). Last year, Lincecum would have been considered one of those “second-tier” pitchers that Chris says he will avoid in 2009.
Whether that is the right decision is a discussion for another time. What I’m saying is that a lot of the people who took Lincecum in, say, round eight, took him there because they thought he was a good value. They thought he was closer in value to a round two Jake Peavy than to a round 12 or 13 John Maine. They thought he could outperform the average round eight pitcher.
That those owners ended up being correct (or were incorrect in the case of a “second-tier” guy like Aaron Harang or Francisco Liriano) doesn’t matter. People will always have opinions on players and will act based on those opinions. A player’s market value (read: where he ultimately gets taken) is, simply put, the most favorable of these opinions.
All it takes is one owner to like Marco Scutaro enough to draft him in round five. As ridiculous as it sounds, Scutaro’s market value would be round five in that scenario.
This is a bit simplified (that owner would consider everyone else’s opinion of Scutaro when deciding where to take him), but the basic point remains: a player’s market value is determined by whoever is highest on him. And because opinions of pitchers vary more than for hitters, the situation Chris discusses exists.
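To make the "highest opinion sets the market" idea concrete, here is a minimal Python sketch. The owner names and the rounds they would be willing to spend are invented purely for illustration, and it ignores the wrinkle mentioned above (owners adjusting to what they think everyone else will do).

```python
# Hypothetical data: the earliest round in which each owner would be
# willing to draft a given player (lower number = higher opinion).
# Owner names and rounds are made up for illustration only.
willing_round = {
    "owner_a": 18,
    "owner_b": 22,
    "owner_c": 5,   # the one owner who loves the player
    "owner_d": 20,
}


def market_round(opinions):
    """Return the earliest round any single owner would take the player."""
    return min(opinions.values())


def highest_owner(opinions):
    """Return the owner whose opinion sets the market."""
    return min(opinions, key=opinions.get)


print(market_round(willing_round), highest_owner(willing_round))
```

Three of the four hypothetical owners see a late-round flier, yet the player's market value is round five because one owner is that high on him. The average opinion never enters into it.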
The thing with pitchers is that people reach their conclusions about them using very different criteria. It’s hard for someone to credibly say that Ryan Theriot has a lot of ‘raw power’ and will hit 40 HRs in 2009; the guy is 5’11”, 175 pounds and has seven career home runs. I’d bet, though, that we could find someone who thinks Andrew Miller’s ‘stuff’ could allow him to post a 3.50 ERA in 2009. Just as easily, we could find someone to point to Miller’s 5.87 ERA or 4.94 LIPS ERA and say “no thank you.”
In addition to stuff, which we actually can quantify (in a way) with PITCHf/x data now, there are a number of other factors people use that are more abstract and subjective. Some people might say that Zach Duke, a soft-tosser who does not possess the traditional definition of ‘stuff,’ has great ‘poise’ or ‘command’ and is a breakout candidate in his age-26 season. Someone else will argue the opposite, pointing to his three-year streak of 5.00+ LIPS ERAs.
And oftentimes — this is where things start to get heavy, but very interesting — two owners can come to two very different decisions on the same player and both be considered correct. To illustrate this point, take a look at this line:
+---------+-------+------+------+---------+------+------+------+----------+------+
| LAST    | FIRST | ERA  | FIP  | DIPS v2 | xERA | tRA* | xFIP | LIPS ERA | QERA |
+---------+-------+------+------+---------+------+------+------+----------+------+
| Pelfrey | Mike  | 3.72 | 4.02 | 4.09    | 4.29 | 4.58 | 4.70 | 4.74     | 4.91 |
+---------+-------+------+------+---------+------+------+------+----------+------+
This is Mike Pelfrey’s 2008 line with a slew of different ERA estimators. Let’s simplify things and pretend we are looking at a league of complete statheads, and the numbers are the only things they will look at. The question is, though, which numbers? FIP thinks Pelfrey’s ERA should have been 4.02 last year. Baseball Prospectus’s QERA thinks it should have been 4.91. That’s a 0.89-point difference, which is the difference between a twelfth-round pick and not getting drafted.
Yet, if the person making either decision gives me the basis for their decision, I wouldn’t tell either one they were wrong. I would go as far as to say they were right.
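For anyone who wants to check the spread in Pelfrey’s line, here is a quick Python sketch. The numbers are taken straight from the table above; the variable names are just for illustration, and this is simple arithmetic on one season, not a projection method.

```python
# Mike Pelfrey's 2008 ERA and the ERA estimators from the table above.
pelfrey_2008 = {
    "ERA": 3.72,
    "FIP": 4.02,
    "DIPS v2": 4.09,
    "xERA": 4.29,
    "tRA*": 4.58,
    "xFIP": 4.70,
    "LIPS ERA": 4.74,
    "QERA": 4.91,
}

# Actual ERA is a result, not an estimator, so leave it out of the comparison.
estimators = {name: val for name, val in pelfrey_2008.items() if name != "ERA"}

most_optimistic = min(estimators, key=estimators.get)
most_pessimistic = max(estimators, key=estimators.get)
spread = round(estimators[most_pessimistic] - estimators[most_optimistic], 2)

print(f"Most optimistic:  {most_optimistic} ({estimators[most_optimistic]:.2f})")
print(f"Most pessimistic: {most_pessimistic} ({estimators[most_pessimistic]:.2f})")
print(f"Spread: {spread:.2f}")
```

Two owners looking only at the numbers could land anywhere inside that range depending on which estimator they trust, and each could defend the choice.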
On this point of either decision being correct, the fact is that there isn’t much difference between the top ERA estimators. I use LIPS ERA for two reasons: 1) because my tests have shown it is the best of the predictors, or at least tied for the best, and 2) because I like its methodology better than any of the others.
While an individual call, like this one on Pelfrey, might be wildly different, the results on the whole are similar. In the long-term, someone using LIPS ERA (a complex ERA estimator) will do better than someone using FIP (a very simple ERA estimator), but the difference may not be as large as you’d think.
I believe using LIPS ERA would be slightly beneficial in the long-term, but using FIP (or any of the other estimators for that matter) wouldn’t necessarily be incorrect. One might be preferable to another, but they are similar enough that I won’t begrudge anyone who uses something different, and I’m sure you could poll some of the top analysts and get a couple of different answers on which they prefer.
(As a side-note, I don’t recommend using just one year of any of these stats to project future performance. It’s far better to look at multiple years as well as other factors in a complete projection system.)
So, back to the matter of that second-tier/third-tier distinction: hopefully we now understand why we see these distinctions and why, inevitably, people will disagree. Even using something quantifiable, like stats, we can come to drastically different (but not incorrect) conclusions. Add in things that aren’t quantifiable (some of which are complete nonsense), and the gap in those conclusions gets even larger.
An owner in one of my leagues hates Carlos Marmol, and I have no idea why. His numbers are great, and his stuff is great. Yet this owner doesn’t like him. Perplexing, but a perfect illustration of my point.
I know that some of the concepts I’ve discussed are a little abstract, so if you have any questions, please feel free to e-mail me or comment. This is a topic that will be debated for a long time to come, but hopefully this has gotten you thinking and provided a little insight.