Matt Cain sacrifices goats

Earlier this winter, The Hardball Times offered prospective fantasy baseball writers the opportunity to compete in a Hardball Times fantasy league. Entrants wrote fantasy baseball articles, the best of which would be chosen as our winner. While we could only choose one winner to play in the league (congratulations, Dave Chenok), we had so many great articles that we have decided to publish some of the best. This is one of those submissions.

What? You got a better idea? Just how would you describe a pitcher for whom “lucky” doesn’t even begin to capture his defiance of all laws of probability? Think I’m being hyperbolic* here? Well, consider that of the 85 starting pitchers with at least 1,000 career innings who have pitched since line drive and home run/fly ball percentages were first recorded, Matt Cain’s career ranking for liners is eighth lowest, while he is lower than everyone in homers per fly ball.

That’s just the half of it, though: He is also the holder of the very lowest figure when it comes to batting average on balls in play and has the fifth highest left on base percentage of all of those same pitchers. In other words, the combined paucity of line drives allowed, balls in play that turned into hits, fly balls that became home runs, and baserunners who made it all the way home stands in open mockery of what we believed to be the unassailable truth that extremes will, sooner or later, regress to the mean. Hence the goat thing….

* (When I taught “hyperbole” in English class, I’d slam my fist and scream “I’ve told you a hundred thousand times what hyperbole means.” Sadly, they didn’t share my estimation of my cleverness.)

But I’m not saying that there is no such thing as unsustainable pitching luck … far from it. I’d be willing to bet that, since you’re spending your time reading saberist sites such as this one, we could look at the draft summaries from the leagues you just finished playing in and see strong pitchers you picked up for chump change because they just couldn’t catch a break on balls in play the year before; or, conversely, that you didn’t bite on the guy that everyone else thought had a real breakthrough last year because you knew he couldn’t possibly keep so many fly balls in the park again.

I, for one, must admit to the occasional look back at my auctioneering brilliance, though I should probably share with you three words that kind of put a damper on my celebrations of self: 1) ESPN. 2) 10-team. 3) Mixed. That is, if I sat down to draft against you all, instead, do you really think I alone would adjust my rankings to account for last year’s luck? Of course not … veering from the crowd’s ADP isn’t just something that the LABR guys do.

So what do you do when the low hanging fruit is no longer there? I suggest, in the great tradition of sabermetric thought, that you go against the crowd; that is, now that everyone knows to adjust for luck, don’t do it! Yep, my solution to everyone else getting smart is, er, to engage your inner Tim McCarver, say “xFIP who?” and look forward with anticipation to Felix Hernandez leading you to ERA nirvana.

That’s right, I’m betting Felix’s ’10 suggests a similar ’11 despite the fact that he wouldn’t have bagged a 2.27 ERA without such feats as limiting hitters to an absurdly low 16.3 line drive percentage, .273 BABIP and 8.5 HR/FB, as well as stranding an astounding 77.4 percent of his baserunners. And on what grounds? Nothing short of having good ol’ great “stuff.”

I’m serious, actually, but it’s a little more complex than I just made it sound. I believe I have found a small, but very real, instance in which we can explain luck, paradoxical as that may sound. Specifically, we have in Felix’s statistical record the sabermetric evidence that Hernandez wields what a scout would call “plus” pitches, by which I mean that in the Pitch Value section of his Fangraphs player page he has thrown, in each of the last two years, both a fastball and a change-up that saved at least a full win of runs more than average (e.g. in ’10 his wFB was 25.5, or worth 2.5 wins above average, while his wCH was 18.7). And it is with that specific arsenal that I believe (and have the numbers to back it up) a pitcher can control all those things we have come to believe he can’t.
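For anyone who wants to run the same filter on their own player pool, here is a minimal sketch of the "plus" fastball/change-up test described above. The 10-run cutoff and the 10-runs-per-win conversion are my assumptions (a rule-of-thumb reading of "a full win of runs"), and the function names are made up for illustration; the pitch values are the 2010 figures quoted in this article.

```python
# Sketch of the "plus FB/CH" filter. RUNS_PER_WIN and PLUS_THRESHOLD are
# assumptions (rule-of-thumb values), not official FanGraphs definitions.
RUNS_PER_WIN = 10.0
PLUS_THRESHOLD = 10.0  # runs above average needed to call a pitch "plus"


def runs_to_wins(runs_above_average):
    """Convert a pitch value in runs to approximate wins."""
    return runs_above_average / RUNS_PER_WIN


def has_plus_fb_ch(wfb, wch, threshold=PLUS_THRESHOLD):
    """True when both the fastball and the change-up clear the threshold."""
    return wfb >= threshold and wch >= threshold


# 2010 pitch values as quoted in the article.
pitchers = {
    "Felix Hernandez": {"wFB": 25.5, "wCH": 18.7},
    "Clay Buchholz": {"wFB": 20.8, "wCH": 7.7},
}

for name, pv in pitchers.items():
    flag = has_plus_fb_ch(pv["wFB"], pv["wCH"])
    print(name, "fastball wins:", runs_to_wins(pv["wFB"]), "plus FB/CH:", flag)
```

Under these assumptions Felix qualifies on both pitches, while Buchholz's change-up falls short of the cutoff.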

His xFIP (which I’m using both because players are starting to rely on it when projecting next year’s performance, and because it is the one ERA variant that counts all four metrics discussed here as measures of luck) is, in its full point difference, very clear that we should expect a significant regression. Now, regressing from 2.27 can only be so bad, so I’m not saying that my hypothetical league rivals are going to rate him below a slew of others, but rather that you may be able to take him a good five or 10 bucks lower than those who downgrade his projections on the basis of his apparent luck, in turn making him the rare ace to justify his cost and maybe even then some.
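If you want to see why xFIP treats all four metrics as luck, it helps to look at what goes into it. Below is a rough sketch of the commonly published xFIP formula; note that BABIP, line drive rate, strand rate and the pitcher's actual home runs never appear. The league HR/FB rate and the league constant change every season, so they are parameters here, and the inputs in the example are invented round numbers for illustration only, not any real pitcher's line.

```python
def xfip(fly_balls, bb, hbp, k, ip, lg_hr_per_fb, constant):
    """Commonly published xFIP formula.

    Home runs are replaced by fly balls times the league HR/FB rate,
    so a pitcher's own HR/FB "luck" is removed by construction.
    ip must be true innings (e.g. 200.333 for 200 1/3), not 200.1 notation.
    """
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = xfip(
    fly_balls=200,
    bb=50,
    hbp=5,
    k=200,
    ip=200.0,
    lg_hr_per_fb=0.10,  # assumed league rate
    constant=3.10,      # assumed league constant
)
print(round(example, 3))
```

The gap between a pitcher's actual ERA and this number is the "luck" the rest of the league is pricing in.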

(Just a quick look at the last couple of NL LABR drafts suggests that more “expert” players do indeed diverge from the crowd when there is strong evidence of luck at play. Those starting pitchers whom xFIP predicted would regress the most almost universally went cheaper in the following year’s LABR draft than would have been the equivalent cost of where they ended up on the general public’s ADP).

Okay, I’ll finally get to the goods:

The odds that any given pitcher will end up among the top 10 “luckiest” according to BABIP (or one of the other three metrics) in any one year are about 12 percent. Of the nearly 700 qualifying starting pitcher seasons since the onset of batted ball data in 2002, there were 14 pitchers (before Felix in ’10) who had a dominant fastball/change-up combo. Small sample size, I know, but just look at how extreme their numbers are by comparison: 29 percent of the time such a pitcher is on the BABIP list, more than double the average, and the same 29 percent goes for HR/FB. Impressive, but then there’s LD percentage, where they have a 43 percent chance. In left on base rate, the odds rise all the way to 50 percent.
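Since sample size is the obvious worry, here is one quick way to gauge how surprising those rates are: a binomial tail probability, sketched under the assumption that each of the 14 seasons is an independent draw with the 12 percent baseline. The counts are my inference from the percentages above (4 of 14 is about 29 percent, 6 of 14 about 43 percent, 7 of 14 is 50 percent), and the function name is just illustrative.

```python
from math import comb


def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_SEASONS = 14   # FB/CH pitcher seasons cited in the article
BASELINE = 0.12  # chance of making a top-10 "lucky" list in a given year

# Counts inferred from the article's percentages (an assumption).
observed = {"BABIP": 4, "HR/FB": 4, "LD%": 6, "LOB%": 7}

for metric, hits in observed.items():
    prob = binom_tail(N_SEASONS, BASELINE, hits)
    print(f"{metric}: P(at least {hits} of {N_SEASONS}) = {prob:.4f}")
```

The smaller the tail probability, the harder it is to write off the result as noise, though this says nothing about whether the pitch values themselves were inflated by the same luck.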

Even when you compare those odds to pitcher seasons with any other combination of two “plus” pitches (of which there were 33), it’s not even close: Their odds, like the overall average, remain in the teens for all but LOB percentage, which occurred an impressive one-third of the time. (That in turn suggests that we should really be wary when using FIP and the like to project pitching studs, insofar as it appears elite pitchers do have an actual ability to strand more runners than others).

But that still means fastball/change-up pitchers are half-again as likely to rank at the top of the LOB list. As for commanding a single dominant pitch of any type, there is zero evidence of any significant impact on these metrics (too bad for my hopes to eventually explain Cain via his sick fastball).

Probably the starkest difference was the huge influence the FB/CH group had on line drive percentage, while even those with other plus-pitch combos didn’t shift the odds an inch. (You might think, then, that tERA would be the way to go, but remember, we’re talking about just a tiny fraction of all starting pitchers, so in most cases you actually wouldn’t want to consider batted balls a skill as that metric does.)

Why should this matter to you? To put it in the clearest fantasy baseball terms, LD percentage influences your ratio categories in a significant way. Felix, for example, with a LD percentage three full points below average, lowered his ERA by .2 through that form of control alone. (I’ve gone on way too long to add in all the calculations here, but you can replicate them easily using standard linear weights.)

Of course, none of this would matter were it not for the fact that these players’ performances the next year continued to defy advanced RA metrics’ predictions: A majority of the “lucky” FB/CHers posted a following-year ERA lower than xFIP predicted, by an average of a half-point.

Ultimately, this is just one small way to get an edge on your opposition, but in an age when so much advanced data is freely available to all, knowing any way to play this knowledge against your league mates can only help. And there are surely other occasions in which to employ this angle; Clay Buchholz, for example, is the 32nd starter drafted at Mock Draft Central— going that late largely because of the appearance of massive luck last year—but if this 26-year-old can improve his change-up only slightly (2010 wCH: 7.7) and maintain anything near the quality of last year’s fastball (20.8 wFB), he just may reward his owners by delivering crazy low ratios for a second year in a row at a very, very nice price.

In the end, it’s just classic saberist thinking, no? When others zig, you zag, and then when they learn that they should have been zagging, you gotta find a way to zig to your advantage.


11 Comments
DrBGiantsfan
13 years ago

Wow!  You taught English?  Man, that was all over the place.  I’m not sure there was a single completed line of thinking in the entire article!

You do start to get at something I think is the tip of the iceberg in statistical analysis, evaluation of pitch type and the relationship to statistics.  I think once we get through sifting through the mountain of PitchFx data, a lot of things we currently attribute to luck will be understood as not due to chance at all.

So, if the chance of a pitcher being “lucky” in any single year is 12%, what are the chances that Matt Cain has been lucky for all 5 years of his career so far?

Mark
13 years ago

Wow!  Excellent article!  I agree with DrB that Pitch F/X data can certainly help demystify some of this information, but you seem to have found a fairly striking inefficiency in evaluation here.  Also to add to DrB’s point, what we tend to call “luck” is often just shorthand for “results with unclear associated patterns and/or causes.”  With causes being uncertain and patterns unidentified, it’s impossible to interpret the data scientifically.

A few questions I think are raised here: 

-Did you include pitchers with dominant splitters in your data?  Based on your wording I’m assuming not, but the split would seem to be the most similar secondary pitch to the changeup, so it would be interesting to see if the results are more similar with dominant FB/SF guys than FB/CB or FB/SL.

-Does this suggest that change in velocity has more impact on inducing weak contact than other elements of pitching such as location and movement?  Just because of the nature of the changeup this seems to be the suggestion, but the example of Felix Hernandez immediately seems to contradict this intuition, as he throws a very “hard” changeup (avg CH was only 4.3 mph slower than average FB), and the velocity differential between his FB and CH has consistently decreased during his career while his results have improved.  Specifically his BABIP and HR/FB have appeared more favorable as his velocity differential has decreased.  If it’s not velocity differential, what is it about the changeup that causes this effect?

Will Hatheway
13 years ago

@DBG – I fear I wouldn’t give myself a very good grade for “getting to the point” (I’m sure one of many reasons this effort didn’t take the cake in the contest)! If it helps, my own scholarship was in critical theory, where loopy-ass thinking (that’s a technical term) is par for the course, so I kinda do one thing and teach another…

Thanks, though, for finding some relevance in all that verbiage: a lot of the greatest sabermetric minds discount things like pitch value, and I see their point, but I do think it (along with PitchFx data) offers an inlet into complicating some assumptions we’ve become too comfortable with regarding luck versus yet-to-be-explained.

Regarding Cain’s odds of being so “lucky” (damn, I totally passed up a play on his last name and my sacrifice joke!), it is ridiculous relative to his peers: along with Felix, Marquis, Johan, Zito, Washburn, and Zambrano make the top-10 career list in two of four categories. No one has three, and only Cain has four.

Johan’s case is actually striking in terms of my argument here: his two “luckiest” BABIP and LOB% years came in years with ++ FB/CH seasons…

Will Hatheway
13 years ago

@Mark – Thanks much for the kind words … my ego is fragile!

More importantly, you raise a number of really important points that deserve additional thought, but off the cuff I can offer the following:

I wonder if pitch-classification has changed such that splitters are now seen more often to be changeups, or if there has instead been an actual move by pitchers from the former to the latter because, quite frankly, SFs showed up in the initial Pitch Value years (which I did include, FYI) but not later on.

But you really might be on to something when they were identified, for it was only when they were paired with a plus-fastball that they made regular appearances on the “lucky” lists: 33% BABIP and LD, 17% LOB and HR/FB makes them appreciably significant, for sure. In fact, this combo is the second most predictive in those first two categories, with only the VERY small sample-sized curve/slider effect on LOB% (which seems to make a lot of sense) having more of an impact on any one category. What is more, your specific request about comparing those two (slider and curve) in concert with a plus-fastball also registers a very high connection with LOB% but nothing else.

Count me as seriously impressed that you just intuited all that without my laborious travel through the numbers… and your second point only builds on what I’ve found to suggest a connection, insofar that the major difference—as I understand it—between a split and a change is velocity, meaning that their shared downward movement versus a four-seamer is what really distinguishes them from the latter regardless of relative speed.

Will Hatheway
13 years ago

P.S. “Plus” splitter pitchers who also had another plus-pitch (universally a fastball) came once in 2002 – the first Pitch Value year -(Schilling), twice in ‘03 (Hudson and Mulder), once in ‘04 (Clemens), Clemens again in ‘05, and then Hudson repeated in ‘07. Since then, ZIP.

garik16
13 years ago

Sorry, not buying it.  The problem with your analysis is that pitch type run values are not expected run values…..they’re heavily IMPACTED BY BABIP to begin with. 

Meaning a pitcher with a lucky season, who happens to throw two pitches frequently, will fit your requirement of having -10 run values (+10 on fangraphs) for two pitches pretty often. 

So luck creates your criteria, which you claim allows players to defy luck.  Ummm not so much.

Your findings are entirely sample size.

Derek Carty
13 years ago

Will,
Josh (garik16) beat me to the punch.

“Johan’s case is actually striking in terms of my argument here: his two “luckiest” BABIP and LOB% years came in years with ++ FB/CH seasons… “

The reason Johan had those ++ FB/CH seasons is that the Pitch values given at FanGraphs are based entirely on results.  All of the actual results are factored in, and this includes things like BABIP and HR/FB in addition to stabler factors like Ks and BBs.

Those ++ FB/CH seasons are going to be driven by a pitcher’s good luck, not vice versa (at least in large part).

That said, some pitchers can control their BABIPs and HR/FBs to some extent.  Johan, for instance, has a .275 career BABIP.  Having faced nearly 8,000 batters, it’s highly likely that something about how he pitches allows him to post a better-than-average BABIP.

Will Hatheway
13 years ago

@Garik and Derek –

I wrote this with two things raising hairs on the back of my neck, signaling “Warning! Warning!”: sample size, and what I call the chicken-and-the-egg nature of how Pitch Values are calculated. Really all I’m trying to say in pushing on regardless of those two clearly valid points you guys raise is that EVERY pitch value is the product of the same formulae, and so why is it, I wondered, that one exhibits way more luck than any other pitch type, or combination of pitch types? I still hold that this, however flawed, invites some sort of exploration and just wanted to raise one way of doing so. I will be quite content to see someone do something much more rigorous with it, as I’m very much an amateur, but I do think some sort of consideration, despite sample sizes and how p. vals are calculated, is warranted considering that all pitch types are devised by the same formula and yet this particular combination leads to far more extreme “luck-metric” results year in, year out.

Will Hatheway
13 years ago

Derek –

p.s. are you not with THT anymore? Was looking forward to checking out your draft this year (or are you “just” doing LABR/Tout-type stuff?)

Jeffrey Gross
13 years ago

Derek, is there research as to what extent a pitcher can control their HR/OFFB? I know that HR/FB rates tend to vary because some pitchers have that IFFB skill, but I thought that HR/OFFB was substantially more stabilized for pitchers (low variance amongst starters).

Also, re: BABIP, BABIP and WHIP for pitchers tend to correlate to FB% and GB%. Higher GB% tends to lead to a lower xERA but higher xBABIP/WHIP, while higher FB% calls for higher xERA, but lower xWHIP.

Derek Carty
13 years ago

No problem, Will.

I am still with THT and I should be resuming writing shortly with some (hopefully) very interesting topics.  I also will still be doing LABR, Tout, and CardRunners, in addition to probably two more expert leagues, so there should be plenty of my drafts to scrutinize.  CardRunners has actually started back up for the year over at the league’s site (http://www.crfantasybaseball.com/) with a post by Eric Kesselman and one from me.

Jeff,
There is very little difference between HR/FB and HR/OF in terms of stability.  As a very crude, quick measure to show you, using all pitchers since 2002 with 100 IP in adjacent years, HR/OF has a 0.1873 y-t-y correlation and HR/FB has a 0.1906 y-t-y correlation.  Because IF FBs only make up about 10% of all flies, it really doesn’t make much difference excluding them in terms of stability.