Luck, Leaps and Lapses
by Matthew Carruth
March 01, 2007
John Beamer's recent writeup on projections for some pitchers in 2007 led to an informal exchange between him and me over J.J. Putz and the merits of his improvement in 2006, and whether it was repeatable. Unsurprisingly, most projection systems do not think so, but that's because they fail to grasp the driving forces of J.J.'s transformation. It's not just the projection systems, though; most people, even baseball analysts, can have a hard time understanding what happened unless they actually saw the transition themselves. Lucky for me, I am an MLB.tv junkie and also had access to the in-house ESPN network, and from that I learned why J.J. in 2006 was nothing like J.J. circa 2004 or 2005.
The emergence of a new pitch for Putz, a splitter he picked up from Eddie Guardado, completely changed his profile as a pitcher. In literally the blink of an eye he went from just another hard-throwing fastball guy with average control and no secondary out pitch, to a bona fide two plus-pitch pitcher.
How often does this happen? Quite frequently, in fact. While statistical research has done wonders in breaking down the hitting side of the game, including the aging and development curves that apply very well to hitters, the pitching (and defense) side is still lacking. Most notably, the aging and development curves for pitchers (and defenders) are all over the map. The reason is that pitchers do not really follow a curve of development like hitters do. Instead, it's more of a series of plateaus, a step-wise function for those of you out there keen on math.
Five or 10 years ago (and still today among the people mostly not reading sites like this one) ERA was the measurement used to evaluate pitchers. Now, most of us have come to realize that ERA is not a very good way to quantify a pitcher's contributions. For one, it is incredibly team-dependent, in that a pitcher's ERA is largely influenced by the defense behind the pitcher, and we all now generally agree that the pitcher has little control over a batted ball once it is put into play (the DIPS theory).
For that reason, and several others, we moved on to other measurements, eventually settling on the three true outcomes: strikeouts, walks, and home runs. Those three categories served us pretty well in evaluating pitchers.
Why? Well, they correlated much, much better on a year-to-year basis than ERA, suggesting that they are less influenced by random noise and better reflect a pitcher's innate ability. They also make sense if you buy the DIPS theory. To put it in biochemical terms, if ERA was the discovery of the cell, then the DIPS ratios (as strikeout, walk, and home run ratios are sometimes referred to collectively) were the discovery of the atom.
And just like how the atom was eventually displaced by the discovery of quarks, I hope that we can find something better, or at least more complete, than the DIPS ratios. That is what I hope to accomplish with the IPORTs. One of the things that we can do right away with the IPORTs is to analyze how the outcomes are reached. For example, a pitcher's strikeout ratio is highly influenced by his percentage of swinging strikes generated (R^2 around 0.62) while a pitcher's percentage of called strikes alone bears almost no statistical influence on the pitcher's strikeout ratio (R^2 around 0.03).
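For readers who want to run this kind of check on their own numbers, here is a minimal sketch of how an R^2 between two per-pitcher rates can be computed. The function name `r_squared` and the five sample rows are invented for illustration only; they are not IPORT data and are not meant to reproduce the 0.62 or 0.03 figures above.

```python
from math import fsum


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists.

    For a simple one-variable linear fit this equals the R^2 of the fit.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Invented sample: Swing% and strikeout rate for five imaginary pitchers.
swing_pct = [6.5, 8.0, 9.1, 10.4, 12.0]
k_rate = [13.0, 17.5, 16.0, 21.0, 24.5]
print(round(r_squared(swing_pct, k_rate), 2))
```

Swapping in a Called% column for `swing_pct` would be the way to test the second claim against real data.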
Another way of thinking about this hierarchically is that we are moving in even closer. If ERA is the wide-angle, looking just at the end result in terms of runs allowed, and DIPS is the focused view on what the pitcher actually controlled, then IPORT is the zoomed in macro view of how the pitcher achieved those events.
As mentioned above, not all strikes are equal. Thus the distinction between how strikes are generated (called, swinging or foul) is a very important one to make. Esteban Loaiza provides a good example of this. Loaiza from 1995 through 2002 hung around a below average strikeout rate tied to a below average percentage of swinging strikes generated. Suddenly, in 2003, he introduced or got a handle on his cut fastball. From 2002 to 2003 his percentage of swinging strikes went from 6.52% to 9.08%, a sizable jump for a starting pitcher. Along with that, Esteban's strikeout rate climbed from 12.99% to 21.73%, a huge increase.
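To put the size of those jumps in perspective, the short sketch below works out both the percentage-point change and the relative change from the four Loaiza figures quoted above. The helper names are my own and nothing here comes from the IPORT data itself.

```python
def point_change(before, after):
    """Change in percentage points between two rates given in percent."""
    return after - before


def relative_change(before, after):
    """Change relative to the starting rate, in percent."""
    return 100.0 * (after - before) / before


# Figures quoted in the text for Loaiza, 2002 and 2003.
rates = [
    ("Swing%", 6.52, 9.08),
    ("K rate", 12.99, 21.73),
]

for label, before, after in rates:
    print(
        f"{label}: {point_change(before, after):+.2f} points, "
        f"{relative_change(before, after):+.1f}% relative"
    )
```

That works out to about 2.56 points (roughly 39% relative) for Swing% and 8.74 points (roughly 67% relative) for strikeout rate.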
The introduction or perfection of a new pitch like Loaiza's cut fastball or Putz's splitter can routinely cause jumps in performance level like this, and such jumps are not always subject to regression because they are skill-driven rather than randomly driven spikes in performance. However, Loaiza also provides us with an example of when regression can occur, namely when the motivator behind the performance spike (his cut fastball) goes away. Whatever he did in 2003 to harness the cut fastball, Esteban lost it in 2004 and his peripherals immediately returned to his pre-2003 established levels, including his strikeout rate and swinging strike percentage.
2005 saw a mini-comeback for Loaiza as his strikeout rate climbed back above average (though still below 2003 levels), and he cut his walks down. The problem is that the climb in strikeouts in 2005 was not supported by a rise in his percentage of swinging strikes. If I were looking at this data back in the winter of 2005, I would be predicting a regression in strikeout rate for 2006 back to his career norms, which is exactly what occurred.
J.J. Putz's 2006 is a lot like Loaiza's 2003. The introduction of a new pitch sent his percentage of swinging strikes through the roof and as a result his strikeout rate followed. When we discuss minor league pitchers and long-term projections, scouts almost always refer to a pitcher's ability to miss bats (or not). This is what swinging strike percentage is measuring, and I do believe that it is just as important a number to track in a 10-year MLB veteran as it is for a kid fresh in rookie ball.
We'll get to more nitty gritty aspects of IPORTs in future articles, but for now, what I wanted to do was to find other pitchers like Putz and Loaiza, other pitchers who broke out, and see which ones are likely the result of a true jump in performance and which ones are more likely flukes. The methodology for identifying the pitchers to investigate was as follows: Look at all pitchers who faced at least 200 batters in 2006 and at least 200 batters prior to 2006. Compare their 2006 ERA to their career ERA through the end of 2006.
Why include 2006 in the career data instead of excluding it? Because I wanted my ranking to be weighted by how much of a jump the pitcher actually made. A career starter with 5,000 batters faced before 2006 has a much harder time beating his career ERA than, say, a relief pitcher turned starter who had faced 400 batters prior to 2006. They might face the same number of batters in 2006, but by including 2006 in the career data, the relief pitcher turned starter does not have as much of an advantage over the longtime starter when it comes to besting his career ERA.
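A rough sketch of that comparison follows. The exact formula is my assumption (percent by which the 2006 ERA undercuts the career ERA), and it weights by innings rather than batters faced for simplicity. The two pitching lines are invented to show the dampening effect and are not real players. Innings are plain decimals here, not the .1/.2 box score notation.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs per nine innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * earned_runs / innings


def pct_better_than_career(er_prior, ip_prior, er_2006, ip_2006, include_2006=True):
    """Percent by which the 2006 ERA beats the career ERA (negative means worse).

    With include_2006=True the career line contains the 2006 season as well,
    which is the choice described in the text.
    """
    season_era = era(er_2006, ip_2006)
    if include_2006:
        career_era = era(er_prior + er_2006, ip_prior + ip_2006)
    else:
        career_era = era(er_prior, ip_prior)
    if career_era == 0:
        raise ValueError("career ERA of zero gives no baseline to compare against")
    return 100.0 * (career_era - season_era) / career_era


# Hypothetical pitchers: both had a 4.50 ERA before 2006 and a 3.00 ERA in 2006.
veteran = (600, 1200)   # (earned runs, innings) before 2006
newcomer = (50, 100)
season = (60, 180)      # identical 2006 line for both

for name, prior in [("veteran", veteran), ("newcomer", newcomer)]:
    with_2006 = pct_better_than_career(*prior, *season, include_2006=True)
    without_2006 = pct_better_than_career(*prior, *season, include_2006=False)
    print(f"{name}: {with_2006:.1f}% with 2006 included, {without_2006:.1f}% without")
```

With these made-up lines both pitchers look 33.3% better when 2006 is excluded, but with 2006 included the veteran comes out at 30.3% and the newcomer at 15.2%, so the short-track-record pitcher no longer gets full credit for beating a thin baseline.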
It gets really tiring typing percentage of swinging strikes over and over again (though I could just copy and paste but I like the way the shorthand reads), so in general, whenever I refer to IPORT statistics, I will abide by the following shorthand for categories:
The individual pitch percentages (expressed as percentages of all pitches thrown) will be written as Type%. For example, percentage of swinging strikes will be written Swing%, percentage of ground balls as GB%. Pop ups or infield flies will be written as IF%.
As always, please refer back to the original IPORT article for full explanation on the various categories.
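As a quick illustration of the shorthand, here is a minimal sketch that turns a list of pitch results into Type% values, each expressed as a share of all pitches thrown. The outcome labels and the 20-pitch log are assumptions made for the example; the full set of categories is defined in the original IPORT article, not here.

```python
from collections import Counter


def type_percentages(pitch_outcomes):
    """Map each outcome label to its share of all pitches, keyed as 'Label%'."""
    counts = Counter(pitch_outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pitches supplied")
    return {f"{kind}%": 100.0 * n / total for kind, n in counts.items()}


# Invented 20-pitch log using assumed labels.
log = (
    ["Ball"] * 7
    + ["Called"] * 4
    + ["Swing"] * 2
    + ["Foul"] * 3
    + ["GB"] * 2
    + ["FB"] * 1
    + ["IF"] * 1
)

pcts = type_percentages(log)
for label in sorted(pcts):
    print(f"{label}: {pcts[label]:.1f}")
```

Because every pitch falls into exactly one bucket in this sketch, the percentages sum to 100.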
The following pitchers all bested their career ERA by at least 15% in 2006.
1. B.J. Ryan (57.05% lower)
Lowered his Ball% and walk rate to the lowest of his career, but lost ground on his strikeout rate and his GB%/FB% at the same time. The biggest factor in Ryan's disappearing ERA was his BABIP, which you will see mentioned a lot in this section. It fell from Ryan's established level of around .300 down to .258 in 2006. Ryan is due for some regression but should stay at his 2004-05 level of performance.
2. Joe Nathan (50.16%)
Had the lowest Ball% and highest Called% of his career. Did see a drop in his Swing%, but it was more than made up for by the increase in Called%. Nathan's strikeout rate rose slightly in 2006, and his walk rate plummeted. Unless he gains back what he lost in Swing% I would expect a slight dip in the strikeout rate for next year. He also saw a BABIP drop in 2006 from his established .260 level to .235 so expect a rise there too. Overall though Nathan still looks very strong.
3. Joe Beimel (37.15%)
Beimel's Ball% has been trending down of late, and 2006 was no exception as he cut another point off. Beimel has also seen a steadily rising GB%, which again did not reverse itself in 2006, rising another two points. Beimel is legitimately a better pitcher now than he has been in the past. However, he does rely a lot on his defense and our good friend BABIP was down in 2006 at .268 from Joe's normal level of over .300. Expect a lot of regression in the BABIP, especially if the increasing GB% continues, but Beimel should still be an effective reliever.
4. Pedro Feliciano (35.69%)
Pedro's Called% climbed impressively, as did his GB%, but most everything else, including his BABIP, remained stable. As a result, his five point jump in strikeout rate looks a bit flukish to me. I would expect regression there, but stability elsewhere.
5. Dan Wheeler (34.03%)
Wheeler overall saw much of the same in 2006 as he did in 2005, but those two years have yielded vastly different results from his previous career. Two notable levels jump out: his strikeout rate which moved from about 16% pre-2005 to about 23% in 2005-06, but is not supported by any change in his individual pitches except for a gentle decrease in his Ball%, and his BABIP, which formerly was in the .300-.310 range, but was .237 in 2005 and .266 in 2006. It is not apparent yet whether Wheeler jumped to a level in 2005 and his new peripheral rate is sustainable or whether the low ERAs of the past two years have been all BABIP driven. I would lean more towards the latter, but he is definitely one to watch.
6. J.J. Putz (33.72%)
Putz is the textbook example of a level jump; the only thing left is for him to sustain it into 2007 and beyond. Putz's Ball%, which has been steadily falling, fell another four points in 2006. His Foul% and Called%, both also steadily on the rise, each moved up another point or two, and his Swing% jumped over six points. The end result was a 16 point increase in strikeout rate, a 3.5 point decrease in walk rate and a home run rate that was halved, all while the BABIP remained steady. The home run rate drop was not accompanied by any change in GB% or FB% so I would expect some regression there, but overall, unless he loses command over his splitter, Putz is going to remain one of baseball's best closers.
7. Duaner Sanchez (31.76%)
A lower FB% led to the highest GB%/FB% ratio of his career, which in turn explains the lowest home run rate of his career. Beyond that, and a very small drop in BABIP, Duaner was essentially the same pitcher as ever. The FB% has been trending downward over time so I would be more bullish than bearish on a repeat of 2006 with an ERA near the 3.00 mark.
8. Trever Miller (31.52%)
Quite the opposite from Sanchez above. Miller cut his Ball% by over five points, saw his Foul% rise three points, had his Swing% crack 10% for the first time (a two point rise over 2005) and his Called% broke the 20% barrier for the first time as well, also a two point increase. All this adds up to a legitimate six point rise in strikeout rate and the lowest walk rate of his career. It's not all good news though, as Miller's GB% fell dramatically, pushing his GB%/FB% under 1.00 for the first time and leading to a skyrocketing home run rate. The BABIP was steady though, so his 2006 performance is repeatable based on his percentages.
1. Roger Clemens (25.81%)
Roger lowered his walk rate by a little over a point, but other than that, he was the same pitcher he has always been. What is worth noting is that Roger's BABIP from 1988 through 2003 averaged out to .287. From 2004 onwards, Roger has posted BABIPs of .276, .243 and .266. Age, BABIP regression and a possible switch out of the NL should worsen Roger's performance in 2007, if he has one.
2. Chris Carpenter (24.45%)
Carpenter makes a good case study, as he turned on several dozen light bulbs and has not looked back since, going from a 15%/8%/3% pitcher to a 20%/5%/2.5% one (strikeout, walk and home run rates). What do the percentages say? After 2004, Carpenter's Ball% has dropped three points, and he's seen roughly a two point rise in both Called% and Swing%. Interestingly enough, his LD% has jumped a clear two points, but the big driving force has been a three point rise in GB%. On the BABIP front, it was remarkably consistent pre-2004, hovering around .310. However, since '04: .277, .276, .273. All indications seem to point to 2006 as a typical post-2004 season for Carpenter.
3. Bronson Arroyo (21.85%)
Arroyo's 2006 looks exactly like his 2004 campaign except for one thing: BABIP. It was .288 in '04, but it was .269 in '06, so expect the ERA to rise a bit with his BABIP.
4. Jason Jennings (20.25%)
Jennings saw the standard BABIP drop, but here a case might be made that this reflected Coors Field playing more neutral than in previous years, as it only dropped to .290. Jennings also cut his walk rate by 1.5 points, fueled by a drop in Ball% of three points. Jennings also had the lowest GB%/FB% ratio of his career, making his lower home run rate seem like luck. He's moved out of Coors Field now, so his performance is harder to predict, but I definitely see some fall back to the mean in 2007.
5. C.C. Sabathia (18.48%)
Sabathia had a quiet year: his ERA improved, though not dramatically, but his percentages indicate a strong step forward in 2006. His Ball% was cut by over five points, his Foul% rose three points, his Swing% went up two points and he maintained the improved GB% from 2005 which kept his GB%/FB% ratio near 2.00. His strikeout rate went up and his walk rate sank like a stone to under 5% from his previous baseline of about 8%. If he hadn't had the highest BABIP of his career last year (still a paltry .294 compared to a .282 baseline) his ERA would have been even lower. Expect a strong performance from Sabathia in 2007.
6. Nate Robertson (15.79%)
Robertson was the same pitcher he always had been. 2006 just saw a drop in BABIP of 11 points off of 2005 and some 25 points off his pre-2006 career levels.
7. Jeff Francis (15.27%)
All BABIP, all the time. Francis' percentages have remained relatively stable, but his BABIP went from .332 in 2005 down to .264 in 2006. That's a 68 point drop for those counting at home, shaving over 40 hits off Francis' line and likely accounting for the entirety of the improvement.
8. Kelvim Escobar (15.26%)
Escobar's Ball% has dropped from 38% or more pre-2005 to 36%, and while his strikeout rate has fluctuated quite a bit, it has been generally trending downwards, though his declining walk rate is (so far) outpacing it. The real culprit behind Kelvim's ERA drop is the 2006 Angels defense, which gifted him with 17 unearned runs. Kelvim's RA was exactly in line with his career.
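Since BABIP does so much of the work in the list above, here is a small sketch of the arithmetic behind the Francis entry. The `babip` function uses the commonly cited formula, (H - HR) / (AB - K - HR + SF); the helper names are my own, and Francis's actual balls-in-play total is not given in this article, so the code only backs out how many balls in play a 40-hit swing would imply.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def hits_saved(babip_before, babip_after, balls_in_play):
    """Hits removed from a pitcher's line by a BABIP drop over a given number of balls in play."""
    return (babip_before - babip_after) * balls_in_play


# Figures quoted in the text for Francis: .332 in 2005, .264 in 2006.
drop = 0.332 - 0.264
print(f"BABIP drop: {drop * 1000:.0f} points")

# Balls in play needed for that drop to be worth 40 hits.
implied_balls_in_play = 40 / drop
print(f"Balls in play implied by 40 hits: {implied_balls_in_play:.0f}")
```

A 68 point drop is worth 40 hits at about 588 balls in play, so "over 40 hits" implies Francis allowed somewhat more than that in 2006.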
Same as above, but this time in reverse order. The following pitchers all performed at least 40% worse in 2006 than their career ERA.
1. Derrick Turnbow (-81.75%)
A spike in Ball% leading to a doubling in walk rate coupled with a BABIP going from .246 to .333 tells you all you need to know. The good news is that his strike throwing ability remained stable so if Turnbow can limit the balls, he could return to being serviceable.
2. Jose Valverde (-66.86%)
This one was all BABIP as Jose was arguably a better pitcher in 2006. His Ball%, Foul%, Swing%, GB%/FB%, strikeout and walk rates all trended better, while nothing got substantially worse. However his BABIP soared from .272 in 2005 to .355 in 2006. Expect something more in line with his 2005 next year.
3. Brad Lidge (-60.49%)
Lidge's Ball% rose back up to 2002-3 levels while his Swing% has been declining fast since 2004 and this time his Called% didn't compensate, dropping his strikeout rate again, though not as much as 2004 to 2005. This time the walk rate also spiked, over two points and the home run rate nabbed him, jumping two points as well. No change in his overall GB%/FB% tendencies though so expect the home run rate to regress. Everything else is legitimate decline, and Lidge will need to regain those missed bats to be the pitcher he was in 2004.
4. Glendon Rusch (-48.90%)
Rusch's Ball% went up, driving a walk rate that rose three points. His GB% fell which partially explains the ridiculous 6.43% home run rate, but not completely. Something that extreme is almost surely going to regress. However his other bad trends are likely to continue, or at least not suddenly turn around. I would expect something more akin to a 5-5.5 RA in 2007.
5. Andy Sisco (-46.39%)
The namesake of his own award, Andy's Swing% went down a bit while his FB% jumped two points. The strikeouts went down (way down), the walks and home runs went up (a little) and the BABIP went up (way way up to .331). The BABIP should regress and if Andy can get the strikeouts back up a bit without losing any more ground on the walk front, he could begin raising the hopes of his new organization, the White Sox, before dashing them so expertly as he did in Kansas City.
6. J.C. Romero (-45.65%)
Romero has a seriously weird strikeout rate pattern. Three times in his career he has lost four points, as he did in 2006. The previous two times he bounced back, gaining eight and seven points respectively in the following year. The strikeout rate is tracking with his Swing% rate perfectly, so if he starts missing bats again in 2007 then you'll know he is probably fine. His BABIP also took a vacation way up north hitting the .333 mark after being consistently in the just under .300 range previously.
1. Mark Prior (-105.41%)
A complete mess in 2006, Prior saw his Ball% jump eight points and his Swing% get halved, leading to a nine point drop in strikeout rate and a three point spike in walk rate. The home run rate also hit a new high of 4.27%, which should regress a bit given that his flyball tendencies didn't alter much. The BABIP was spot on though at .290, so don't expect any help there. Mark Prior legitimately sucked in 2006, it wasn't because of bad luck.
2. Mark Mulder (-73.72%)
Mulder's Ball%, Foul%, Called% and Swing% all went in the wrong direction, leading to a lowered strikeout rate and a higher walk rate. The big culprit though was the home run rate spiking two points to nearly 4.5% and a BABIP jump from his consistent .285 level to .329. If the injury doesn't kill his stuff, Mulder should be fine going forward.
3. Josh Towers (-72.19%)
Towers had an increased Ball% leading to a 33% increase in walk rate. Other than that, more flyballs coupled with a horrendous home run rate, which isn't out of the ordinary for Josh, and a BABIP jump to .346 are a known recipe for a 9.00 RA. He's due for a lot of regression, though that's just heading back to average unless that walk rate goes back down or he learns how to limit the home runs.
4. Pedro Martinez (-59.43%)
Pedro suffered through the lowest GB%/FB% rate of his career and a two point jump in Ball% leading to a small increase in walk rate. As with the other starters here, the home run rate was the biggest culprit, Pedro's jumping to 3.46%, a good 1.5 point increase. No BABIP issue though, Pedro just needs to get healthy, get his arm slot back, and he should be back to somewhere between 04 and 05 form.
5. Randy Johnson (-55.28%)
Johnson's Ball% is up and his Swing% went under 10 for the first time since 1990, telling us all we need to know about what his strikeout and walk rates did (hint: got worse). His home run rate actually went down from 2005 and his BABIP was steady. He did have an unlikely to reoccur low percentage of runners stranded, which should bounce back to normal, but I don't see as much regression here as other analysts. If he were still with the Yankees I would be betting the over on a 5.00 RA, but back in the NL and in a warmer climate I'd adjust down to a 4.50 RA as a prediction.
6. Bruce Chen (-50.65%)
The Swing% continued its five year downward trend, this time coupled with a big increase in FB% and a loss on GB%. Unsurprisingly, his strikeout rate has also dropped for five consecutive years, but now the home run rate is over 6% and the BABIP was .338 instead of under .300, where Chen normally resides. He is due for some regression yes, but he's not likely to ever be anything other than mediocre, even by AL pitching standards.
7. Matt Clement (-47.87%)
Another tale of Ball% and Swing%, Clement saw his Ball% rise back up to 2001 levels and his Swing% dip below 10% for the first time since 1998. As before, this equals a dropping strikeout rate (four points) and a rising walk rate (also four points). Unlike some of the others, Clement's home run rate didn't move and his BABIP, while higher than normal at .326, wasn't at an absurd level. Unless his stuff comes back, just a slightly better 2006 is what Clement is likely to be.
8. Oliver Perez (-41.97%)
Oliver Perez's entire game rests on how many bats he can miss. Nothing else really moves in his lines. When his Swing% is around 11% as it was 2002-04, then he can ride a 24% or higher strikeout rate to a strikeout-to-walk ratio around 2.00 and be a useful pitcher. When the Swing% is at 9% like in 2005-6, then his strikeout rate slips to 19% and he's doomed. His elevated BABIP level in 2006 is the only difference between last year and 2005. Unless he starts missing more bats or somehow finds control, 2005 is what Oliver Perez will be.
9. Joel Pineiro (-41.96%)
Nothing really stood out from Pineiro's line. He just got incrementally worse all across the board. He's not as bad as his 2006 ERA, but he'll likely never be better than 2005 as a starter unless he rediscovers the stuff that he had before his 2004 arm injury.
Overall, what you see is what makes Marcel such a good projection system in the aggregate. The players who far outpitched their career norms did so mainly due to luck, notably in BABIP. While those who went the other way were also hampered by luck, there were far more cases of legitimate collapses than legitimate improvements. Sixty-five qualifying pitchers performed at least 15% worse than their career ERA in 2006, while just 44 performed at least 15% better. There were many more starters performing worse, compared to many more relievers performing better. On average, the players who exceeded expectations are most likely going to regress back to their career norms, as there were few actual performance breakouts, and on average, the players who performed worse will either not sustain the bad luck they got in 2006 or, in the cases of actual deterioration, some will see more time out of the bullpen, which helps to improve their numbers.
References and Resources
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at http://www.retrosheet.org.
Matthew Carruth is an editor for The Hardball Times. He welcomes any and all sorts of communication at his email.