It’s all but final: Randy Johnson is headed back to Arizona after two disappointing years in the Big Apple. It’s not that Johnson pitched poorly; he won 17 games each year, and while his 5.00 ERA in 2006 was bad, it came in a league with an average ERA of 4.56. He was also pitching in the toughest division in baseball for a pitcher, as a lefthander working in front of a third baseman who struggled defensively and a shortstop whose range has always been maligned by baseball’s number crunchers.

If Johnson were a 23-year-old rookie in 2005, the Yankees would be more than pleased with his efforts, even with the 2006 ERA, but expectations for Johnson were understandably higher. He pitched six years in Arizona, won four straight Cy Young awards, suffered through an injury-shortened season, and finished second in the Cy Young race in a year that just might have been the best pitching of his career. The Yankees expected more after trading for him and giving him a two-year, $32 million extension.

He wasn’t a Cy Young contender either year, and he failed miserably in the playoffs both years as the Yankees failed to advance past the first round. The Yankees have been frustrated with not winning the World Series ever since Luis Gonzalez blooped a single off Mariano Rivera. The frustration swelled after the 2004 choke against the hated Red Sox. Johnson was supposed to ease this frustration. He didn’t; instead he just added to it.

Johnson’s PECOTA projection for the Yankees is a 3.52 ERA, according to Nate Silver in this post. Dan Szymborski’s ZIPS has Johnson at 3.63 after moving back to Arizona. My own projection system, CHONE, has Johnson at 3.75. Marcel projects Johnson to have a 4.33 ERA. Baseball Info Solutions (Bill James Handbook) projects a 3.98 ERA. Other than Marcel, those are very optimistic projections for a pitcher coming off a 5.00 ERA who will turn 44 before the 2007 season ends. So what is going on here?

Thanks to sites like Retrosheet and Baseball Reference, we can find that Johnson, pitching from a full windup with no one on base, held hitters to this line: .206/.271/.324. With runners in scoring position, they hit .348/.399/.608 against him. That’s the difference between facing Cristian Guzman and facing Albert Pujols. As a result, Johnson allowed many more runs than you would expect given his hits allowed, walks, strikeouts, and home runs.

I use a Pitcher Baseruns formula (see below) to predict runs allowed and ERA for my pitcher projections. Looking at all pitchers with 50 or more innings pitched in a season from 1957 to 2006, Johnson’s 2006 season was the worst in terms of underperforming baseruns in the last 50 years. He “should” have allowed 98 runs; he actually allowed 125.

What the projection systems listed above, other than Marcel, assume is that when pitchers outperform or underperform their predicted runs allowed, it is a fluke and is not repeatable. There is certainly an element of luck here, but there are other reasons why a predicted ERA may miss for some players. Here are a few:

- Knuckleballers like Charlie Hough allow a ton of passed balls, which are not included in the formula, but they allow baserunners to advance and score.
- Some pitchers have better defensive support than others. Defensive range, the ability to turn would-be hits into outs, should already be accounted for, since it is reflected in the pitcher’s actual hit total, which we are using in the calculation. A defense that is especially error prone would not be accounted for by this formula, and will hurt the pitcher. A defense that is especially efficient in turning double plays, or outfielders with strong arms who prevent runners from advancing, would also be a source of error in the measure.
- Catchers who can stop the running game, or catchers on whom runners steal with abandon would be another source of error. The pitcher himself has great control over the running game, and some pitchers help themselves by picking a few runners off every year as well.
- A formula that has hits allowed, but not doubles and triples allowed, may be off a bit. It will underrate groundball pitchers and overrate flyball pitchers, but the effect is small. Not counting home runs, pitchers generally allow 1.27 bases per hit. For extreme groundball pitchers it’s 1.25 bases, and for extreme flyball pitchers it’s 1.29.
- Some pitchers may have a better ability than others to pitch to a situation.
- Some pitchers might be hurt more by pitching from the stretch as opposed to pitching from a windup. It is possible that an old pitcher with a back injury like Johnson is hurt by this problem. If this is the case, his offseason back surgery may correct the problem.
- The formula may not work very well for an extremely wild pitcher like Nolan Ryan. Ryan consistently allowed more runners to score than my formula predicts for him. The baseruns formula gives a small base advancement weight to the walk, much smaller than to the hit by pitch, the reason being that walks tend to occur in non-random situations, more often with a base open, when they hurt you less. When Ryan was walking 200 batters per year, his walks may have had higher advancement value, since the odds are he had already walked the last batter, who now moves to second base.
- There are also, to paraphrase Donald Rumsfeld, the unknown unknowns. Things that affect how many runs a pitcher allows that we haven’t even thought of.
- And of course, sometimes it really is just dumb luck.

The question I ask: is Johnson’s allowing more runs to score than his stats predict an ability (or perhaps a disability)? I first looked at the 25 pitchers who allowed the most runs above expected runs over the last 50 years. They ranged from 21 runs to 27 runs above expected. Of this group, four did not pitch at least 50 innings in the following year, and two more made the list for their 2006 performance (Johnson and Taylor Buchholz), leaving 19 pitchers. In year one, these pitchers averaged 23 runs over expected per 200 innings, and in year two they averaged 2.1 runs above expected, suggesting that at 200 innings this figure should be regressed 90% to the mean.

I then looked at all pitchers with 50 innings in back-to-back years for the sample. The sample increases from 19 to 8,419. I gave each pitcher equal weight and looked at runs over/under per nine innings. The average innings pitched for this group is 143, and the correlation coefficient is .076, which is significant at the 1% level.

We can use this formula to estimate how much this ability will persist:

X = IP / (IP + 1800)

Looking at the sample of the worst 19 underachievers, average innings were 200, and X = 0.10. Looking at the sample of 8,419 pitchers, average innings = 143, and X = 0.074, almost exactly the correlation coefficient from that sample.
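As a quick check on those two values of X, here is a minimal Python sketch. The function name `persistence_factor` and the parameter name `ballast` are labels invented for illustration; only the 1,800-inning constant and the sample averages come from the text.

```python
def persistence_factor(innings, ballast=1800.0):
    """Share of a runs-over-baseruns gap expected to persist: IP / (IP + 1800).

    `ballast` is a hypothetical name for the 1,800-inning constant in the text.
    """
    if innings < 0:
        raise ValueError("innings must be non-negative")
    return innings / (innings + ballast)


# Retention actually observed in the 19-pitcher sample:
# 23 runs over expected in year one, 2.1 in year two (per 200 innings).
observed_retention = 2.1 / 23

for ip in (200, 143):
    print(f"{ip} IP -> X = {persistence_factor(ip):.3f}")
print(f"observed retention, 19 underachievers: {observed_retention:.3f}")
```

At 200 innings the formula gives 0.10, close to the roughly 0.09 retention seen in the 19-pitcher sample, and at 143 innings it gives about 0.074.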

Here’s what this means for Johnson: Using his 2006 numbers only, with 205 innings and +27 runs, we can expect him to allow 2.7 more runs than his statistics predict. Using Johnson’s career numbers, he has 3,798 innings and has allowed 65 more runs than expected. Plugging those figures into the formula and scaling to 200 innings, we get 2.3 runs.
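The arithmetic behind those two estimates can be sketched as follows. The sketch scales the result to a 200-inning season, which is the basis that reproduces both figures; treat that scaling, along with the function and parameter names, as assumptions made for illustration.

```python
def expected_excess_runs(excess_runs, innings, target_innings=200.0, ballast=1800.0):
    """Regress observed runs above baseruns toward zero.

    excess_runs    -- runs allowed minus baseruns over `innings`
    target_innings -- workload the estimate is scaled to (assumed: 200)
    ballast        -- hypothetical name for the 1,800-inning constant
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    rate = excess_runs / innings        # observed excess runs per inning
    x = innings / (innings + ballast)   # share expected to persist
    return rate * x * target_innings


season_2006 = expected_excess_runs(27, 205)   # 2006 only
career = expected_excess_runs(65, 3798)       # career totals

print(f"2006 only: {season_2006:.1f} runs per 200 IP")
print(f"career:    {career:.1f} runs per 200 IP")
```

The two inputs differ greatly in sample size, yet both land between two and three runs, which is why the estimate is fairly insensitive to which one you prefer.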

There are reasons why Johnson may fail in 2007. His injuries could be more serious than we expect. He could suddenly feel his age and lose his ability to pitch, like David Cone in 2000. If he’s able to handle the starting workload, strike out batters, limit his walks, and not get hit too hard, we should not expect him to allow 20 runs above his baseruns again. Instead, just take your favorite projection system and add two to three runs to it.

This trade seems to be a good one for all parties concerned. Johnson did not seem happy in New York, and certainly was not loved by the Yankee fans. He was traded two years ago because he did not want to play for a rebuilding team and wanted to be a Yankee. Now he seems to have changed his mind. It’s a good thing for him that the Diamondbacks actually want him back. With Cy Young winner Brandon Webb and solid innings eaters like Livan Hernandez and Doug Davis, plus a fifth starter to be named but probably a 24-year-old with a name like E. Gonzalez, the Diamondbacks have a strong starting rotation to go with a lineup full of young talent. If Johnson pitches to his projections, this team could easily contend in a somewhat weak division.

The Yankees, for their part, save money, at least for now, and continue to rebuild their farm system. While Nate Silver correctly points out that the Yankees may have traded not their #5 starter, but their #1, the Yankees still might come out ahead on this deal. They could still use the savings to sign a comparable pitcher, for example by scanning the remnants of the free agent market for mid-40s future Hall of Fame pitchers.

Obi-Wan Steinbrenner made the decision to pin the Yankees’ hopes on Johnson two years ago. As he watches Randy leave, he is heard saying that boy was their last hope. But Yoda Cashman responds: “No. There is another.”

**References & Resources**

Pitcher Baseruns: Baseruns are the creation of David Smyth and explained on Tango Tiger’s website. The formula I used differs slightly:

A = Hits + BB + HBP – HR

B = Hits * 1.214 + HR * .729 + BB * .057 + HBP * .182 + Balks * 1.05 + WP * 1.17 – K * .046

C = Innings * 3

And baseruns = A * B/(B+C) + HR