Why Oliver Loves Yu

WAR  ERA WHIP  W  L  IP   H HR  BB  SO HR/9 BB/9 SO/9
6.2 2.57 0.97 16  4 185 138  8  41 198  0.4  2.0  9.6

It looks like Yu broke Oliver. That’s Yu Darvish; Oliver is the engine of The Hardball Times Forecasts. It’s not the first time it’s happened, but when a player so dominates his non-major league competition that his derived major league true talent exceeds generally accepted norms, it offers an opportunity to examine the system and make some changes for the better.

Darvish’s performance against batters in Nippon Professional Baseball, the world’s second best professional league, is indeed mind-boggling: consistently low hits, home runs and walks, with more than a strikeout an inning.

Patrick Newman of npbtracker shows pitch type, velocity and usage rate for pitchers in that league. This past year, Darvish’s fastball sat at 94 to 95 mph, with a slider in the low 80s, and a high 80s change-up. He also mixes in a low 90s cut fastball, forkball, shuuto and slow curve.

Newman also pointed me to Pro Yakyu Nuru Data Okijyo from which I was able to get Darvish’s ground ball rates.

Year Age  ERA  W  L  IP   H HR  BB  SO   GB%
2007  20 1.82 15  5 208 123  9  49 210  59.9
2008  21 1.88 16  4 201 136 11  44 208  57.8
2009  22 1.73 15  5 182 118  9  45 167  59.2
2010  23 1.78 12  8 202 158  5  47 222  57.4
2011  24 1.44 18  6 232 156  5  36 276  60.0
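
The per-nine-inning rates implied by this table are simple arithmetic. The sketch below recomputes them from the rows above; note that the innings in the table are rounded to whole numbers, so the results are approximate, and the helper name `per_nine` is only illustrative.

```python
def per_nine(events: int, innings: float) -> float:
    """Rate of an event per nine innings pitched."""
    return 9.0 * events / innings


# (year, IP, HR, BB, SO) copied from the table above
seasons = [
    (2007, 208, 9, 49, 210),
    (2008, 201, 11, 44, 208),
    (2009, 182, 9, 45, 167),
    (2010, 202, 5, 47, 222),
    (2011, 232, 5, 36, 276),
]

# year -> (HR/9, BB/9, SO/9), each rounded to one decimal
rates = {
    year: tuple(round(per_nine(x, ip), 1) for x in (hr, bb, so))
    for year, ip, hr, bb, so in seasons
}

for year, (hr9, bb9, so9) in rates.items():
    print(f"{year}  HR/9 {hr9:.1f}  BB/9 {bb9:.1f}  SO/9 {so9:.1f}")
```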

Still, the question remains: how accurately can that performance be projected into a major league equivalent? The standard process is to find as many players as possible who have played in both leagues, comparing their performance, as a group, in both situations.

If, for example, starting pitchers might translate differently from relievers, players can be divided into different groups that better fit their role and profile, but at the risk of having the comparisons based on smaller, and thus less reliable, sample sizes.
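
One simple way to turn a group of matched pairs into a single translation factor is sketched below. It is not necessarily how Oliver does it: the function name, the input layout, and the choice to pool events across the whole group are all assumptions made for illustration. The factor is chosen so that over- and under-predictions cancel out across the group.

```python
from typing import Iterable, Tuple


def translation_factor(pairs: Iterable[Tuple[float, float, float]]) -> float:
    """Pooled translation factor for one rate stat (hypothetical helper).

    Each pair is (source_rate, dest_events, dest_pa):
      source_rate  - the pitcher's per-batter rate in the source league
      dest_events  - events he actually allowed in the destination league
      dest_pa      - batters he faced in the destination league

    Returns f such that sum(dest_events - f * source_rate * dest_pa) is zero.
    """
    expected = 0.0
    actual = 0.0
    for source_rate, dest_events, dest_pa in pairs:
        expected += source_rate * dest_pa  # events predicted with no translation
        actual += dest_events
    if expected == 0:
        raise ValueError("no expected events; factor is undefined")
    return actual / expected
```

Splitting the group (starters versus relievers, for instance) means calling the function on each subgroup separately, which is exactly where the smaller denominators make the factor less stable.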

Oliver’s Japanese translations are based on the performances of 260 pitchers who have performed on both sides of the Pacific from 1998 to 2011. Of these, 185 have been North American players who have gone to Japan, with 75 Japanese pitchers coming here, but only 28 of those 75 appearing in the major leagues. Since 1998, only five pitchers who were starters in Japan were given starting roles in the majors.

Oliver is rule-based. Given a supply of play-by-play and seasonal data, I write code that describes how different parts of the data relate to one another. If I believe Darvish’s translations are too strong, adjusting the code will also affect every other Japanese pitcher. Changes must be made in a way that balances the performances of all in the group. There did appear to be differences in whether the pitcher started his career in North America or Japan, and whether he was a starter or a reliever. After adjustments were made, Darvish’s projection hardly budged.

With a projected 2.57 ERA, give or take a few tenths, Oliver is putting Darvish ahead of every current major league starting pitcher. The Texas Rangers were willing to commit $111 million over the next six years to procure his services, but can he realistically be expected to outperform this projected list of 2012’s top 15 starting pitchers?

 ERA Name
2.75 Clayton Kershaw
2.79 Stephen Strasburg
2.88 Justin Verlander
2.97 Roy Halladay
3.05 Cliff Lee
3.05 Josh Johnson
3.15 Matt Cain
3.16 Jered Weaver
3.17 Felix Hernandez
3.25 Ian Kennedy
3.25 Mat Latos
3.25 Adam Wainwright
3.26 Cole Hamels
3.28 Tim Lincecum
3.33 Michael Pineda

Let’s look at how Oliver’s past projections for Japanese starting pitchers compare to their actual performances. I will note that the major league performance is a weighted mean of the player’s first three seasons in the majors, with the first season weighted at 1.0, the second 0.7 and the third 0.5. This is the reverse ordering of how past seasons are used to generate the projections. No minor league data are included. Also, the projected ERA is based on the expected wOBA allowed, while the major league ERA is the actual, and not park adjusted.
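
The 1.0 / 0.7 / 0.5 weighting can be sketched as follows. Applying the weights to both the event counts and the batters faced (rather than averaging the seasonal rates directly) is an assumption, as are the function and constant names.

```python
# Weights for a pitcher's first, second and third major league seasons,
# as described in the paragraph above.
MLB_WEIGHTS = (1.0, 0.7, 0.5)


def weighted_rate(seasons, weights=MLB_WEIGHTS):
    """Weighted per-batter rate over up to three seasons.

    seasons: list of (events, batters_faced) in chronological order.
    Pitchers with fewer than three seasons simply use the leading weights.
    """
    num = sum(w * events for w, (events, _) in zip(weights, seasons))
    den = sum(w * faced for w, (_, faced) in zip(weights, seasons))
    if den == 0:
        raise ValueError("no batters faced")
    return num / den
```

With this form, a short second or third season contributes little to the blended rate, both because of its weight and because of its small number of batters faced.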

Kei Igawa              Size  ERA   BH%   HR%   BB%   SO%
Projection             1788 3.89 0.297 0.046 0.072 0.218
MLB 1st 3 years         330 6.54 0.317 0.064 0.109 0.161

Igawa was signed by the Yankees in 2007 and was expected to provide an above-average number of strikeouts, although accompanied by a few extra home runs. Maybe the pressure of working for George Steinbrenner was too much; Igawa allowed far too many walks and long balls and lasted only 12 starts that year and one the next before returning to Japan.

Kaz Ishii              Size  ERA   BH%   HR%   BB%   SO%
Projection             1547 3.96 0.284 0.048 0.119 0.246
MLB 1st 3 years        1525 4.25 0.279 0.042 0.144 0.191

Ishii signed with the Dodgers in 2002, spending three years in their rotation. After one more with the Mets, he also returned to Japan. Wild in Japan, he walked even more here and also underperformed his projected strikeout rate, although the ERA projection was fairly close.

Kenshin Kawakami       Size  ERA   BH%   HR%   BB%   SO%
Projection             1381 3.50 0.284 0.044 0.046 0.205
MLB 1st 3 years         943 4.22 0.295 0.032 0.071 0.157

Kawakami joined the Braves in 2009 and had a respectable 3.86 ERA, but suffered through a 1-10, 5.15 year in 2010, then spent the entire 2011 season in the minors. He walked more and struck out fewer than projected (I’m beginning to notice a pattern).

Hiroki Kuroda          Size  ERA   BH%   HR%   BB%   SO%
Projection             1685 3.54 0.278 0.037 0.048 0.167
MLB 1st 3 Years        1520 3.65 0.283 0.025 0.045 0.170

Kuroda delivered four quality seasons from 2008 to 2011 for the Dodgers, almost exactly matching his projection, and just signed a one-year, $10 million deal with the Yankees.

Daisuke Matsuzaka      Size  ERA   BH%   HR%   BB%   SO%
Projection             1630 2.77 0.273 0.030 0.061 0.245
MLB 1st 3 years        1517 4.01 0.295 0.039 0.105 0.221

The Japanese import everyone loves to hate, Matsuzaka did have two solid seasons, in 2007 and 2008, for the Red Sox, but injuries have kept him sidelined and/or ineffective for the past three years. Showing fine control his last two years in Japan, he’s issued an above-average number of walks in the majors.

Hideki Irabu           Size  ERA   BH%   HR%   BB%   SO%
Projection             1658 3.19 0.281 0.028 0.100 0.258
MLB 1st 3 years        1125 4.94 0.283 0.058 0.085 0.187

Hideo Nomo             Size  ERA   BH%   HR%   BB%   SO%
Projection             1707 4.40 0.291 0.040 0.157 0.243
MLB 1st 3 years        1884 3.16 0.269 0.035 0.094 0.275

Colby Lewis            Size  ERA   BH%   HR%   BB%   SO%
Projection             1479 3.26 0.302 0.034 0.039 0.230
MLB 1st 3 years        1431 4.03 0.273 0.046 0.072 0.220

I looked at three more pitchers – Hideo Nomo and Hideki Irabu from the 1990s, and Colby Lewis, who after never experiencing any success in the majors spent 2008 and 2009 in Japan before returning the past two years with the Rangers.

Irabu issued fewer walks but also fewer strikeouts than expected, and couldn’t avoid the long ball. Nomo was very wild in Japan but pitched much better than expected in the major leagues. Lewis’ strikeout rates were as expected, but his walks jumped up.

Hisanori Takahashi     Size  ERA   BH%   HR%   BB%   SO%
Projection             1355 4.27 0.292 0.047 0.066 0.175
MLB 1st 3 Years         713 3.60 0.294 0.037 0.068 0.215

Ken Takahashi          Size  ERA   BH%   HR%   BB%   SO%
Projection              940 5.28 0.293 0.052 0.088 0.133
MLB 1st 3 Years         116 2.96 0.280 0.026 0.113 0.200

Koji Uehara            Size  ERA   BH%   HR%   BB%   SO%
Projection              872 3.65 0.290 0.050 0.037 0.201
MLB 1st 3 years         522 3.34 0.282 0.043 0.036 0.248

Keiichi Yabu           Size  ERA   BH%   HR%   BB%   SO%
Projection             1030 4.30 0.284 0.041 0.076 0.149
MLB 1st 3 years         262 4.50 0.330 0.033 0.089 0.170

These last four were all primarily starting pitchers in Japan, but did most or all of their major league pitching out of the bullpen. All showed better-than-expected strikeout rates, with Uehara almost doubling his rate after the Orioles removed him from the rotation.

It is known that on average pitchers perform better out of the bullpen. Tango calls it his rule of 15: Home runs and walks down 15 percent, strikeouts up 15 percent. I believe I can improve the Japanese translation factors by adjusting the stats as starters and relievers to the same baseline before compiling sets of matched pairs. Where I have play-by-play data from Gameday I am able to tabulate how each pitcher has performed as a starter and as a reliever, which then needs to be regressed to the standard splits. However, the available seasonal level stats from Japan do not offer this breakdown. The number of innings pitched as a starter and reliever can be estimated, but the Japanese leagues have not published games started for the past three seasons.
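
Putting starter and reliever lines on the same baseline could look something like the sketch below. The 15 percent figures are the ones quoted in the paragraph above and are treated as adjustable parameters; the dictionary keys and function names are illustrative, and stats not listed in the effect table (such as BH%) pass through unchanged.

```python
# Relative change in each rate when a pitcher works in relief,
# using the figures quoted above. Treat these as tunable assumptions.
RELIEF_EFFECT = {"HR%": -0.15, "BB%": -0.15, "SO%": +0.15}


def reliever_to_starter(rates, effect=RELIEF_EFFECT):
    """Convert rates recorded in relief to a starter-equivalent baseline."""
    return {k: v / (1.0 + effect.get(k, 0.0)) for k, v in rates.items()}


def starter_to_reliever(rates, effect=RELIEF_EFFECT):
    """Convert rates recorded as a starter to a reliever-equivalent baseline."""
    return {k: v * (1.0 + effect.get(k, 0.0)) for k, v in rates.items()}
```

Under these factors, a strikeout rate of 0.230 posted out of the bullpen would count as roughly 0.200 on a starter's baseline before the matched pairs are compiled.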

The records for the eight starting pitchers above suggest that the translation factors currently being used by Oliver are too generous: as a group, the observed major league performances of the eight compared to their projections were 0.99 for base hits (BABIP), 1.11 for home runs, 1.24 for walks and 0.91 for strikeouts. But how much more should we trust the record of eight starting pitchers in the majors compared to the 75 Japanese pitchers who have pitched in the minors and majors over the past 13 seasons? How different should we expect them to be from the 185 pitchers who have left here for Japan?

Yu Darvish             Size  ERA   BH%   HR%   BB%   SO%
Projection             1799 2.57 0.280 0.019 0.058 0.272
Adjusted                         0.278 0.021 0.071 0.248

The first line is Darvish’s current Oliver projection, while the second shows the rate stats adjusted for those eight starters (still very good).
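
The adjusted line appears to be the projection multiplied by the group ratios quoted earlier (0.99, 1.11, 1.24 and 0.91). The sketch below applies them directly; because those ratios are rounded to two decimals, the results agree with the Adjusted line only to within about 0.001.

```python
# Darvish's Oliver projection, from the table above
projection = {"BH%": 0.280, "HR%": 0.019, "BB%": 0.058, "SO%": 0.272}

# Observed / projected ratios for the eight starters, as quoted in the text
observed_over_projected = {"BH%": 0.99, "HR%": 1.11, "BB%": 1.24, "SO%": 0.91}

adjusted = {
    stat: round(rate * observed_over_projected[stat], 3)
    for stat, rate in projection.items()
}

for stat, rate in adjusted.items():
    print(f"{stat} {rate:.3f}")
```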

These are Darvish’s top comparables using his current projection. Their composite ERA is higher than 2.57, but the top five still put him right at the top with Kershaw and Strasburg, while a larger sample of comps still rates high enough to rank him fifth or sixth in the major leagues.

Rank Name              Season  ERA   BH%   HR%   BB%   SO%
  1  Martinez, Pedro     2004 2.55 0.288 0.020 0.056 0.285
  2  Verlander, Justin   2012 2.87 0.281 0.033 0.064 0.263
  3  Johnson, Randy      2005 2.96 0.290 0.034 0.054 0.272
  4  Santana, Johan      2007 2.78 0.274 0.039 0.056 0.269
  5  Kershaw, Clayton    2012 2.75 0.284 0.024 0.078 0.274
  6  Prior, Mark         2003 3.19 0.302 0.032 0.073 0.278
  7  Schmidt, Jason      2004 2.97 0.283 0.028 0.074 0.247
  8  Peavy, Jake         2008 3.47 0.304 0.034 0.063 0.254
  9  Greinke, Zack       2010 3.20 0.307 0.029 0.058 0.253
 10  Lincecum, Tim       2012 3.27 0.300 0.030 0.084 0.268
 11  Schilling, Curt     2005 3.02 0.292 0.039 0.042 0.248
 12  Matsuzaka, Daisuke  2008 3.29 0.283 0.038 0.072 0.243
 13  Hamels, Cole        2008 3.52 0.290 0.043 0.070 0.246
 14  Bedard, Erik        2008 3.39 0.303 0.031 0.079 0.250

                        Top 5 2.78 0.283 0.030 0.062 0.273
                       Top 10 3.00 0.291 0.030 0.066 0.266
                          All 3.09 0.292 0.033 0.066 0.261
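
The composite rows are consistent with a plain, unweighted average of the listed comps. As a check, this sketch averages the first five rows of the table above and arrives at the Top 5 line after rounding.

```python
from statistics import mean

# (name, season, ERA, BH%, HR%, BB%, SO%): first five rows of the table above
top5 = [
    ("Martinez, Pedro", 2004, 2.55, 0.288, 0.020, 0.056, 0.285),
    ("Verlander, Justin", 2012, 2.87, 0.281, 0.033, 0.064, 0.263),
    ("Johnson, Randy", 2005, 2.96, 0.290, 0.034, 0.054, 0.272),
    ("Santana, Johan", 2007, 2.78, 0.274, 0.039, 0.056, 0.269),
    ("Kershaw, Clayton", 2012, 2.75, 0.284, 0.024, 0.078, 0.274),
]

# Average each stat column across the five pitchers
columns = zip(*(row[2:] for row in top5))
era, bh, hr, bb, so = (mean(col) for col in columns)

print(f"Top 5 {era:.2f} {bh:.3f} {hr:.3f} {bb:.3f} {so:.3f}")
```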

Now the same exercise using the adjusted projection: the composite ERA of the top five comps again puts Darvish fifth or sixth, while the larger list drops him closer to 15th.

Rank Name              Season  ERA   BH%   HR%   BB%   SO%
  1  Schmidt, Jason      2004 2.97 0.283 0.028 0.074 0.247
  2  Martinez, Pedro     2006 3.01 0.281 0.032 0.059 0.243
  3  Matsuzaka, Daisuke  2008 3.29 0.283 0.038 0.072 0.243
  4  Verlander, Justin   2012 2.87 0.281 0.033 0.064 0.263
  5  Latos, Mat          2012 3.25 0.290 0.032 0.069 0.234
  6  Hanson, Tommy       2012 3.43 0.285 0.036 0.072 0.233
  7  Peavy, Jake         2010 3.40 0.298 0.034 0.076 0.243
  8  Lester, Jon         2011 3.34 0.298 0.029 0.083 0.241
  9  Hamels, Cole        2008 3.52 0.290 0.043 0.070 0.246
 10  Kennedy, Ian        2012 3.24 0.277 0.036 0.071 0.226
 11  Jimenez, Ubaldo     2012 3.49 0.295 0.024 0.091 0.240
 12  Scherzer, Max       2011 3.59 0.296 0.038 0.084 0.249
 13  Kershaw, Clayton    2012 2.75 0.284 0.024 0.078 0.274
 14  Bedard, Erik        2009 3.57 0.296 0.036 0.083 0.237
 15  Beckett, Josh       2005 3.50 0.303 0.031 0.077 0.238
 16  Santana, Johan      2009 3.37 0.286 0.043 0.059 0.235

                        Top 5 3.08 0.284 0.032 0.068 0.246
                       Top 10 3.23 0.287 0.034 0.071 0.242
                          All 3.29 0.289 0.033 0.074 0.243

For the final set of comparable projections, I used a defense independent approach, using only groundball, walk and strikeout rate. Assuming that major league baseball has a slightly lower rate of ground balls than the Nippon league, I found Darvish’s top comps using a ground ball rate of 0.55, a walk rate of 0.071, and a strikeout rate of 0.248. There’s no difference between the different sized groups, each with a composite ERA out of major league baseball’s top 15, but much of the ERA difference between this and the previous sets of comps is in the home run rate, almost 50 percent higher here than in Oliver’s projection.

Rank Name              Season  ERA  GB%   BH%   HR%   BB%   SO%
  1  Liriano, Francisco  2007 3.58 0.53 0.304 0.037 0.087 0.254
  2  Hernandez, Felix    2011 3.16 0.54 0.287 0.026 0.071 0.219
  3  Burnett, A.J.       2008 3.81 0.55 0.295 0.037 0.082 0.217
  4  Jimenez, Ubaldo     2011 3.18 0.52 0.284 0.020 0.097 0.240
  5  Lester, Jon         2011 3.34 0.51 0.298 0.029 0.083 0.241
  6  Wainwright, Adam    2011 3.12 0.51 0.295 0.028 0.061 0.226
  7  Garcia, Jaime       2012 3.64 0.54 0.310 0.027 0.069 0.201
  8  Carpenter, Chris    2006 3.27 0.54 0.292 0.031 0.052 0.205
  9  Zambrano, Carlos    2006 3.23 0.51 0.276 0.023 0.088 0.215
 10  Chacin, Jhoulys     2012 3.61 0.52 0.271 0.033 0.105 0.213
 11  Halladay, Roy       2012 2.96 0.52 0.305 0.024 0.034 0.216
 12  Wilson, C.J.        2012 3.47 0.51 0.290 0.024 0.089 0.212

                        Top 5 3.41 0.53 0.294 0.030 0.084 0.234
                       Top 10 3.39 0.53 0.291 0.029 0.079 0.223
                          All 3.36 0.53 0.292 0.028 0.076 0.221
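
A comparables search of this kind can be sketched as a nearest-neighbor ranking on ground ball, walk and strikeout rates. This is only one plausible approach, not Oliver's actual similarity method: the scaled Euclidean distance and the `SCALES` values are placeholder assumptions, so the ordering it produces should not be expected to reproduce the table above.

```python
import math

# Target profile from the text: (GB%, BB%, SO%)
TARGET = (0.55, 0.071, 0.248)

# Hypothetical spread of each stat, used to put the three rates on a
# comparable footing. These values are placeholders, not fitted numbers.
SCALES = (0.05, 0.02, 0.04)


def distance(target, candidate, scales=SCALES):
    """Scaled Euclidean distance between two (GB%, BB%, SO%) profiles."""
    return math.sqrt(
        sum(((t - c) / s) ** 2 for t, c, s in zip(target, candidate, scales))
    )


def top_comps(target, pool, n=5, scales=SCALES):
    """Return the n entries of pool closest to target.

    pool: list of (label, (GB%, BB%, SO%)).
    """
    ranked = sorted(pool, key=lambda row: distance(target, row[1], scales))
    return ranked[:n]


# Three rows taken from the table above
pool = [
    ("Liriano 2007", (0.53, 0.087, 0.254)),
    ("Hernandez 2011", (0.54, 0.071, 0.219)),
    ("Halladay 2012", (0.52, 0.034, 0.216)),
]

for label, profile in top_comps(TARGET, pool, n=3):
    print(label, round(distance(TARGET, profile), 3))
```

How the three rates are scaled against one another decides which pitchers come out on top, which is one reason different comp lists can tell different stories about the same projection.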

Yu Darvish is clearly a very talented pitcher, enough that the Texas Rangers were willing to put $51 million down and $60 million over the next six years to have him in their starting rotation. Just how well his future major league performances can be projected is a work of art, with different available methods where even small changes in estimated base hits allowed can vary the ERA estimate by a few tenths. Oliver has had a good record so far, such as with Stephen Strasburg and Ian Kennedy. However, players have some amount of natural variance each year as well as changes in their true talent.

Examining several sets of comparable pitchers shows an expected ERA for Darvish anywhere from 2.78 to 3.40, which is from excellent down to merely very good, but no recent major league pitchers have the combination of Darvish’s expected home runs, walks and strikeouts. Looking at those comparables and Darvish’s pitch metrics gives me a personal opinion: I would compare him to Felix Hernandez with more strikeouts or Ubaldo Jimenez with fewer walks.

Meanwhile, as these customized estimates all gave a higher ERA projection than Oliver, I’ll retreat to my office, where the first things on the drawing board are incorporating ground ball rates to give regression means for base hit and home run rates, and separately considering pitching as a starter and as a reliever.


Comments

  1. Kyle Boddy said...

    Great article, Brian! However, Kei did not return to Japan. He still pitched in the Yankees org and is now a FA looking for work. He reportedly does not want to go back to the NPB and wants to play MLB.

  2. Rob H. said...

    Yeah, it’s the rule of 17 and the walk rate doesn’t change – only BABIP, K/PA, and HR/PA.

    Really interesting article. Looking forward to the Oliver updates.

  3. Brian Cartwright said...

    Sorry, should have googled for Tango’s rule. 15 was in my head because that’s what my own research found, but I didn’t find as much effect on BABIP, which was -4% for relievers, HR -15%, BB +2%, SO +14%

  4. George Tummolo said...

    The problem with Darvish is not his physical skills, but between his ears. One has to live in Japan to see just how entitled the players feel about themselves. Darvish is nowhere near Matsuzaka’s arrogance level, and he seems to be much smarter, but that really isn’t saying that much. He also gets away with a lot of mistake pitches that MLB players should tee off on. If Darvish really takes coaching well, he could be great, sure. The Rangers are a really good fit for him, especially with Nolan Ryan around.

  5. Brian Cartwright said...

    If Darvish’s rate of getting away with mistake pitches is the same as other pitchers in Japan, then it’s part of the translation factors already. The problem in projecting is when a pitcher does something like that consistently differently than others.

  6. Greg Simons said...

    It would be very impressive if Darvish can produce a 0.4 HR/9 in Arlington.  His season ought to be fascinating to follow.

    I love seeing that my NL-only fantasy team contains four of the top 15 projected MLB ERA leaders (Strasburg, J. Johnson, Latos and Wainwright), all for a total of less than $20.  I sure hope those numbers come to pass.

  7. Brian Oakchunas said...

    Brian,

    Thanks for the great article and groundball info which has been impossible to find.

    Here is an issue I see with the numbers, though. Your BB% and SO% for the Darvish projection in the middle of the page seem to be based on a different number of total batters faced when compared against the projection at the top of the page.

    In other words you have him listed at 198 K’s and a 9.6 K/9, which would suggest around 729 batters faced to come up with a SO% of .272. However, the 41 BBs and 2.0 BB/9 rate suggest that it is based on a total batter count of about 708 batters to get the BB% of .058. Is this just an odd quirk of Oliver in the way that the percentages are calculated or are they supposed to be based on the same number of total batters and therefore incorrect?

  8. Brian Cartwright said...

    The projected BB% and SO% will differ from the raw original data, as they’ve been adjusted for park and league. I then compared the translated data to actual MLB performance to check the accuracy of the translation.

    Darvish’s unadjusted rate stats

    Year HR/BC BB/PA SO/PA HR/9 BB/9 SO/9
    2007  .017  .061  .266  0.4  2.1  9.1
    2008  .022  .058  .272  0.5  2.0  9.3
    2009  .019  .064  .238  0.4  2.2  8.3
    2010  .010  .058  .276  0.2  2.1  9.9
    2011  .009  .041  .312  0.2  1.4 10.7

    NPB adopted a new ball standard in 2011, which dropped the HR% to 64% of its previous level, so the .009 HR/BC in 2011 is equivalent to .014 in the other seasons.

    When I do all the projections, I have a set of 260 pitchers who have pitched in Japan and the US from 1998-2011. The factors are calculated so that the average error is zero, where all the plus errors cancel out the minus errors. Regression helps reduce the total error (does not care whether high or low) by bringing all the projections closer to the center, thus reducing the outliers.
