March 20, 2010

Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

Workload and Durability (Part 1)

by Robert Dudek
November 30, 2004

Much has been written on the subject of pitch counts. In some quarters, the notion that high pitch counts are dangerous to a pitcher's health is an article of faith; the idea makes intuitive sense. There is just one problem -- a lack of evidence in its favor.

Earlier this year, Rob Neyer and Bill James published the exceptional Neyer/James Guide to Pitchers -- an encyclopedia of the pitching repertoire of nearly every significant pitcher in Major League and Negro League history. In one essay, titled "Abuse and Durability" (pp. 449-463), James runs a series of matched-pair studies, identifying the most similar non-abused pitchers to pitchers listed as "abused" in various editions of Baseball Prospectus based on the Pitcher Abuse Points (PAP) system devised by Rany Jazayerli and Keith Woolner. The results skew in one direction: the "abused" pitchers keep more of their value (on average) than comparable "non-abused" pitchers. That's right -- keep their value.

James concludes his essay speculating about what is behind the phenomenon:

Most injuries to pitchers are not the result of chronic overuse; some are, particularly to young pitchers, but most are not. They're catastrophic events, just like a heart attack or a torn muscle. They happen suddenly, and they happen when a pitcher goes outside the envelope of his previous conditioning.

Backing away from the pitcher's limits too far doesn't make a pitcher less vulnerable; it makes him more vulnerable. And pushing the envelope, while it may lead to a catastrophic event, is more likely to enhance the pitcher's durability than to destroy it.

And yet, questions linger. James himself notes that since power pitchers last longer (and tend to throw more pitches per inning) than finesse types, controlling for quality of pitcher isn't sufficient to isolate the effect of high pitch counts. In addressing the issue of pitch count we must be sensitive to differences of pitcher type.

The quality of a matched-pair study depends on how similar your comparison groups are in all respects save for the one under study. On the other hand, pegging the similarity standard too high may lead to too few matches to tell us anything useful. A balance must be struck between sample size and degree of similarity.

Matched-Pair Workload Study #1

Starting with a large pool of players from which to match leads to more good matches. To that end, I settled on a pool of starting pitchers born after 1945 and before 1970. This 24-year period encompasses the baby boom and immediate post-boom generations. All but a handful of pitchers born before 1970 are either retired or no longer starting in the majors, so we don't need to worry very much about incomplete data.

To start we need to define heavy and moderate workloads for starting pitchers. A heavy workload was defined as exceeding 3,800 estimated pitches(1) in a given year; 3,000 to 3,600 estimated pitches was defined as a moderate workload. Because of the power of the pitch count and the pervasiveness of the five-man rotation, very few pitchers have exceeded 3,800 pitches in recent years (starting 34 times, a pitcher would need to average almost 112 pitches a start).
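As a rough illustration of these cutoffs, here is a minimal Python sketch. The function names are hypothetical, and treating the 3,000 and 3,600 endpoints as inclusive is an assumption; the text does not say how boundary cases were handled.

```python
HEAVY_THRESHOLD = 3800          # a heavy season exceeds this many estimated pitches
MODERATE_RANGE = (3000, 3600)   # endpoints assumed inclusive (not stated in the text)


def classify_workload(estimated_pitches):
    """Label one season's estimated pitch total (hypothetical helper)."""
    if estimated_pitches > HEAVY_THRESHOLD:
        return "heavy"
    low, high = MODERATE_RANGE
    if low <= estimated_pitches <= high:
        return "moderate"
    # Seasons below 3,000 or between the two bands get neither label.
    return "other"


def pitches_per_start(season_pitches, starts):
    """Average pitches per start needed to reach a season total."""
    return season_pitches / starts


print(classify_workload(3900))                    # heavy
print(round(pitches_per_start(3800, 34), 1))      # 111.8
```

The second call reproduces the arithmetic in the paragraph above: 3,800 pitches spread over 34 starts comes to just under 112 per start.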

Group A pitchers were those who had at least one heavy workload season before age 28. Group B pitchers were those who never exceeded 3,600 estimated pitches in a year before age 28. Matches were based on highest similarity score, using single season to single season comparisons, and taking into account the following characteristics:

(1) Strikeouts per Opportunity [K/(BF-IW)]
(2) Non-Intentional Walks per Opportunity [(W-IW)/(BF-HBP-IW)]
(3) Earned Run Average [ER/IP*9]
(4) Year of Birth
(5) Age on July 1st(2)
(6) The matched pitchers must throw with the same hand
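The three rate statistics in the list can be written out directly from the bracketed formulas. This is only a sketch: the function and argument names are hypothetical, and innings are assumed to be supplied as a decimal number (so 216 2/3 innings is passed as roughly 216.67).

```python
def strikeout_rate(k, bf, iw):
    """Strikeouts per opportunity: K / (BF - IW)."""
    return k / (bf - iw)


def walk_rate(w, iw, bf, hbp):
    """Non-intentional walks per opportunity: (W - IW) / (BF - HBP - IW)."""
    return (w - iw) / (bf - hbp - iw)


def earned_run_average(er, ip):
    """Earned run average: ER / IP * 9, with IP as decimal innings."""
    return er / ip * 9


# Made-up season line, for illustration only.
print(round(strikeout_rate(k=150, bf=1000, iw=0), 3))
print(round(walk_rate(w=80, iw=5, bf=1000, hbp=5), 3))
print(round(earned_run_average(er=100, ip=250.0), 2))
```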

Here's a hypothetical example of how similarity scores work in this study. Imagine two pitchers with identical ERAs, strikeout rates and walk rates. These pitchers are the same age (to the day) and are born in the same year. The Group A pitcher, however, throws 750 estimated pitches more than the Group B pitcher. The method considers this a perfect match -- earning 1,000 points. In actual cases, the differences in each category result in points deducted from 1,000; the higher the final similarity score, the greater the (statistical) similarity between the two pitchers.

The final requirement was that no Group B pitcher could be matched with more than one Group A pitcher; the match with the higher similarity score was given priority. Each matched season was designated Year Zero for that particular pitcher. A more detailed description of the comparison method(3) can be found in the footnotes.
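One way to picture the one-to-one requirement is a greedy pass over all candidate pairs, best score first. The sketch below is an assumption about the mechanics, not a description of the procedure actually used: the text does not say what happened to a Group A pitcher who lost his best match to a more similar rival, and here he simply falls through to his next-best available Group B pitcher. The function name and input layout are hypothetical, and the caller is assumed to have scored only same-handed pairs.

```python
def greedy_match(scores):
    """Pair Group A and Group B seasons one-to-one, highest score first.

    scores: dict mapping (group_a_name, group_b_name) -> similarity score.
    Returns a list of (group_a_name, group_b_name, score) tuples.
    """
    matched_a, matched_b, pairs = set(), set(), []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for (a, b), score in ranked:
        # Skip any candidate pair whose pitcher is already spoken for.
        if a in matched_a or b in matched_b:
            continue
        pairs.append((a, b, score))
        matched_a.add(a)
        matched_b.add(b)
    return pairs


# Made-up scores: both A pitchers like B1 best, but A1 is the closer match.
example = {
    ("A1", "B1"): 970, ("A2", "B1"): 960,
    ("A2", "B2"): 940, ("A1", "B2"): 930,
}
print(greedy_match(example))
```

In the example, A1 keeps B1 because 970 beats 960, and A2 is paired with B2 instead.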

Quality Control

Before we turn to the matched pairs, let's consider what James calls "quality leakage." James noted that in matched pair studies, there is a tendency for very good pitchers to be matched with lesser pitchers because the former are usually unique. James' solution was to select pitchers for his "Group B" that were of slightly higher quality (more Win Shares) than his "Group A" pitchers so as to offset the leakage. I took a different approach: I disposed of the worst third (according to similarity score) of the matched pairs.

Of the 69 matched pairs, the 23 least similar pairs were removed from consideration. I believe this is sufficient to alleviate the worst effects of the quality leakage problem, while maintaining a sufficiently large sample. To illustrate, the worst "match" among the original 69 pairs was Nolan Ryan/David Cone. Ryan is nearly a generation older than Cone and walked and struck out batters at a greater rate as a young pitcher. Because they are so dissimilar, there is no reason to think that the Ryan/Cone match tells us anything about durability.
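The trimming step is simple enough to sketch. The function name and the (Group A, Group B, score) tuple layout are hypothetical; the only thing taken from the study is the rule of discarding the least similar third.

```python
def trim_least_similar(pairs):
    """Split matched pairs into (kept, dropped), dropping the worst third.

    pairs: list of (group_a_name, group_b_name, similarity_score) tuples.
    """
    ranked = sorted(pairs, key=lambda pair: pair[2], reverse=True)
    n_drop = len(ranked) // 3            # 69 pairs -> 23 dropped
    n_keep = len(ranked) - n_drop
    return ranked[:n_keep], ranked[n_keep:]


# Placeholder pairs with made-up scores, just to check the counts.
dummy = [("A%d" % i, "B%d" % i, 900 + i) for i in range(69)]
kept, dropped = trim_least_similar(dummy)
print(len(kept), len(dropped))   # 46 23
```

With 69 pairs this keeps 46 and sets aside 23, which is where the 92 subjects of the study come from.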


Unmatched Group A pitchers
Vida Blue ('71)       Ted Higuera ('86)      John Montefusco ('75)
Bert Blyleven ('73)   Catfish Hunter ('72)   Mike Mussina ('96)
Jim Clancy ('80)      Randy Jones ('76)      Gary Nolan ('70)
Joe Coleman ('74)     Clay Kirby ('71)       J.R. Richard ('76)
Ron Darling ('85)     Mark Langston ('87)    Nolan Ryan ('74)
Larry Dierker ('69)   Bill Lee ('73)         Frank Tanana ('76)
Dwight Gooden ('85)   Dennis Leonard ('77)   Fernando Valenzuela ('82)
Ron Guidry ('78)      Jon Matlack ('74)

The "cast-offs" were pooled to create a new group (Group C); I'll consider them in Part 2 of this series. A few Hall of Fame-type pitchers from Group A made it into the study, most notably Roger Clemens and Greg Maddux. Should we exclude them as well? Arbitrarily removing "special arms" seems like a sensible approach, but it creates its own problems (which I will also consider in Part 2). Hand-picking which pairs stayed and which went was not the path I wanted to go down.

Without further ado, the 92 subjects of Study #1 are:

Group A Pitcher       Sim.  Group B Pitcher          Group A Pitcher        Sim.  Group B Pitcher
Len Barker ('80)      929   Jose Guzman ('88)        D. Lemanczyk ('77)     959   Bart Johnson ('76)
Bill Bonham ('74)     936   Ken Forsch ('73)         Greg Maddux ('91)      959   Andy Benes ('92)
Oil Can Boyd ('85)    954   John Burkett ('90)       Dennis Martinez ('79)  955   Bill Gullickson ('83)
Tom Bradley ('71)     927   Reggie Cleveland ('72)   Jack McDowell ('92)    957   S. Bankhead ('89)
Kevin Brown ('92)     955   Pedro Astacio ('96)      Doc Medich ('74)       967   Bob Moose ('73)
Tom Browning ('85)    972   Jamie Moyer ('88)        Mike Moore ('86)       984   Andy Hawkins ('86)
Ron Bryant ('73)      956   John Curtis ('73)        Jack Morris ('82)      966   Eric Show ('83)
Steve Busby ('74)     933   Gary Gentry ('69)        Mike Norris ('80)      925   Orel Hershiser ('85)
Roger Clemens ('87)   966   Erik Hanson ('90)        Melido Perez ('92)     962   Pete Harnisch ('93)
Jim Colborn ('73)     928   Dave Frost ('79)         Dan Petry ('83)        953   Jay Tibbs ('85)
Joe Decker ('74)      938   Buzz Capra ('74)         Rick Reuschel ('74)    949   Rick Langford ('77)
D. Eckersley ('78)    971   Scott Sanderson ('80)    Jerry Reuss ('73)      944   Bob Shirley ('77)
Cal Eldred ('93)      953   Ben McDonald ('92)       Steve Rogers ('77)     942   Burt Hooton ('77)
R. Erickson ('78)     950   Mark Lemongello ('78)    Bret Saberhagen ('88)  929   Frank Castillo ('92)
Alex Fernandez ('96)  956   Tommy Greene ('93)       Jim Slaton ('76)       953   Bob Forsch ('75)
Ed Figueroa ('76)     935   Alan Foster ('73)        John Smoltz ('93)      936   Kevin Appier ('95)
Mike Flanagan ('78)   942   Bob Ojeda ('84)          Mario Soto ('83)       953   Tim Belcher ('89)
W. Garland ('77)      962   Doyle Alexander ('77)    Paul Splittorff ('73)  941   John Candelaria ('80)
Ross Grimsley ('74)   935   Ken Brett ('73)          Dave Stieb ('83)       956   Charlie Lea ('83)
Mark Gubicza ('88)    967   Ken Hill ('92)           Rick Sutcliffe ('83)   953   Dave Stewart ('84)
Ed Halicki ('77)      953   Pete Vuckovich ('79)     Dick Tidrow ('73)      946   Glenn Abbott ('77)
Pat Hentgen ('96)     964   Ramon Martinez ('95)     Frank Viola ('86)      948   Britt Burns ('85)
Jim Hughes ('75)      938   Dave Freisleben ('74)    Mike Witt ('86)        936   Jose Rijo ('91)

The weighted average performance of the Group A pitchers was 17 wins, 13 losses, 3.52 ERA, 15.0% strikeout rate, 7.3% walk rate, 268.0 IP, and 4,038 estimated pitches.

The weighted average performance of the Group B pitchers was 13 wins, 11 losses, 3.54 ERA, 15.0% strikeout rate, 7.5% walk rate, 216.7 IP, and 3,268 estimated pitches.

The only significant statistical differences between the two groups in Year Zero are those related to workload. Aha, you might say -- that's only one season. Could the Group B pitchers be (in truth) inferior and their Year Zero performance merely a result of a preponderance of career years? Could there be differences in performance in the years leading up to the seasons in question? The numbers for the average Group A and Group B pitcher for the three years up to and including Year Zero ...

Year -2 to Year Zero
                  IP      Pitches   ERA    K rate (%)   W rate (%)   Wins   Losses
Group A average   594.7   9008      3.54   15.3         7.6          36     30
Group B average   452.7   6833      3.56   15.2         7.4          27     24

... tell the same tale. Apart from workload indicators, the two groups appear to be a very good match.

Suppose you are the general manager of a baseball team and are considering acquiring one of two pitchers: a 25-year-old pitcher who threw 3,900 pitches in 2004 and a very similar pitcher who threw only 3,300. Your scouts don't turn up any major differences between the two and their overall performance over the last three years has also been very similar. The one difference is that the first pitcher has been subjected to a significantly greater workload than the second pitcher. Who would you choose and why?

Is surviving the heavy workload a marker of greater durability, or instead does the greater "mileage" mean you'd be better off acquiring the "underused" pitcher? The answer ... next week.

References and Resources
(1) Pitches thrown were estimated using the Extended Pitch Count Estimator developed by Tangotiger.

(2) Age was calculated using exact date of birth as of July 1st of the year in question.

(3) Similarity scores were determined by scaling each category's assigned weight by the standard error for that category, based on the population of 3000+ pitch seasons in the pool. The weights were as follows: strikeout rate = 40 points; ERA = 40 points; age = 30 points; birth year = 30 points; walk rate = 20 points; estimated pitches = 20 points; total = 180. For all categories except estimated pitches, the absolute difference between the two pitchers was multiplied by the assigned weight and divided by the standard error. For estimated pitches, the quantity multiplied by the weight and divided by the standard error was the absolute amount by which the gap between the two pitchers departed from 750 pitches. The resulting deductions were summed and subtracted from 1,000.

Sample Calculation (the divisor in each line is the standard error)

Pat Hentgen (1996), born 1968: 16.1% K rate, 8.3% W rate, 3.22 ERA, 27.63 age, 4,012 estimated pitches
Ramon Martinez (1995), born 1968: 16.2% K rate, 9.0% W rate, 3.66 ERA, 27.28 age, 3,150 estimated pitches

Strikeout Points: abs(.161-.162)*40/.0400 = 1.00
Walk Points: abs(.083-.090)*20/.0211 = 6.64
ERA Points: abs(3.22-3.66)*40/1.026 = 17.15
Age Points: abs(27.63-27.28)*30/1.814 = 5.79
Year of Birth Points: abs(1968-1968)*30/6.99 = 0.00
Estimated Pitches Points: abs(750-abs(4012-3150))*20/354.6 = 6.32

Sum of Deductions: 1.00 + 6.64 + 17.15 + 5.79 + 0.00 + 6.32 = 36.90

Similarity Score = 1000 - 36.90 = 963.10 (rounded off to 963**)

** Due to rounding errors in the above calculations, the correct similarity score was not 963, but rather 964 (as noted in the main text)
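For readers who want to check the arithmetic, the sample calculation can be reproduced with a short Python sketch. The weights and standard errors are the ones given in this footnote; the function name, the dictionary keys, and the data layout are hypothetical. Because the inputs are the rounded figures printed above, the sketch lands on 963 rather than the 964 obtained from unrounded data.

```python
WEIGHTS = {"k_rate": 40, "w_rate": 20, "era": 40,
           "age": 30, "birth_year": 30, "pitches": 20}
STD_ERR = {"k_rate": 0.0400, "w_rate": 0.0211, "era": 1.026,
           "age": 1.814, "birth_year": 6.99, "pitches": 354.6}
TARGET_PITCH_GAP = 750   # a 750-pitch workload gap earns no deduction


def similarity_score(a, b):
    """Return 1000 minus the weighted, standardized differences."""
    deductions = 0.0
    for key in ("k_rate", "w_rate", "era", "age", "birth_year"):
        deductions += abs(a[key] - b[key]) * WEIGHTS[key] / STD_ERR[key]
    # Workload is scored on how far the gap strays from 750 pitches.
    gap = abs(a["pitches"] - b["pitches"])
    deductions += abs(TARGET_PITCH_GAP - gap) * WEIGHTS["pitches"] / STD_ERR["pitches"]
    return 1000 - deductions


hentgen_1996 = {"k_rate": 0.161, "w_rate": 0.083, "era": 3.22,
                "age": 27.63, "birth_year": 1968, "pitches": 4012}
martinez_1995 = {"k_rate": 0.162, "w_rate": 0.090, "era": 3.66,
                 "age": 27.28, "birth_year": 1968, "pitches": 3150}

print(round(similarity_score(hentgen_1996, martinez_1995)))   # 963
```

Two seasons that are identical in every category and exactly 750 estimated pitches apart score a perfect 1,000 under this function, matching the hypothetical example in the main text.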

Robert Dudek is also a Batter's Box author and can be contacted via e-mail.


