Changing repertoires

In case you are wondering, I’m still working on pitch classifications. A quick recap, since I have been writing without regularity for a long period:

I created an algorithm that classifies all the pitches in the PITCHf/x database into 17 groups (here are the two articles on the subject: Rider, slurve and… Titanic; Pitch classification revisited). One of my goals is to reduce to a minimum the need for manual/visual ad hoc classifications when writing an article.

You may have noticed I said “reduce to a minimum” and not “remove”; if you don’t want to make the internet cry a little when you write an article, you have to check your classifications.

It seems that this year a lot of PITCHf/x articles have been about somebody adding a new pitch to his repertoire, so I decided to look at whether my classification is sensitive to those additions. (By the way, I have to find a name for my 17-bucket pitch classification algorithm; otherwise I will keep calling it “my 17-bucket pitch classification algorithm.” While I try to come up with a fancy acronym, I’ll go with an ego-boosting MAX-17.)

Cole Hamels

Harry Pavlidis wrote a couple of THT articles about Cole Hamels tinkering with his repertoire. In the first one, Harry showed how Hamels tried to add a sinker and a cutter to his three-weapon arsenal; in the second how the pitcher not only was unsuccessful with the addition, but also seemed to lose confidence in his established fastball-change-up-curveball combo.

MAX-17 (okay, I said it) usually does a good job separating four-seamers and sinkers, but with Hamels it is all over the place; if I based the evaluation of my classification algorithm on the Philly lefty, I would immediately stop working on the project. So I trust you will take my word that MAX-17 completely fails on Hamels, and I can move on.

Roy Halladay

Dave Allen at FanGraphs talked about another great Philly pitcher who was not content with his success and went on to add a pitch to his mix. Doc Halladay’s results indicate the addition has not hurt. Some discussion followed on whether the new pitch is a splitter (as the author suggests) or a forkball (as proposed by Matt Lentzner in the comments section).

The pitch type table at FanGraphs shows an increased use of the change-up (from 5 to 11 percent). MAX-17 notices the very same increase, but also identifies the 2010 change as a different animal than the 2009 edition; the older is labeled as a riding change, the newer as a power change. (They have pretty similar characteristics, except the latter has less vertical movement.)

As noted in the introductory article, my method and the consequent labeling consider only the behavior of the ball from the rubber to the plate—no mention is made of grips, hence no “two-seamer” nor “splitter” in the vernacular. It follows that I won’t enter the split-finger/forkball debate, and I’ll just note a hit for MAX-17 this time.

Edwin Jackson

R.J. Anderson, again at FanGraphs, hypothesized White Sox pitching coach Don Cooper might have taught Edwin Jackson a cutter (as he has done with other pitchers) as soon as he arrived in Chicago. The Hamels case suggested to me that MAX-17, in its current form, has trouble when a pitcher makes a midseason repertoire transition; I started trying to fix this issue and tested what could become the next version of my algorithm on Jackson.

No cutters came out of the automated classification, but I found something interesting: two different sliders are listed in Jackson’s 2010 repertoire, his usual hard slider and, used in a smaller percentage, a new sharp slider (slower, with more horizontal movement). Is that the fruit of Cooper’s coaching (maybe a cutter identified, but wrongly labeled by the algorithm)? I looked at the game-by-game pitch mix and I actually found a difference from his D-Backs days to his Sox days. Trouble is, the “new pitch” was “used” in Arizona and “discarded” in Chicago.
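For readers who want to try this kind of game-by-game (or team-by-team) check, here is a minimal Python sketch. It is only an illustration, not the MAX-17 code: the record layout (date, team, label), the function name pitch_mix_by_group, and the sample rows are all made up for the example.

```python
from collections import Counter, defaultdict


def pitch_mix_by_group(pitches, key):
    """Return {group: {label: share}} for pitches grouped by `key`.

    `pitches` is an iterable of dicts and `key` names the field to group
    on (for example "team" or "date"). The field names are assumptions.
    """
    counts = defaultdict(Counter)
    for p in pitches:
        counts[p[key]][p["label"]] += 1

    mix = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        mix[group] = {label: n / total for label, n in counter.items()}
    return mix


# Made-up sample rows, not real PITCHf/x data.
sample = [
    {"date": "2010-05-01", "team": "ARI", "label": "hard slider"},
    {"date": "2010-05-01", "team": "ARI", "label": "sharp slider"},
    {"date": "2010-05-01", "team": "ARI", "label": "fastball"},
    {"date": "2010-05-01", "team": "ARI", "label": "fastball"},
    {"date": "2010-08-10", "team": "CHW", "label": "hard slider"},
    {"date": "2010-08-10", "team": "CHW", "label": "fastball"},
]

by_team = pitch_mix_by_group(sample, "team")
by_game = pitch_mix_by_group(sample, "date")
```

Grouping on "date" gives the game-by-game view; grouping on "team" makes a label that appears with one club and vanishes with the other easy to spot.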

All the quotation marks in the previous sentence are to be taken as a warning: I haven’t done an ad hoc visual analysis of the pitches, so I can’t say for sure there has actually been a new pitch for a few months in Jackson’s arsenal. (Park or camera calibration effects are more likely). And I don’t want to make the web weep.
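One rough way to probe the park hypothesis before doing any visual analysis is to compare how the same pitch type moves, on average, in each park, and to remove that park-level offset before classifying. The Python sketch below assumes vertical movement is stored under the PITCHf/x name pfx_z; the park codes, the numbers, and the helper names park_offsets and adjust are invented for illustration. A real adjustment would also have to separate the park from the pitchers who happen to throw there.

```python
from collections import defaultdict
from statistics import mean


def park_offsets(pitches, field="pfx_z"):
    """Mean of `field` in each park minus the mean of the park means."""
    by_park = defaultdict(list)
    for p in pitches:
        by_park[p["park"]].append(p[field])
    park_means = {park: mean(vals) for park, vals in by_park.items()}
    overall = mean(list(park_means.values()))
    return {park: m - overall for park, m in park_means.items()}


def adjust(pitches, offsets, field="pfx_z"):
    """Return copies of the pitches with the park offset subtracted."""
    return [{**p, field: p[field] - offsets[p["park"]]} for p in pitches]


# Made-up values for one pitch type in two hypothetical parks.
sample = [
    {"park": "A", "pfx_z": 2.0},
    {"park": "A", "pfx_z": 4.0},
    {"park": "B", "pfx_z": 5.0},
    {"park": "B", "pfx_z": 7.0},
]

offsets = park_offsets(sample)
adjusted = adjust(sample, offsets)
```

If the two sliders collapse into one cluster after such an adjustment, the park (or its cameras) is the more likely explanation; if they stay apart, a visual check of the pitches is still needed.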

Brad Penny

Dave Cameron (FanGraphs, one more time) talked about Brad Penny’s splitter, a 2010 novelty. He showed how MLBAM classifies it as a change-up; however, video scouts at BIS already had picked up the new weapon, correctly identifying it as a splitter (see FanGraphs pitch type table; actually it has been crediting Penny with a splitter for three years).

As in Halladay’s case, MAX-17, while not built to identify how a pitcher holds the ball, notices the metamorphosis of Penny’s change of pace: It is identified as a riding change in 2009 (used with a frequency of 8 percent), and as a power change in 2010 (thrown 31 percent of the time).

Homer Bailey

This was way back in 2009, and not from the stathead community. At, Matt O’Donnell wrote an article about Homer Bailey returning to the Reds rotation with a new arrow in his quiver. Again, we are talking about a splitter. Again, MAX-17 recognizes the transition (from a riding change to a power change). Again, it has trouble with midseason adjustments (it doesn’t get the new weapon, developed during the 2009 season, until the following year).

The verdict

Among the many reasons that make automated pitch classification a difficult task is the fact that pitchers do not use the same set of pitches for their entire career, sometimes not even for a single season. Testing MAX-17 on a handful of pitchers who have (or might have, see Jackson) changed their repertoire shows mixed results.

The algorithm seems to pick up metamorphoses that occurred in the offseason, while it has trouble when a pitcher tinkers with his repertoire during the summer. (It has to be noted that this article focused unintentionally mostly on splitters).
This finding suggests to me a few ways to refine the classification method.

Once more, everybody (myself included) should keep in mind Mike Fast’s warnings in his terrific article: When writing an article on a single pitcher, always perform your own ad hoc pitch classification. However, I hope MAX-17 will become good enough to be used for league-wide analyses, those where an author can’t possibly classify every pitcher’s pitches by eye/hand.

  1. Lucas Apostoleris said...

    Max, this is really interesting.  Thank you for posting.
    I’m pretty sure you alluded to this, but does your algorithm have a park adjustment?  In the case of Jackson, I think that pitching at US Cellular in the second half was what caused the different look to his slider.  If I’m not mistaken, there’s a pretty significant difference between the pfx_z values from Chase Field to US Cellular.

