Pitch classification revisited
by Max Marchi, July 30, 2010
I haven't published any articles for some time (did you notice?), but I have been able to fine-tune the classification algorithm that I first introduced in the article Rider, slurve and... Titanic.
I have been able to classify all the pitchers' pitches and, after performing comparisons with BIS classifications (thanks to FanGraphs), MLBAM's and C. Sven Jenkins's notes at 60 ft 6 in, I consider myself quite satisfied with my work.
I'm not giving out the details of the work behind the scenes right now. In fact, I'm planning to present them at the next PITCHf/x Summit at the end of August (and then write some articles here at THT). However, I promised in the comments a couple of months ago that, as soon as I felt comfortable with my classification, I would release the pitchers' repertoires according to my method. Thus you will find a spreadsheet at the end of the article.
Hey, wait! Don't skip immediately to the spreadsheet, please.
First, let's see what's new since last time I reported on the subject. I classified all the pitches in the PITCHf/x database. That means that—yes—I've been able to tame the lefties, who were initially so elusive to my algorithm.
The final number of clusters is 17, up from 14; thus I needed to give some new names to a few pitches.
Here they are (all 17), with their average values of speed and movement. I'm still open to suggestions for the vernacular.
type               speed (mph)  h.mov (in)  v.mov (in)
heater                    94.3        -8.3         7.4
jumping fastball          93.0        -5.0         9.9
sinker                    90.3        -9.5         2.8
rider                     89.9        -8.4         6.6
rising fastball           89.2        -4.8         9.7
cutter                    88.8        -0.4         6.6
low-arm fastball          87.9        -9.2        -3.6
hard slider               85.4         1.6         3.0
power change              84.0        -7.4         3.2
riding change             83.6        -6.0         6.3
sharp slider              82.8         2.3         0.9
straight change           79.7        -5.9         5.9
low-arm offspeed          79.3        -7.5        -5.3
tight curve               78.9         4.6        -4.9
slurve                    78.8         5.0         0.8
roundhouse curve          74.3         5.9        -5.9
floater                   69.8         2.3         4.0
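To make the table concrete, here is a minimal sketch of how the cluster averages could be used to label a single pitch by picking the closest average. This is not the classification algorithm itself, whose details are not given here. The function name `nearest_pitch_type` and the plain Euclidean distance (which mixes miles per hour and inches without any scaling) are assumptions made for illustration only, and pitcher handedness is not handled.

```python
import math

# Cluster averages copied from the table above: (speed, h.mov, v.mov).
CENTROIDS = {
    "heater": (94.3, -8.3, 7.4),
    "jumping fastball": (93.0, -5.0, 9.9),
    "sinker": (90.3, -9.5, 2.8),
    "rider": (89.9, -8.4, 6.6),
    "rising fastball": (89.2, -4.8, 9.7),
    "cutter": (88.8, -0.4, 6.6),
    "low-arm fastball": (87.9, -9.2, -3.6),
    "hard slider": (85.4, 1.6, 3.0),
    "power change": (84.0, -7.4, 3.2),
    "riding change": (83.6, -6.0, 6.3),
    "sharp slider": (82.8, 2.3, 0.9),
    "straight change": (79.7, -5.9, 5.9),
    "low-arm offspeed": (79.3, -7.5, -5.3),
    "tight curve": (78.9, 4.6, -4.9),
    "slurve": (78.8, 5.0, 0.8),
    "roundhouse curve": (74.3, 5.9, -5.9),
    "floater": (69.8, 2.3, 4.0),
}


def nearest_pitch_type(speed, h_mov, v_mov, centroids=CENTROIDS):
    """Return the label whose average is closest to the pitch.

    Assumption: unscaled Euclidean distance over (speed, h.mov, v.mov).
    A real classifier would likely weight or standardize the features.
    """
    pitch = (speed, h_mov, v_mov)
    return min(centroids, key=lambda name: math.dist(pitch, centroids[name]))


if __name__ == "__main__":
    # A slow pitch with a big glove-side, downward break.
    print(nearest_pitch_type(74.0, 6.0, -6.0))
```

A nearest-average rule like this ignores the spread of each cluster, so pitches near a boundary (a rider versus a sinker, say) could easily land on the wrong side.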
I'll briefly outline the differences since the previous version.
- The pitches coming from low angles are now split into two groups, fastballs and offspeed, and I'm fine with it.
- There are three change-ups, up from two. One is a straight slow ball (Tim Wakefield's fastball and emergency pitchers' offerings fall into this bucket, together with some regular changes). The other two are separated by a couple of miles per hour and a few inches of movement (the one that travels faster stays up more, the other has a bigger tail on the throwing side).
- Finally, we have a second slider. I dubbed "hard" the one with more velocity and "sharp" the one with more horizontal movement (should I say bite?).
I'll say it again: I'm open to suggestions for improving the labeling.
One of the reasons leading me to undertake this task was that I suspected some hitters could be very effective against one type of curve (or change, or slider) and nearly helpless against a different type.
Let's see if my reasoning holds ground.
The following hitters performed a lot better against tight curves than against roundhouse curves (data from 2008 and 2009 combined, minimum 40 pitches of each kind faced).
rank  player              RV difference
1     Glaus Troy          0.108
2     Coghlan Chris       0.077
3     Berkman Lance       0.072
4     Cedeno Ronny        0.068
5     Gwynn Tony          0.068
6     Tracy Chad          0.055
7     Hairston Scott      0.054
8     Davis Chris         0.053
9     Rodriguez Ivan      0.050
10    Tatis Fernando      0.050
These other players behaved in the opposite way, having more success against the slower type of curveball.
rank  player              RV difference
1     Rolen Scott         -0.108
2     Wells Vernon        -0.102
3     Millar Kevin        -0.091
4     Davis Rajai         -0.083
5     Escobar Yunel       -0.076
6     Jeter Derek         -0.072
7     Cuddyer Michael     -0.072
8     Delgado Carlos      -0.070
9     Aybar Erick         -0.069
10    Dickerson Chris     -0.068
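For readers who want to reproduce something like the two lists above, the following is a rough sketch of one way to compute an "RV difference" per hitter. It assumes the difference is the average run value per pitch against tight curves minus the average against roundhouse curves, with run values taken from the hitter's point of view; the article does not spell out the exact formula. The function name `rv_difference` and the `(batter, pitch_type, run_value)` input layout are hypothetical.

```python
from collections import defaultdict

MIN_PITCHES = 40  # minimum pitches of each kind faced, as stated in the text


def rv_difference(pitches, type_a="tight curve", type_b="roundhouse curve",
                  min_pitches=MIN_PITCHES):
    """Return {batter: mean RV vs type_a - mean RV vs type_b}.

    pitches: iterable of (batter, pitch_type, run_value) tuples.
    Batters with fewer than min_pitches of either type are dropped.
    """
    # For each batter, keep [sum of run values, pitch count] per pitch type.
    totals = defaultdict(lambda: {type_a: [0.0, 0], type_b: [0.0, 0]})
    for batter, pitch_type, run_value in pitches:
        if pitch_type in (type_a, type_b):
            cell = totals[batter][pitch_type]
            cell[0] += run_value
            cell[1] += 1

    diffs = {}
    for batter, cells in totals.items():
        sum_a, n_a = cells[type_a]
        sum_b, n_b = cells[type_b]
        if n_a >= min_pitches and n_b >= min_pitches:
            diffs[batter] = sum_a / n_a - sum_b / n_b
    return diffs
```

With this sign convention, a positive value means the hitter did better against tight curves, which matches the first list; a negative value matches the second.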
The following histogram shows the distribution of MLB players.

If having different success against the two types of curveballs were a repeatable skill, we would expect to find the same players in the above lists year in and year out. Looking at the following scatter plot, we see no correlation between "favorite type of curve" in 2008 and 2009.
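The repeatability check described here can be sketched as a year-to-year correlation: take each hitter's difference in 2008 and in 2009 and see whether the two line up. The snippet below uses a hand-written Pearson correlation as one possible measure; the article only refers to a scatter plot, so the choice of statistic, the function names and the dictionary inputs are all assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def year_to_year_r(diff_first, diff_second):
    """Correlate per-hitter differences across two seasons.

    Both arguments map batter -> RV difference for one season.
    Only batters present in both seasons are used.
    """
    common = sorted(set(diff_first) & set(diff_second))
    return pearson_r([diff_first[b] for b in common],
                     [diff_second[b] for b in common])
```

A value near zero would agree with what the scatter plot shows; a repeatable skill would push the coefficient clearly above zero.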

I found similar results comparing change-ups and sliders (see small charts below).

Looking at two years of data, it seems there aren't players who consistently crush a slow curve and are helpless against a tight one; similarly, no differences appear for change-ups and sliders.
I need to note that I used raw run values, adjusted for neither the ball-strike count nor the quality of the pitcher. This means that a hitter might have faced many tight curves from great pitchers on 0-2 counts in 2008 and a lot of them from replacement-level hurlers on 2-1 counts in 2009. (Note: if you are not sure what I'm talking about, I suggest this read.)
Adjustments might give us different results, but I feel that if something were going on we would have seen some indication of correlation in the charts. Maybe in another year and a half, when we have four full seasons of PITCHf/x data (and thus more pitches faced of each kind), we might be able to see something.
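To illustrate what a count adjustment could look like, here is a small sketch that subtracts from each pitch's run value the average run value of all pitches thrown in the same ball-strike count. This was not done for the numbers in this article, and it is only one of several possible adjustments; the function name `count_adjusted` and the `(count, run_value)` input layout are hypothetical. Adjusting for the pitcher would need a similar baseline per pitcher.

```python
from collections import defaultdict


def count_adjusted(pitches):
    """Return run values relative to the average for their count.

    pitches: list of (count, run_value) tuples, e.g. ("0-2", -0.05).
    The baseline for each count is the mean run value of all pitches
    in the input that were thrown in that count.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for count, run_value in pitches:
        sums[count] += run_value
        counts[count] += 1

    baseline = {count: sums[count] / counts[count] for count in sums}
    return [run_value - baseline[count] for count, run_value in pitches]
```

In practice the baseline would come from the whole league rather than from the sample being adjusted, so that a hitter who sees mostly 0-2 curves is not penalized for the count itself.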
Anyway I feel this is a non-issue, and I'm sure I'll find more interesting uses for my classification algorithm.
The spreadsheet.
Here is the link to the promised spreadsheet.
Things to note.
As I said at the beginning of the article, I checked my classifications against BIS' (from FanGraphs), MLBAM's and C. Sven Jenkins'. I did the comparisons on an individual basis for a couple of dozen pitchers; I'll show a few of them in future articles. While I'm really satisfied with what I have seen on those pitchers (and I tried to sneak in those pitchers I thought would put the system to a real test), many eyes will surely help me find where my system fails.
The spreadsheet is based on 2009 data. Pitchers with limited pitches thrown in that season might show strange results—that's the case with Brandon Webb, who toed the rubber for just four innings in 2009. Issues like this will be addressed in future refinements of the system.
References and Resources
Pitch classification by the author was compared with:
MLBAM's;
BIS' (from FanGraphs);
C. Sven Jenkins' (60 ft 6 in).
After creating a baseball rendition of The Beatles' Sgt. Pepper cover, Max began his baseball writing because he needed an excuse to show the picture. He wrote for an Italian audience for six years before making the jump to The Hardball Times. You can contact him by e-mail.






 
Max - Interesting article. Is the table for the pitch speeds and movement by pitch classification only for right handers with left handers having mirror image horizontal movement for the same classification?
It would be helpful if you included MLB number and handedness for the pitchers in your spreadsheet.