Searching for the game’s best pitch

Is it Johan Santana’s change-up or Jon Papelbon’s four-seamer?
Maybe Fausto Carmona’s wicked sinker or Mariano Rivera’s mythical cutter? Smoltz’s
slider is right there and Josh Beckett’s curveball has to be in the
discussion, right? So, who has the very best pitch in the game?

Does this man throw the game’s best pitch?

Well, I’m kind of a numbers guy, so when thinking about this kind of question, I
start to wonder if you can add anything to the conversation with a
little analysis. Is there any sensible way to rate the best pitches in the game
using statistical analysis? The key word here is
“sensible” and we won’t know if any ranking is sensible until we try,
so let’s have a go.

The value of a single pitch

What we need to do here is to figure out how much any single pitch is
worth. Now, when a ball is put into play, we have a pretty good idea
of the value of the result. And when I say value, I am talking
about value in runs. The stat I’m going to use here is batting runs,
which was developed by Pete Palmer before some of you were
born. By the way, batting runs is also known as linear weights,
which has to be the worst name given to any baseball stat
in history.

Anyway, batting runs measures a player’s run production above that of
the average batter, by assigning a run value to the various outcomes
of a plate appearance—for example, a single is worth (on
average) just under half a run, the value of a walk is around
one-third of a run, an out is worth around negative .25 runs, and so
on. If you want to know where these values come from (these are the
infamous linear weights), a good reference is Curve Ball by Jim Albert and Jay Bennett
(http://www.amazon.com/Curve-Ball-Baseball-Statistics-Chance/dp/038700193X/ref=pd_bbs_sr_2?ie=UTF8&s=books&qid=1202722762&sr=8-2).
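To make the bookkeeping concrete, here is a minimal sketch of the batting runs calculation in Python. It uses only the three approximate weights quoted in this article (0.47 for a single, 0.33 for a walk, -0.25 for an out). The function name, the dictionary and the sample batter are hypothetical, made up for illustration, and extra-base hits would need their own weights, which are not listed here.

```python
# Approximate run values quoted in the article (runs relative to average).
# Doubles, triples and home runs need their own weights; they are omitted here.
LINEAR_WEIGHTS = {"single": 0.47, "walk": 0.33, "out": -0.25}


def batting_runs(events, weights=LINEAR_WEIGHTS):
    """Sum the run value of every outcome in an {outcome: count} mapping."""
    return sum(weights[outcome] * count for outcome, count in events.items())


# Hypothetical batter (not real data): 30 singles, 10 walks and 60 outs.
sample = {"single": 30, "walk": 10, "out": 60}
print(round(batting_runs(sample), 2))
```

A positive total means the batter produced more runs than an average hitter would have in the same number of plate appearances.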

Okay, so we’re good with balls put into play, but what about balls and
strikes, what are they worth? Only about 20 percent of pitched balls are
actually put into play, so we’d better figure out the value of the
other 80 percent of pitches thrown. Well, we can assign a run value to a
ball or strike; in fact, it’s something I already worked through for
my article on platoon effects in the Hardball Times Baseball Annual 2008,
which you should go read now if you haven’t already done so.

Here’s how I figure out the value of a ball not put into play. The
main point to keep in mind is that a ball will move the count in favor of
the batter and a strike will move it in favor of the pitcher. We can
figure out how much that is worth by examining how well batters hit
after reaching any given count. Let’s work through an example, which
is often the best way to understand something.

Let’s say an average batter steps to the plate, with the count
(obviously) zero balls, zero strikes. In 2007, the average batter hit
.268/.336/.423. Let’s highlight that:

0-0 count: .268/.336/.423

Okay, now let’s say the pitcher throws a first-pitch ball, bringing
the count to 1-0. After reaching a count of 1-0, batters, naturally,
hit better than average. Here is the line:

1-0 count: .282/.394/.459

That first ball is worth quite a bit to the batter. It turns Jhonny
Peralta into Grady Sizemore.

What if that first pitch had
been a strike instead of a ball? Well, we look at how batters fared
after falling behind, 0-1:

0-1 count: .238/.282/.362

That’s a pretty big drop-off: Jhonny Peralta has turned into Tony
Pena.

To get runs into
the discussion, let’s look at the batting runs values for these counts,
instead of the AVG/OBP/SLG lines that I’ve shown above. Actually, here are the
batting runs for each of the 12 possible ball-strike counts:

Table 1 - Run Value of Any Given Count
+-------+-------------+
| Count | BattingRuns |
+-------+-------------+
| 0-0   |       0.000 |
| 1-0   |       0.038 |
| 2-0   |       0.104 |
| 3-0   |       0.220 |
| 0-1   |      -0.044 |
| 1-1   |      -0.015 |
| 2-1   |       0.037 |
| 3-1   |       0.142 |
| 0-2   |      -0.106 |
| 1-2   |      -0.082 |
| 2-2   |      -0.039 |
| 3-2   |       0.059 |
+-------+-------------+

So, going back to our example above, a first-pitch strike is worth
-0.044 runs (to the batter, of course), while a ball on the first
pitch on average is worth 0.038 runs. A quick look at the above
numbers will show you that the value of a ball or strike will be
different for different counts. The following table shows how much a
ball or strike is worth in any given count:

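Table 2 - Run Values of Balls and Strikes
+-------+-------+--------+
| Count |  Ball | Strike |
+-------+-------+--------+
| 0-0   | 0.038 | -0.044 |
| 1-0   | 0.066 | -0.053 |
| 2-0   | 0.116 | -0.067 |
| 3-0   | 0.110 | -0.078 |
| 0-1   | 0.029 | -0.062 |
| 1-1   | 0.052 | -0.067 |
| 2-1   | 0.105 | -0.076 |
| 3-1   | 0.188 | -0.083 |
| 0-2   | 0.024 | -0.184 |
| 1-2   | 0.043 | -0.208 |
| 2-2   | 0.098 | -0.251 |
| 3-2   | 0.271 | -0.349 |
+-------+-------+--------+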

Not surprisingly, the highest leverage occurs on the 3-2 count, where
a ball results in a walk and a strike results in a strikeout.
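The values in Table 2 follow from Table 1 by subtraction: the value of a ball or strike is the run value of the count you land in minus the run value of the count you started from. Here is a minimal Python sketch of that step. The walk (0.330) and strikeout (-0.290) values are assumptions inferred from the three-ball and two-strike rows of Table 2, and the function names are hypothetical.

```python
# Run value of each (balls, strikes) count, copied from Table 1.
COUNT_VALUE = {
    (0, 0): 0.000, (1, 0): 0.038, (2, 0): 0.104, (3, 0): 0.220,
    (0, 1): -0.044, (1, 1): -0.015, (2, 1): 0.037, (3, 1): 0.142,
    (0, 2): -0.106, (1, 2): -0.082, (2, 2): -0.039, (3, 2): 0.059,
}
# Assumed end-of-plate-appearance values, inferred from Table 2.
WALK_VALUE = 0.330
STRIKEOUT_VALUE = -0.290


def ball_value(balls, strikes):
    """Run value (to the batter) of a ball thrown in the given count."""
    after = WALK_VALUE if balls == 3 else COUNT_VALUE[(balls + 1, strikes)]
    return after - COUNT_VALUE[(balls, strikes)]


def strike_value(balls, strikes):
    """Run value (to the batter) of a strike that advances the count."""
    after = STRIKEOUT_VALUE if strikes == 2 else COUNT_VALUE[(balls, strikes + 1)]
    return after - COUNT_VALUE[(balls, strikes)]


for balls, strikes in COUNT_VALUE:
    print(f"{balls}-{strikes}  {ball_value(balls, strikes):6.3f}  "
          f"{strike_value(balls, strikes):7.3f}")
```

The sketch does not handle two-strike foul balls, which leave the count unchanged.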

We are almost ready to start searching for the best pitch in baseball,
but I need to return for a moment to balls in play. We know how much
each ball in play is worth, as discussed above, but those values are
relative to the average batter, or, in other words, a batter with an
0-0 count.

If a batter singles with the count 0-2, the value of the
single is greater than the usual 0.47 runs, because the run value of
the 0-2 count was already at -.106 runs, as seen in Table 1,
above. The value of the single in this case is around .58 runs, or the
final run value (.47) minus the initial run value due to the count
(-.106). All balls in play will be evaluated this way: the value of
the plate appearance according to batting runs minus the value of the
count when the ball was put in play.
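The same adjustment as a short Python sketch, using the count values from Table 1 and the 0.47-run single from the example above. The function name is hypothetical.

```python
# Run value of each (balls, strikes) count, copied from Table 1.
COUNT_VALUE = {
    (0, 0): 0.000, (1, 0): 0.038, (2, 0): 0.104, (3, 0): 0.220,
    (0, 1): -0.044, (1, 1): -0.015, (2, 1): 0.037, (3, 1): 0.142,
    (0, 2): -0.106, (1, 2): -0.082, (2, 2): -0.039, (3, 2): 0.059,
}
SINGLE_VALUE = 0.47  # batting runs value of a single, as quoted in the text


def in_play_value(outcome_value, balls, strikes):
    """Value of a ball in play: final run value minus the value of the count."""
    return outcome_value - COUNT_VALUE[(balls, strikes)]


print(round(in_play_value(SINGLE_VALUE, 0, 2), 3))  # 0.576
print(round(in_play_value(SINGLE_VALUE, 3, 0), 3))  # 0.25
```

A single on 3-0 is worth less to the batter than a single on 0-2, because at 3-0 he was already well ahead.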

Two good pitchers, six good pitches

Okay, let’s have a look at a couple of pitchers to get a feel for all this.
Let’s start with last year’s NL Cy Young winner, Jake Peavy. The
table below shows the values of Peavy’s three pitches, expressed
in terms of runs above average per 100 pitches. Negative values mean
the pitcher gave up fewer runs than average.

Run values for Jake Peavy's pitches
                     +------------------+-----------------+---------+
                     |    Not In Play   |     In Play     |  Total  |
+------------+-------+--------+---------+-------+---------+---------+
| Name       | Pitch | NP_nip | runs100 | NP_ip | runs100 | runs100 |
+------------+-------+--------+---------+-------+---------+---------+
| Peavy_Jake | FB    |   1257 |    -1.3 |   224 |    -3.3 |    -1.6 |
| Peavy_Jake | SL    |    555 |    -1.1 |   139 |     1.6 |    -0.6 |
| Peavy_Jake | CB    |    390 |    -3.3 |    80 |     1.9 |    -2.4 |
+------------+-------+--------+---------+-------+---------+---------+
Notation:
NP - number of pitches
runs100 - runs per 100 pitches
nip - not-in-play
ip - in-play
tot - all pitches

As you can see, I’ve broken out the results for not-in-play and
in-play pitches. When batters don’t put the ball in play, it appears
that Peavy’s best pitch is his curve. On the other hand, when the
ball is put into play (and these include home runs), his fastball is
his most effective pitch. Overall, as we shall see, Peavy’s fastball
and curve are two of the better pitches I’ve analyzed with the
pitch-f/x data.
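For anyone checking the arithmetic, the Total column in Peavy's table is consistent with a pitch-weighted average of the not-in-play and in-play columns. Here is a minimal Python sketch of that combination. The function name and data layout are hypothetical, and because the inputs are rounded to one decimal place, the result only matches the table to rounding.

```python
def combined_runs100(np_nip, runs100_nip, np_ip, runs100_ip):
    """Pitch-weighted runs per 100 pitches over all pitches of one type."""
    total_runs = np_nip * runs100_nip / 100 + np_ip * runs100_ip / 100
    return 100 * total_runs / (np_nip + np_ip)


# (NP_nip, runs100_nip, NP_ip, runs100_ip) for Jake Peavy, from the table above.
peavy = {
    "FB": (1257, -1.3, 224, -3.3),
    "SL": (555, -1.1, 139, 1.6),
    "CB": (390, -3.3, 80, 1.9),
}

for pitch, row in peavy.items():
    print(pitch, round(combined_runs100(*row), 1))
```

Because most pitches are not put into play, the not-in-play column carries most of the weight in the total.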

Here’s another:

Run values for Johan Santana's pitches
                        +------------------+-----------------+---------+
                        |    Not In Play   |     In Play     |  Total  |
+---------------+-------+--------+---------+-------+---------+---------+
| Name          | Pitch | NP_nip | runs100 | NP_ip | runs100 | runs100 |
+---------------+-------+--------+---------+-------+---------+---------+
| Santana_Johan | FB    |    534 |    -1.7 |    81 |     4.4 |    -0.9 |
| Santana_Johan | CU    |    241 |    -4.8 |    47 |    11.6 |    -2.1 |
| Santana_Johan | SL    |    100 |    -1.1 |    21 |    -0.6 |    -1.0 |
+---------------+-------+--------+---------+-------+---------+---------+

Santana’s change-up appears to be his most effective pitch, as we might
have expected, but he was better than average with the fastball and
slider, as well.

Now that we’ve gotten a feel for the run values of a pitch, let’s
go looking for the best pitches in the game.

Fastballs

Who has the game’s best fastball? That’s surely open to debate, but
what I can do here is show whose fastball was the most effective in 2007.

My pitch classification scheme doesn’t distinguish among four-seamers,
cutters or sinkers—all those pitches are considered “fastballs.”
I also include only pitchers with at least 500 identified fastballs,
and, finally, don’t forget that the pitch-f/x data is not complete, so
not all pitchers are included in the analysis. In any case, 120 pitchers threw enough identified fastballs to make it into
my sample.

Here are the top 20 fastballs of 2007:

Best Fastballs of 2007
+-------------------+-------+--------+-------------+-------+------------+-------------+
| Name              | Pitch | NP_nip | runs100_nip | NP_ip | runs100_ip | runs100_tot |
+-------------------+-------+--------+-------------+-------+------------+-------------+
| Bell_Heath        | FB    |    492 |        -2.0 |    81 |       -6.6 |        -2.7 |
| Young_Chris       | FB    |    770 |        -1.4 |   169 |       -7.9 |        -2.6 |
| Howry_Bob         | FB    |    539 |        -2.2 |   122 |       -4.2 |        -2.6 |
| Burnett_A.J.      | FB    |    554 |        -1.4 |   104 |       -5.0 |        -2.0 |
| Greinke_Zack      | FB    |    494 |        -1.3 |    99 |       -5.3 |        -1.9 |
| Kazmir_Scott      | FB    |    582 |        -1.8 |    96 |       -1.6 |        -1.8 |
| Correia_Kevin     | FB    |    475 |        -0.8 |    90 |       -7.1 |        -1.8 |
| Putz_J.J.         | FB    |    524 |        -2.2 |    90 |        0.7 |        -1.8 |
| Webb_Brandon      | FB    |   1020 |        -0.6 |   302 |       -5.3 |        -1.7 |
| Peavy_Jake        | FB    |   1257 |        -1.3 |   224 |       -3.3 |        -1.6 |
| Schilling_Curt    | FB    |    470 |        -1.8 |   111 |       -0.6 |        -1.6 |
| Penny_Brad        | FB    |   1213 |        -0.7 |   295 |       -4.6 |        -1.5 |
| Wilson_C.J.       | FB    |    558 |        -0.5 |   106 |       -7.1 |        -1.5 |
| Germano_Justin    | FB    |    687 |        -0.9 |   174 |       -3.5 |        -1.4 |
| Gorzelanny_Tom    | FB    |    463 |        -0.9 |    94 |       -3.8 |        -1.4 |
| Hughes_Phil       | FB    |    561 |        -1.5 |   113 |       -0.3 |        -1.3 |
| Hill_Rich         | FB    |    873 |        -1.6 |   187 |        1.1 |        -1.2 |
| Smoltz_John       | FB    |    698 |        -1.2 |   182 |       -1.3 |        -1.2 |
| Francisco_Frank   | FB    |    432 |        -0.9 |    82 |       -2.8 |        -1.2 |
| Morrow_Brandon    | FB    |    567 |        -0.9 |    79 |       -3.6 |        -1.2 |
+-------------------+-------+--------+-------------+-------+------------+-------------+
Notation:
NP - number of pitches
runs100 - runs per 100 pitches
nip - not-in-play
ip - in-play
tot - all pitches

Padres setup man Heath Bell throws hard—he averaged
above 96 mph for his 573 fastballs captured by pitch-f/x. His
teammate Chris Young, on the other hand, throws his fastball at
average speed (91 mph). Another teammate, Justin Germano, is a
confirmed soft-tosser whose fasty averages just 87 mph.

When you compare the not-in-play numbers with the in-play numbers, you
see some interesting things. J.J. Putz was actually below average
when the ball was put into play (six home runs off the fastball), but
was very good when the ball was not put into play. Putz, perhaps
implicitly, realized this and was able to limit the number of balls
put into play (only 90 out of 614 fastballs). Brandon Webb was just
the opposite—he was more effective on balls in play and in
fact, had one of the highest ratios of balls in play to pitches thrown
in this sample.

Sliders

Pitchers throw a lot of fastballs, so we had the luxury of requiring at
least 500 pitches when searching for the best fastball. If I made the
same requirement on sliders, I’d end up with 10 pitchers, which isn’t
much fun. Instead, I simply selected the top 20
pitchers in terms of number of sliders recorded by pitch-f/x. Here’s
the resulting list, ranked by runs per 100 pitches.

Sliders in 2007
+-------------------+-------+--------+-------------+-------+------------+-------------+
| Name              | Pitch | NP_nip | runs100_nip | NP_ip | runs100_ip | runs100_tot |
+-------------------+-------+--------+-------------+-------+------------+-------------+
| Marcum_Shaun      | SL    |    354 |        -1.3 |    87 |       -9.4 |        -2.9 |
| Blanton_Joe       | SL    |    361 |        -1.6 |    69 |       -7.5 |        -2.6 |
| Litsch_Jesse      | SL    |    368 |        -1.4 |   111 |       -5.3 |        -2.3 |
| Buehrle_Mark      | SL    |    378 |        -2.2 |   133 |       -1.1 |        -1.9 |
| Smoltz_John       | SL    |    540 |        -3.6 |   116 |        6.7 |        -1.8 |
| Young_Chris       | SL    |    432 |        -1.8 |    67 |        0.5 |        -1.5 |
| Hernandez_Felix   | SL    |    394 |        -2.3 |    52 |        5.8 |        -1.3 |
| Haren_Dan         | SL    |    586 |        -1.9 |   119 |        2.5 |        -1.2 |
| Gaudin_Chad       | SL    |    428 |        -2.3 |    87 |        4.6 |        -1.2 |
| Halladay_Roy      | SL    |    554 |        -1.5 |   171 |        0.1 |        -1.1 |
| Maddux_Greg       | SL    |    433 |        -1.3 |   161 |       -0.1 |        -1.0 |
| Marquis_Jason     | SL    |    373 |        -1.1 |   110 |       -0.8 |        -1.0 |
| Batista_Miguel    | SL    |    995 |        -1.0 |   195 |        0.7 |        -0.7 |
| Contreras_Jose    | SL    |    371 |        -1.0 |    98 |        0.4 |        -0.7 |
| Peavy_Jake        | SL    |    555 |        -1.1 |   139 |        1.6 |        -0.6 |
| Speier_Justin     | SL    |    357 |        -1.7 |    89 |        4.3 |        -0.5 |
| Matsuzaka_Daisuke | SL    |    348 |        -1.3 |    86 |        3.0 |        -0.4 |
| Millwood_Kevin    | SL    |    408 |        -0.6 |   132 |        0.6 |        -0.3 |
| Vazquez_Javier    | SL    |    398 |        -1.7 |    86 |        7.3 |        -0.1 |
| Davis_Doug        | SL    |    544 |        -0.7 |   153 |        2.3 |         0.0 |
+-------------------+-------+--------+-------------+-------+------------+-------------+
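The selection step can be sketched in a few lines of Python. The rows below are copied from the slider table above (not-in-play pitches, in-play pitches, total runs100); the function names are hypothetical. Applying the 500-pitch cutoff used for fastballs leaves 10 of these pitchers.

```python
# (name, NP_nip, NP_ip, runs100_tot), copied from the slider table above.
SLIDERS = [
    ("Marcum_Shaun", 354, 87, -2.9), ("Blanton_Joe", 361, 69, -2.6),
    ("Litsch_Jesse", 368, 111, -2.3), ("Buehrle_Mark", 378, 133, -1.9),
    ("Smoltz_John", 540, 116, -1.8), ("Young_Chris", 432, 67, -1.5),
    ("Hernandez_Felix", 394, 52, -1.3), ("Haren_Dan", 586, 119, -1.2),
    ("Gaudin_Chad", 428, 87, -1.2), ("Halladay_Roy", 554, 171, -1.1),
    ("Maddux_Greg", 433, 161, -1.0), ("Marquis_Jason", 373, 110, -1.0),
    ("Batista_Miguel", 995, 195, -0.7), ("Contreras_Jose", 371, 98, -0.7),
    ("Peavy_Jake", 555, 139, -0.6), ("Speier_Justin", 357, 89, -0.5),
    ("Matsuzaka_Daisuke", 348, 86, -0.4), ("Millwood_Kevin", 408, 132, -0.3),
    ("Vazquez_Javier", 398, 86, -0.1), ("Davis_Doug", 544, 153, 0.0),
]


def enough_pitches(rows, minimum):
    """Keep pitchers whose total pitch count meets the minimum."""
    return [row for row in rows if row[1] + row[2] >= minimum]


def rank_by_runs100(rows):
    """Most effective first: the most negative runs100 leads the list."""
    return sorted(rows, key=lambda row: row[3])


qualified = enough_pitches(SLIDERS, 500)
print(len(qualified))  # 10
print([row[0] for row in rank_by_runs100(qualified)[:2]])
```

With the stricter cutoff, Buehrle and Smoltz would head the list, and Marcum, Blanton and Litsch would drop out for lack of pitches.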

Now the sample size issue is becoming more important; more of these
guys have fewer than 100 in-play pitches and you can see the runs100
values for in-play pitches are jumping around quite a bit. I doubt
Shaun Marcum’s -9.4 runs/100 pitches on balls-in-play will hold up as
we get more data for him. But, hey, give the man credit: this is what he did in 2007.

Smoltz, of course, is famous for his slider and he is found high on
this list. I wonder if his poor showing on balls-in-play (+6.7 runs
per 100 pitches) might be a statistical fluctuation, which will come
down in time, moving him up on this list.

Change-ups

Here are the results for change-ups.

Change-ups in 2007
+------------------+-------+--------+-------------+-------+------------+-------------+
| Name             | Pitch | NP_nip | runs100_nip | NP_ip | runs100_ip | runs100_tot |
+------------------+-------+--------+-------------+-------+------------+-------------+
| Francis_Jeff     | CU    |    323 |        -1.8 |   104 |       -4.2 |        -2.4 |
| Blanton_Joe      | CU    |    258 |        -0.9 |    98 |       -5.2 |        -2.1 |
| Vazquez_Javier   | CU    |    259 |        -2.1 |    69 |       -0.7 |        -1.8 |
| Hendrickson_Mark | CU    |    234 |        -1.2 |    84 |       -3.1 |        -1.7 |
| Glavine_Tom      | CU    |    281 |         0.7 |    93 |       -8.5 |        -1.6 |
| Marcum_Shaun     | CU    |    322 |        -1.4 |    77 |       -1.4 |        -1.4 |
| Weaver_Jered     | CU    |    285 |        -1.7 |    77 |        1.2 |        -1.1 |
| Gaudin_Chad      | CU    |    517 |         0.6 |   153 |       -5.0 |        -0.7 |
| Contreras_Jose   | CU    |    342 |        -1.2 |    68 |        2.3 |        -0.6 |
| Buehrle_Mark     | CU    |    388 |         0.1 |   130 |       -0.4 |         0.0 |
| James_Chuck      | CU    |    350 |        -1.0 |   113 |        3.2 |         0.0 |
| Danks_John       | CU    |    251 |        -2.8 |    91 |        8.0 |         0.1 |
| Rogers_Kenny     | CU    |    288 |         0.2 |    99 |        0.0 |         0.2 |
| Willis_Dontrelle | CU    |    249 |         0.6 |    76 |       -1.0 |         0.2 |
| Colon_Bartolo    | CU    |    313 |        -1.6 |   105 |        6.0 |         0.3 |
| Capuano_Chris    | CU    |    260 |        -2.2 |    81 |        9.7 |         0.6 |
| Washburn_Jarrod  | CU    |    311 |         1.2 |    94 |       -0.6 |         0.8 |
| Penny_Brad       | CU    |    268 |        -1.4 |    82 |        9.9 |         1.2 |
| Moyer_Jamie      | CU    |    249 |        -2.1 |    73 |       17.6 |         2.3 |
| Maroth_Mike      | CU    |    255 |         0.9 |    71 |        9.5 |         2.8 |
+------------------+-------+--------+-------------+-------+------------+-------------+

Jeff Francis of the Rockies leads the list, with Blanton, Vazquez,
Hendrickson and Glavine rounding out the top five.

I found it interesting that Jamie Moyer, who is famous for his change-up, fared so
poorly with it in 2007. Opposing batters just murdered it when they
managed to put it in play, to the tune of 17.6 runs worse than average
per 100 pitches. He was likely unlucky on balls in play—at least I hope he was!

Curveballs

Finally, we come to the curveball.

Curveballs in 2007
+-----------------+-------+--------+-------------+-------+------------+-------------+
| Name            | Pitch | NP_nip | runs100_nip | NP_ip | runs100_ip | runs100_tot |
+-----------------+-------+--------+-------------+-------+------------+-------------+
| Rodriguez_Wandy | CB    |    293 |        -1.7 |    44 |      -14.7 |        -3.4 |
| Burnett_A.J.    | CB    |    399 |        -2.6 |    39 |       -8.9 |        -3.1 |
| Beckett_Josh    | CB    |    348 |        -2.6 |    47 |       -3.9 |        -2.8 |
| Peavy_Jake      | CB    |    390 |        -3.3 |    80 |        1.9 |        -2.4 |
| Marmol_Carlos   | CB    |    414 |        -2.4 |    50 |       -3.0 |        -2.4 |
| Haren_Dan       | CB    |    534 |        -1.6 |   108 |       -5.6 |        -2.3 |
| Perez_Oliver    | CB    |    287 |        -2.5 |    35 |        1.5 |        -2.1 |
| Arroyo_Bronson  | CB    |    381 |        -2.5 |    88 |        0.5 |        -2.0 |
| Weaver_Jeff     | CB    |    244 |        -1.7 |    76 |       -3.0 |        -2.0 |
| Sabathia_C.C.   | CB    |    272 |        -2.5 |    57 |        0.8 |        -1.9 |
| Lackey_John     | CB    |    663 |        -1.7 |   139 |       -0.4 |        -1.5 |
| Bell_Heath      | CB    |    285 |        -2.2 |    65 |        2.2 |        -1.4 |
| Hill_Rich       | CB    |    397 |        -2.0 |    65 |        3.4 |        -1.2 |
| Wells_David     | CB    |    375 |        -2.0 |    93 |        4.1 |        -0.8 |
| Blanton_Joe     | CB    |    261 |        -0.9 |    63 |        0.4 |        -0.7 |
| Santana_Ervin   | CB    |    450 |        -1.9 |    63 |        8.1 |        -0.6 |
| Washburn_Jarrod | CB    |    351 |        -1.6 |    88 |        3.4 |        -0.6 |
| Meche_Gil       | CB    |    277 |        -1.3 |    49 |        3.7 |        -0.5 |
| Halladay_Roy    | CB    |    393 |        -2.1 |    81 |        8.1 |        -0.4 |
| Germano_Justin  | CB    |    331 |        -1.3 |    75 |        9.3 |         0.6 |
+-----------------+-------+--------+-------------+-------+------------+-------------+

Oops, how’d Wandy get in there? Actually, I am told that Rodriguez does have a very good curveball. In any case, mentally discounting him for that
unsustainable -14.7 runs per 100 pitches for balls in play, our top
three curveballs belong to Burnett, Beckett and Peavy, which sounds
pretty good to me.

Wrapping up

I must confess, while I think the method I’ve used here is sound, I’m
feeling just a bit unsatisfied—I’m thinking that to really
nail this down, we need more data. With many pitchers having only a
few hundred pitches recorded for any given type of pitch, there is
necessarily a fair amount of noise in the run value that we end up with. So, you should consider these results more of a rough guide than
a definitive list of the best pitches.

There’s not much we can do about that right now, except be glad that
the pitch-f/x system will be in place in all parks for the 2008 season.

References & Resources


Further Reading: As many of you know, there are several people delving into the pitch-f/x data and producing excellent research. Lately, there has been quite a bit
of activity on the run value of individual pitches, and I’d encourage folks to look at recent work by Joe P. Sheehan and Mike Fast.



Pitch Classification: My pitch classification scheme is mostly unchanged since I described it in this article. However, I would like to acknowledge
fellow pitch-f/x researcher Mike Fast, who noticed a small problem with my classification and even suggested how to fix it. So, thanks to Mike for that.


Comments

  1. Shaq said...

    How was table 1 derived? I’m having a tough time drawing the connection between the triple slash and the values in that table
