December 6, 2013

# Pitch run value and count

by Max Marchi
December 04, 2009

Pitch run values have been around for a while. When you assign the run value to a pitch, two factors contribute to the final number: the outcome of the pitch and the count on the batter before the pitch was thrown.

As John Walsh showed when he first introduced pitch run values, the difference between a strike and a ball is much larger on a full count (-.349 vs .271) than on the first pitch (-.044 vs .038). So the pitch count is already factored into the equation and we can forget about it, right?
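To make that gap concrete, the short Python snippet below restates the four values quoted above and computes the ball-minus-strike swing at each count. The dictionary layout and the function name are purely illustrative.

```python
# Run values quoted above, as (strike, ball); negative values favor the pitcher.
run_values = {
    "3-2": (-0.349, 0.271),  # full count
    "0-0": (-0.044, 0.038),  # first pitch
}

def swing(count):
    """Difference in runs between a ball and a strike at the given count."""
    strike, ball = run_values[count]
    return round(ball - strike, 3)

print(swing("3-2"))  # 0.62
print(swing("0-0"))  # 0.082
```

The same call between ball and strike is worth roughly seven and a half times as much on a full count as on the first pitch.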

Not so fast.

You probably remember that our own Josh Kalk (now with the Tampa Bay Rays), when presenting the most effective pitches or the most lethal pitch combinations, often specified that he had adjusted run values for the pitch count. And surely you have read MGL, in the comment section of an article, criticizing the author for not having adjusted for pitch count.

What's happening? Haven't we already accounted for pitch count? A strike on 1-0 has a value of -.053 runs, while it's -.062 on 0-1. Why does the need for an adjustment resurface?

Let's go graphical.

A slider is thrown by a right-handed pitcher at the location shown above to a right-handed batter. What's the expected run value of such a pitch? Here we are oversimplifying, pretending that only the location influences the effectiveness of the pitch. Jeremy Greenhouse at Baseball Analysts has proposed a more advanced model that makes run value dependent on location, movement and speed.

Here's the average run value of a slider (from a righty to a righty) according to its location (MLB data, 2008-09).

The hypothesized pitch, still visible on the chart, will produce on average -0.008 runs.

What happens if we calculate the expected value for the same pitch on a 1-0 count versus a 0-1 count?

Something is counterintuitive when comparing the pair of charts above: Batters fare better on sliders down the middle when they are behind 0-1. A possible explanation is that when they're ahead 1-0, hitters aren't sitting on the slider (or they are simply waiting for an easier pitch to hit), so the outcome is usually a strike (-0.053 runs). By contrast, on a 0-1 count they can't afford to fall behind 0-2, so they swing at sliders clearly in the zone, with moderate success.

However, our pitch is expected to produce 0.017 runs if delivered on 1-0 and -0.016 runs on 0-1.

You surely don't need the next chart to know what's going on, but let me show it just to confirm what everyone is expecting.

Hitters expand their zone when they fall behind (the 50 percent swing zone on 0-1 has an area two and a half times greater than on 1-0), thus swinging at pitches that are harder to reach or to make good contact with.

Add to the mix that a pitcher who is ahead tries to exploit the batter's expanded strike zone (look below), and that a batter who is sitting on a favorable count can afford to let a pitch he doesn't like go by, and you are back to the run value charts shown above.

Let's now make up an extreme example. Suppose two identical pitchers exist. They both have an average fastball and a very peculiar slider: that slider always nails the location we have used so far.

Now, Pitcher A throws the slider only on 1-0 counts, while Pitcher B delivers it only on 0-1. Using run values unadjusted for pitch count would show that Pitcher B's slider is better than Pitcher A's, while the only difference is in the pitch selection.
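Here is a minimal Python sketch of the thought experiment. It assumes the simplest possible adjustment: subtract the league-average run value for the same pitch and location at the same count. That is an illustrative choice, not necessarily the adjustment Josh Kalk used, and the function and variable names are hypothetical.

```python
# League-average run value of the slider at this location, by count
# (the two values quoted above).
league_avg = {"1-0": 0.017, "0-1": -0.016}

def raw_and_adjusted(counts_thrown):
    """Return (raw, count-adjusted) average run value for a pitcher.

    counts_thrown lists the count on which each slider was thrown. Every
    pitch is assumed to produce exactly the league-average result for its
    count, because both pitchers throw the identical slider.
    """
    observed = [league_avg[c] for c in counts_thrown]
    raw = sum(observed) / len(observed)
    # Assumed adjustment: subtract what an average slider yields at that count.
    deltas = [rv - league_avg[c] for rv, c in zip(observed, counts_thrown)]
    adjusted = sum(deltas) / len(deltas)
    return round(raw, 3), round(adjusted, 3)

pitcher_a = raw_and_adjusted(["1-0"] * 100)  # slider only on 1-0
pitcher_b = raw_and_adjusted(["0-1"] * 100)  # slider only on 0-1
print(pitcher_a)  # (0.017, 0.0)
print(pitcher_b)  # (-0.016, 0.0)
```

The raw numbers make Pitcher B look .033 runs per pitch better, while the adjusted numbers are identical, as they should be for identical sliders.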

Thus, while pitch count is already factored into the calculation of pitch run values, we can't leave it out of our analyses, especially when evaluating the effectiveness of pitches or pitch combinations.

References and Resources
John Walsh—Searching for the game’s best pitch.
Joe P. Sheehan—More Run Values.

After creating a baseball rendition of The Beatles' Sgt. Pepper cover, Max began his baseball writing because he needed an excuse to show the picture. He wrote for an Italian audience for six years before making the jump to The Hardball Times. You can contact him by e-mail.

Nick Steiner said...

Very nice Max.  Now how in the world do you suggest we adjust for this when dealing with large amounts of aggregate data and multiple pitch types and locations?

Posted 12/04  at  07:44 AM
MGL said...

There is one other issue.  When you look at the 0-1 and the 1-0 data, the pool of pitchers is different.  In order to find the difference between the run values of the 0-1 and 1-0 sliders, you have to control for the identity of the pitcher by using the “delta method” or something like that.

IOW, the difference between the 1-0 and 0-1 slider is NOT .017 minus -.016 since those numbers are based on two different pitcher pools.  What the actual difference is (for any given pitcher), we have no idea.  I would guess that the pitchers who throw 1-0 sliders have better sliders, or at least more control (actually it may be a worse slider - more control but less bite - for example, Lidge throws a great slider with little control).

So, for example, a pitcher with a great slider which lacks a lot of control, might have a very good run value at 0-1, but if he threw it at 1-0, it might have a terrible run value because it would rarely be in the strike zone and the batter wouldn’t swing at it.  So the gap would be larger for this type of pitcher.  For a pitcher with a slider that he can control, the gap might be very small.  I am speculating of course.
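For illustration, here is one way the "delta method" could be sketched in Python: compare each pitcher only with himself, then average the within-pitcher differences, weighting by the smaller of his two samples. The weighting choice, the function name, and the numbers are all hypothetical; none of them come from the data behind the article.

```python
def delta_method(samples):
    """Weighted mean of within-pitcher run value differences (1-0 minus 0-1).

    samples maps pitcher -> {count: (mean_run_value, n_pitches)}.
    Pitchers missing either count are skipped, so both sides of the
    comparison are built from the same pool of pitchers.
    """
    weighted_sum = 0.0
    total_weight = 0
    for by_count in samples.values():
        if "1-0" not in by_count or "0-1" not in by_count:
            continue
        rv_10, n_10 = by_count["1-0"]
        rv_01, n_01 = by_count["0-1"]
        weight = min(n_10, n_01)  # assumption: weight by the smaller sample
        weighted_sum += weight * (rv_10 - rv_01)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# Made-up numbers, for illustration only.
toy = {
    "Pitcher X": {"1-0": (0.020, 50), "0-1": (-0.010, 100)},
    "Pitcher Y": {"1-0": (0.010, 200), "0-1": (-0.020, 50)},
    "Pitcher Z": {"0-1": (-0.030, 80)},  # never throws it on 1-0: skipped
}
print(round(delta_method(toy), 3))  # 0.03
```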

Posted 12/04  at  05:06 PM
Max Marchi said...

Nick, I believe most of the time you just have to keep the issue in mind: go with the straight values, and mention that different pitch selection might confound the results you are seeing. Maybe check that the pitchers at the extremes of whatever list you produce don’t show peculiar behaviour in pitch selection.

Sometimes, consider whether a stratified analysis is better. Probably ahead vs behind in the count is enough for many purposes (watch out for 0-0 and 3-0 counts!).
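A rough Python sketch of such a stratified analysis follows. The ahead/behind rule (more strikes than balls means the pitcher is ahead) is a simplifying assumption, the function names are hypothetical, and the sample pitches are made up for illustration.

```python
from collections import defaultdict

def count_state(balls, strikes):
    """Classify a count from the pitcher's point of view (assumed rule)."""
    if strikes > balls:
        return "ahead"
    if balls > strikes:
        return "behind"
    return "even"

def stratified_means(pitches):
    """Average run value per count state; pitches are (balls, strikes, run_value)."""
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum of run values, n]
    for balls, strikes, run_value in pitches:
        bucket = totals[count_state(balls, strikes)]
        bucket[0] += run_value
        bucket[1] += 1
    return {state: round(s / n, 3) for state, (s, n) in totals.items()}

# Made-up sample of pitches, for illustration only.
pitches = [(1, 0, 0.017), (1, 0, 0.017), (0, 1, -0.016), (0, 0, -0.008)]
print(stratified_means(pitches))
# {'behind': 0.017, 'ahead': -0.016, 'even': -0.008}
```

Under this rule 3-0 lands in the "behind" bucket and 0-0 in "even", which is exactly where the caveat about those two counts applies.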

Theoretically I would try to build a model to predict the expected run value of a pitch given its type (or, better, its speed and movement), its location, and the count when it was thrown. I don’t think it’s an easy task, maybe multilevel modelling would be required.

I don’t remember Josh explicitly writing how he did his adjustments, and I don’t think he’s in a position now to share his methods.

MGL, what you added is very important and confirms my beliefs that multilevel modeling is necessary for this issue. Unfortunately I need a lot more training on the subject - I do not have the slightest idea on how to combine multilevel modeling and loess smoothing.

Posted 12/05  at  12:20 PM
Nick Steiner said...

I’ve been looking into a way to predict run value based on its speed, movement and location, and it’s pretty much impossible.  You’re right that you would need a type of multivariate LOESS, but even that doesn’t emit a closed form equation, so it might not even be applicable.  I think you would need something like a Neural Net, and a very well calibrated one at that.

Posted 12/05  at  07:27 PM