The Internet cried a little when you wrote that on it

“The woman Folly is loud; she is undisciplined and without knowledge.” (Proverbs 9)

It’s easy to look stupid doing baseball analysis. Anyone who hasn’t isn’t trying. But you shouldn’t make it any easier than it has to be. Anything from the list below is almost guaranteed to get you into trouble unless you’ve really done your homework. Beware!

“Scientia potentia est.” (Bacon)

The PITCHf/x data set is an incredible resource, rich and complex, revealing depths to the game of baseball heretofore unseen. It can be a difficult data set to use well, particularly for the novice analyst. However, it also rewards the diligent.

If you call out for insight and cry aloud for understanding,
and if you look for it as for silver and search for it as for hidden treasure,

Then you will understand what is right and just and fair—every good path.
For wisdom will enter your heart, and knowledge will be pleasant to your soul.
Discretion will protect you, and understanding will guard you. (Proverbs 2)

Let’s cover the potential pitfalls in more detail.

1. [Pitcher] is using his [pitch type] a lot more this year, according to PITCHf/x, where [pitch type] = two-seamer, cutter, etc.

The pitch classifications in the PITCHf/x data, as shown on Brooks Baseball, Texas Leaguers and Fangraphs (not to be confused with the BIS pitch classifications also on Fangraphs), are done by an algorithm developed by Ross Paul at MLB Advanced Media. Ross has made significant updates to this algorithm every year, and one of the most noticeable impacts is that the percentage of pitches classified as two-seam fastballs has increased with each update. This does not mean that the pitchers themselves have changed anything about their pitch selection or the movement on their pitches.

If you want to know if a pitcher has really added a new pitch type, explore his data on a site like Texas Leaguers and see if he’s added a completely new pitch cluster from the previous year. Don’t pay attention to how the pitches are labeled; learn for yourself how the different pitch types behave. When you have mastered that, then you’re ready to identify when a pitcher has added a new pitch.

For example, take a look at the following two graphs that can be found on Texas Leaguers for Ricky Romero. They show the pitch speed versus spin axis angle for 2009 and 2010. The MLBAM pitch classifications are indicated by the different symbols on each graph. Note how Romero’s stuff has remained mostly the same from 2009 to 2010, but the classifications attached to the pitch clusters have changed dramatically.

image
image

2. [Pitcher] is tipping his curveball by releasing it higher than his other pitches.

The most important thing to know here is that almost all PITCHf/x data presented on the web, including that at the three websites listed in Item 1, lists “release” points measured at 50 feet from home plate. Most pitchers release the ball at about 54-55 feet from home plate, give or take a foot or so.

The next thing to think about is some simple physics. If you throw one type of pitch that drops less than a foot on its way to home plate (let’s call it Fastball) and another type of pitch that drops three feet on its way to home plate (let’s call it Curve), and you want both to end up in the strike zone, which one do you think you’ll need to release on a trajectory that’s initially going higher? Right. Curve. And where do you think you’ll see the biggest separation between the trajectories of the two types of pitches? Yeah, 5-10 feet out from release would be a good place to start to look. It just so happens that’s where all the PITCHf/x data you see is looking, too.

Of course, not every pitcher has a big-dropping, slow curveball such that he needs to release it pointed noticeably higher than his fastball, which is why you don’t see a huge separation for every pitcher. That’s not to say some pitchers don’t tip their pitches, or even that batters don’t pick up the difference between fastball and curveball that way even if it’s intentional by the pitcher. But take a closer look before you assume there’s something wrong with seeing the curveball higher than the fastball 50 feet out from home plate.
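If you would rather see the arithmetic than take my word for it, here is a minimal sketch in Python. It is a toy model, not the PITCHf/x trajectory fit: it assumes constant forward speed, constant downward acceleration, and no drag. The release distance (55 feet), release height (6 feet), target height at the plate (2.5 feet) and the two drops (1 foot and 3 feet) are all made-up round numbers, and `height_at` is a hypothetical helper. Both pitches leave the hand at exactly the same point.

```python
def height_at(distance_from_plate, drop_ft, release_dist=55.0,
              release_height=6.0, plate_height=2.5):
    """Height (ft) of a pitch at a given distance (ft) from home plate.

    Toy model: the pitch falls `drop_ft` below its initial straight-line
    aim by the time it reaches the plate, and the fall grows with the
    square of the fraction of the flight completed.
    """
    # Fraction of the flight completed at this distance (0 = release, 1 = plate).
    f = (release_dist - distance_from_plate) / release_dist
    # To arrive at plate_height, the pitch must be aimed drop_ft above it.
    aim_height = plate_height + drop_ft
    straight_line = release_height + (aim_height - release_height) * f
    return straight_line - drop_ft * f ** 2


fastball = height_at(50.0, drop_ft=1.0)  # assumed 1 ft of drop
curve = height_at(50.0, drop_ft=3.0)     # assumed 3 ft of drop
gap_inches = (curve - fastball) * 12

print(f"Fastball at 50 ft: {fastball:.2f} ft")
print(f"Curve at 50 ft:    {curve:.2f} ft")
print(f"Curve is higher by {gap_inches:.1f} inches")
```

With these assumed numbers the curveball is already about two inches higher than the fastball at 50 feet, even though the release point is identical. Change the drops to match a particular pitcher and the gap changes with them.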

3. [Pitcher] has lost velocity on his fastball, but he’s exchanged speed for movement.

There are typically at least two claims in these sorts of articles. The first is that a pitcher has lost fastball speed. That does happen on occasion and sometimes authors are correct about that. The smaller the sample size, the more suspicious you ought to be. Temperature has an effect on fastball speed, and for data from cold games in April you might want to pay attention to that.

The source for velocity information can be a radar gun or PITCHf/x cameras, and both obviously have measurement errors associated with them. In the PITCHf/x case, the camera systems are usually pretty good, but they can have calibration errors that can cause mismeasurement of pitch speeds by up to a couple of mph. A good way to check for this is to compare road data to home data.

For example, here you can see that Zack Greinke’s pitch speeds have been 1-2 mph faster at home than on the road this year, even as his velocity has increased throughout the year:

image

Once you are satisfied that the fastball velocity loss is real and not just a measurement artifact, you can turn to the second claim that so often accompanies the first. At least it does if the pitcher is still being effective with his lower velocity. If he’s not, you may find the opposite claim—that he has also lost movement—which we will cover in a moment.

Like changes in velocity, the changes in movement are subject to measurement and calibration errors in the PITCHf/x cameras. Shifts of 2 to 4 inches due to errors are fairly common. Again, comparing road data to home data is one way to check for these kinds of errors.

Here you can see that the spin deflections for Zack Greinke are shifted up and to the left by a couple of inches at home relative to the data from his road starts in 2010:

image
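The home/road check is simple enough to sketch in a few lines of Python. This is only an illustration: the pitches are made-up numbers, not Greinke's, `home_road_split` and the `is_home` key are hypothetical, and the keys `start_speed`, `pfx_x` and `pfx_z` are borrowed from what I understand to be the PITCHf/x field names for pitch speed and spin deflection. Run it on one pitch type at a time so a change in pitch mix doesn't masquerade as a calibration offset.

```python
from statistics import mean


def home_road_split(pitches, field):
    """Return (home_mean, road_mean, home_minus_road) for one numeric field."""
    home = [p[field] for p in pitches if p["is_home"]]
    road = [p[field] for p in pitches if not p["is_home"]]
    if not home or not road:
        raise ValueError("need at least one home and one road pitch")
    home_avg, road_avg = mean(home), mean(road)
    return home_avg, road_avg, home_avg - road_avg


# Made-up fastballs for illustration only (speed in mph, deflection in inches).
sample = [
    {"is_home": True, "start_speed": 94.0, "pfx_x": -6.0, "pfx_z": 11.0},
    {"is_home": True, "start_speed": 95.0, "pfx_x": -7.0, "pfx_z": 12.0},
    {"is_home": False, "start_speed": 93.0, "pfx_x": -5.0, "pfx_z": 9.0},
    {"is_home": False, "start_speed": 93.0, "pfx_x": -4.0, "pfx_z": 10.0},
]

for field in ("start_speed", "pfx_x", "pfx_z"):
    home_avg, road_avg, diff = home_road_split(sample, field)
    print(f"{field}: home {home_avg:.1f}, road {road_avg:.1f}, diff {diff:+.1f}")
```

A steady offset between home and road that shows up in every start is a hint that you are looking at the cameras rather than the pitcher. It is not proof, since the road number is itself an average over several differently calibrated parks.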

The other confusing factor here is the question of what “movement” really means and whether a change of a couple of inches is something that would even affect a batter. Or whether a shift in the observed direction is a good thing or a bad thing. If the pitcher has improved, it’s a good thing. If the pitcher has performed poorly, it’s a bad thing. QED. Right? If the author bothers to check any of these critically important facts, he or she is a better man or woman than 95 percent of the baseball writers out there.

4. The [expletive] umpire called a strike on a pitch to [left-handed batter] that was way outside!

Many analysts have shown that the average strike zone called by umpires extends a couple of inches outside the rulebook zone to right-handed hitters and several more inches beyond that to left-handed hitters. While an author who notes that the pitch was not inside the rulebook strike zone may be technically correct, the point that the umpire was being unfair to the team of interest may not be correct.

Fans love to hate umpires, so I won’t belabor this point.

5. [Pitcher] is struggling this year because he’s lost movement on his pitches.

We touched on this earlier in the “exchanging speed for movement” section. The first thing to check is whether any change in movement can be attributed to either a change in classification of the pitch types or to PITCHf/x measurement errors. I could probably stop right there because (1) almost no analyst does that and (2) most pitchers don’t really change the movement on their pitches that much, at least not that I’ve noticed. Josh Kalk did find some changes in movement within a given game for a pitcher who was tiring, but that’s a different kettle of fish.

If someone demonstrates that pitchers actually do lose movement on pitches (and what does “lose movement” mean on a slider, which typically has little or no spin movement anyway?), the next thing to ask is whether the change in movement matters to the performance of the pitcher. Dave Allen, among others, has given us a good beginning on the research on this question, but we’re far from having the final answers at this point. Unless the article addresses these topics, the talk about a pitcher’s performance changing because he lost movement is worthless fluff.

6. [Pitcher] is doing better this year because he’s increased the movement on his pitches.

This is clearly a corollary to the previous point, but flipped on its head. The same caveats apply. Pitchers typically add movement by trying brand-new pitch types. They don’t all of a sudden start snapping their wrists harder or finding extra snap on their fingertips to make the same old pitch type move more. I’m not saying that’s impossible, but if I had a nickel for every time someone claimed it happened and had to give back a dollar for every time it was true, I’d be well ahead in the game.

What should I do?

Instead of jumping immediately to speed, release point and spin movement for explanations simply because they are the numbers listed on Fangraphs, try looking at the pitcher’s results—balls in play, swinging strikes, called strikes, fouls and called balls—and working backward from there. Is the pitcher performing better because of better results on balls in play? Why? Look at his ball-in-play charts and see what you can learn.

Is the pitcher performing better because he’s getting more strikeouts? What pitch type is he getting those whiffs with? And against which handedness of batter? The same pitch type from the same pitcher looks very different to left-handed and right-handed batters.

Is the pitcher missing the strike zone more often with a certain pitch type? If so, where is he missing? Remember to consider batter handedness. Is he throwing the same location as always but now batters are laying off pitches below the zone that they used to swing at?
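Working backward from results can start with something as plain as a whiff rate split by pitch type and batter handedness. Here is a minimal sketch in Python. The pitch records are invented, `whiff_rates` is a hypothetical helper, and the `result` labels are my own shorthand rather than the descriptions that appear in the data. The `stand` key for batter side is borrowed from what I understand to be the PITCHf/x field name. I count a swing as a swinging strike, a foul or a ball in play, which is one common definition but not the only one.

```python
from collections import defaultdict

# Outcomes counted as swings (an assumption; define swings as you see fit).
SWINGS = {"swinging_strike", "foul", "in_play"}


def whiff_rates(pitches):
    """Map (pitch_type, batter_side) to whiffs per swing."""
    swings = defaultdict(int)
    whiffs = defaultdict(int)
    for p in pitches:
        if p["result"] not in SWINGS:
            continue  # taken pitches are not swings
        key = (p["pitch_type"], p["stand"])
        swings[key] += 1
        if p["result"] == "swinging_strike":
            whiffs[key] += 1
    return {key: whiffs[key] / n for key, n in swings.items()}


# Invented pitches for illustration: (pitch type, batter side, result).
rows = [
    ("CH", "L", "swinging_strike"), ("CH", "L", "swinging_strike"),
    ("CH", "L", "foul"), ("CH", "L", "in_play"), ("CH", "L", "ball"),
    ("CH", "R", "in_play"), ("CH", "R", "foul"),
    ("FF", "R", "swinging_strike"), ("FF", "R", "in_play"),
    ("FF", "R", "foul"), ("FF", "R", "foul"), ("FF", "R", "called_strike"),
]
sample = [{"pitch_type": t, "stand": s, "result": r} for t, s, r in rows]

rates = whiff_rates(sample)
for (pitch_type, side), rate in sorted(rates.items()):
    print(f"{pitch_type} vs {side}HB: {rate:.0%} whiffs per swing")
```

The same grouping works for called strikes, fouls or balls in play. Keep an eye on the sample size in each bucket, for the same reason you should be suspicious of a velocity change measured over a handful of pitches.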

I don’t intend this list to discourage newcomers from dipping their feet into the PITCHf/x water, playing around with the data, and even publishing unpolished results. Yes, I want to discourage sloppy analysis, but scaring people away is not the point. Educating people about common mistakes and encouraging good principles of analysis is what I’m after. I’ve also noticed a certain awe and air of invincible authority ascribed to PITCHf/x analysis around the web, whether it be shoddy or excellent. I hope to educate the reader of PITCHf/x analyses a little about how to tell the good wine from the cheap wine.

Comments

  1. Steve Slow said...

    Mike, this is excellent. I’ll admit that I’ve been guilty of many of these assumptions in the past, but I’ve simply never known better until now. Those are all some great suggestions and I’m sure I’ll be referring back to this article over the next couple of weeks.

    Now that you’ve written about what we shouldn’t do, any chance you could write an article about what we should do? I know you touched upon this throughout the article and gave many suggestions, but I’d love even more. If you’re analyzing a pitcher that just pitched poorly, how do you tell what was the issue? Well, wait, you answered that in the article. How should I phrase this…

    If we assume that pitch speeds, movements, and pitch usages all don’t vary much from start to start for a pitcher (and the differences we see are mostly a result of system differences), then what things can we take from looking at a player’s pitch f/x numbers for a start? Obviously whiffs and thrown strikes are great, but what else? Location of certain pitches? Pitch choice (for instance, choosing a change-up against a lefty is a good idea)? I don’t think I’m being very clear, but I would love for you to expand your last section more. Or else if you have another article you can direct me to in order to learn more, point away!

  2. Ike Hall said...

    Steve,

    I’m guessing you are thinking of the case where a guy has a bad start or outing, and actually is throwing the ball with less zip on it than he usually does.  This can and does happen (but probably not as often as you might think), and I don’t think Mike will take issue with that…

    So let’s say PitcherX has a bad game, and we desperately want to know “why” as much as we can… Velocity is usually the first thing many people turn to to try to find an answer.  So you pull up the pitchf/x data and lo, PitcherX’s fastball was averaging 91 mph that game instead of his usual 93.  What Mike is saying is that you can’t just stop there and say you’ve found something (at least not if you want to convince anyone).

    We know for instance that some parks’ pitchf/x systems are consistently “slow” and others are consistently “fast”.  So do you have some data from PitcherX pitching a “normal” game or two at the same park?  If not, do you have data for any other pitchers pitching at this pitcher’s home park and the park in question?  Can you establish how far off the park in question typically is from the pitcher’s home park?  (Relievers are particularly helpful in this regard.)  Next, could it be a one-day (or one-week) calibration issue with the system?  These are rare, but they can happen.  How do the other pitchers who pitched in the game stack up against their usual velocities?  Are they all low by about the same amount?  If so then PitcherX probably isn’t having a velocity problem.  If not, by now you’ve got a better idea that maybe he really is having an issue.

    Same things as above go with movement.  The thing to remember when you do these kinds of reports is that each pitchf/x system in each ballpark is slightly different, and there are bound to be subtle differences between them. When you take all games in all parks in aggregate, you can probably safely ignore a lot of those differences.  But when you get down to wanting to look at what happened in one single game, you can’t.  Especially if you are looking at differences that are on the order of a couple mph, or an inch or two of break…etc.  Because those could easily be the result of the subtle differences in the pitchfx system at two different parks.

  3. Mike Fast said...

    Thanks, Steve.  Ike definitely has some good comments. 

    I will say that I have mostly given up trying to figure out why a pitcher had a particularly good or bad game.  The one thing I would turn to if I had to have an explanation is location.  Leave pitches over the middle of the plate and it will eventually get you in trouble.  But it’s not as simple as that, either.  Not nearly every pitch over the middle of the plate will get drilled for a hit.  Pitch sequencing and deception play a role, too.

    I believe we need to understand more about how the pitcher-batter matchup works before we can really understand what happens on the smaller time scales like the game level.

  4. BenF said...

    Great article Mike.  There’s a lot of stuff there about the details behind PitchF/X that people aren’t aware of (me included), and not because they’re ignoring it, just because they’ve never been told.

  5. Astromets said...

    nice article!! wish i could take a class on this stuff taught as well as you wrote the article and just do this for a living – ah baseball is so good!
