Game theory is the next Moneyball

I had a great time at the second annual SABR Analytics Conference last week. The SABR folks put together another fine program, with several excellent speeches, panels, analytic presentations, and sessions set aside to discuss the business of baseball (Ben Lindbergh has some thoughts about the latter at Baseball Prospectus). I didn’t attend everything, but I want to mention some of the things I did.

The Diamond Dollars case competition

For the second year, Vince Gennaro created a baseball business case as a teaching opportunity for students interested in the business of baseball. Students at 11 schools read the case, analyzed their options and traveled to Phoenix to present their recommendations to a panel of judges. I was one of those judges and, as always, I learned something in the process.

This particular case was set one year from now, in the Angels’ front office. The Angels are considering whether to sign Mike Trout to a long-term contract or to let him run through his arbitration years, and the students have been asked to present their recommendations to General Manager Jerry Dipoto. (Footnote: I look nothing like Jerry Dipoto, though I’ve been told that I sound like Jeff Luhnow and look like Bill James.) We’re talking bullet points, graphs, financial numbers, baseball statistics, PowerPoint and the whole nine yards.

I’m always particularly intrigued by the ways different students account for the risk of long-term contracts but, truth be told, there’s not a lot of risk to signing Mike Trout to a seven-year deal (which is what most teams recommended). The primary issue was the amount of “arbitration discount” the students would be willing to concede for a long-term deal. Congratulations to NYU and Pepperdine for winning the undergraduate and graduate divisions, respectively.

The Rawlings announcement

There was a bit of cloak and dagger suspense concerning a “special announcement” on Friday’s agenda. SABR and Rawlings used the time to announce that they will be changing the voting rules of the Gold Glove Awards. Moving forward, Gold Glove winners won’t be selected based solely on voting by major league managers and coaches. Rawlings will also factor in a statistic, to be called the SDI (SABR Defensive Index), though voting will still carry the majority of the weight.

A panel of folks (anonymous at this point) will come together to construct the SDI. I assumed that this would be an open-source stat, but the press release indicates otherwise. If the stat is open-source, this is a very good thing (and kudos to SABR for making it happen). If the stat isn’t open-source, then it’s a bad thing. There shouldn’t be any closed doors when it comes to handing out awards. Let everyone know how the players are being judged, and be willing to take your lumps from those who disagree. That’s how we improve.

The presentations

I wanted to go to all the presentations, but many were scheduled against each other. I did see Graham Goldbeck present some really interesting data regarding how deep in the strike zone batters tend to hit the ball. It turns out that the optimal place to hit a ball for a home run is about a foot in front of the plate and that a couple of batters, such as Alfonso Soriano and Alexei Ramirez, typically hit the ball nearly two feet in front of the plate.

It also turns out that pitchers can be measured by how deep in the zone batters make contact (hint: in general, the faster the fastball, the deeper in the zone the ball is hit), though there are some notable exceptions—enough exceptions to make us want more data. Alas, this was HITf/x data (I think. Maybe it was the FIELDf/x data?) and it will be hard to come by.

The most important thing

What really opened my eyes, however, were the two Friday presentations about baseball and game theory. Game theory is a branch of economics in which economists and other mathy types study how competitors should compete against each other, given certain expectations and parameters. It’s complicated stuff, but the payoff can be sublime.

The penalty kick in soccer is an excellent example of game theory in sports. Let’s say you are a right-footed kicker and, given the way the ball comes off the foot, you typically kick the ball better when you kick it to your left (off the inside of your right foot). So you tend to kick the ball in that direction. However, the goalie knows you’re going to do this, so he (or she) anticipates your direction and dives to his (or her) right. However, you know the goalie is going to do this, so you kick to your right instead. And so on and so on.

Kind of like that scene from The Princess Bride, right? The one where the Dread Pirate Roberts (Westley to you and me) puts poison in one of two cups and Vizzini tries to choose the un-poisoned one by “out-thinking” him? Like Vizzini, you may think there is no way out of this endless cycle of potential soccer kicks; that the answer is to just randomly pick one side or the other (or to switch goblets when the other guy isn’t looking).

You’d be half right. You do want your actions to be random and unpredictable. But you don’t want to choose your options 50/50, because you are more likely to score when kicking to your left. You know it and the goalie knows it. So how often should you randomly kick to your left? And how often should the goalie randomly dive in that direction?

The answer, given typical success rates in professional soccer leagues, is 61 percent of the time. And the goalie should anticipate your kicking in that direction 58 percent of the time. Don’t believe me? Here’s the mathematical proof. In game theory terms, this is called a “mixed strategy,” in which you pursue different strategies but at optimal rates.
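For readers who want to see where figures like these come from, here is a minimal sketch of the calculation in Python. The four scoring probabilities are illustrative assumptions, picked so the answer lands near the percentages above; they are not numbers from this article, and the function name is mine. The idea is that each side chooses the mix that leaves the other side with nothing to gain from leaning one way or the other.

```python
def penalty_kick_equilibrium(score):
    """Solve a 2x2 kicker-vs-goalie game for its mixed-strategy equilibrium.

    score[k][g] is the probability of a goal when the kicker aims at side k
    and the goalie dives to side g, where 0 = the kicker's strong side and
    1 = the kicker's weak side. Assumes neither player has a dominant pure
    strategy, so the equilibrium is a genuine mix.
    """
    a, b = score[0]  # strong-side kick: goalie guesses right (a) or wrong (b)
    c, d = score[1]  # weak-side kick: goalie guesses wrong (c) or right (d)
    denom = (b - a) + (c - d)

    # Kicker's strong-side frequency: makes the goalie indifferent
    # between diving to either side.
    kicker_strong = (c - d) / denom
    # Goalie's strong-side frequency: makes the kicker indifferent
    # between kicking to either side.
    goalie_strong = (b - d) / denom
    # Goal rate when both sides play these mixes.
    goal_rate = kicker_strong * a + (1 - kicker_strong) * c
    return kicker_strong, goalie_strong, goal_rate


# ASSUMED scoring probabilities, for illustration only.
ASSUMED_SCORE = [[0.6992, 0.9291],
                 [0.9497, 0.5830]]

kick, dive, rate = penalty_kick_equilibrium(ASSUMED_SCORE)
print(f"kicker goes strong side {kick:.0%}, goalie covers it {dive:.0%}")
```

With these assumed inputs the kicker's strong-side share comes out at roughly 61 percent and the goalie's at roughly 58 percent. Change the matrix to reflect a particular kicker or goalie and the mix moves with it.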

The truly fascinating thing is that these percentages will lead to the best outcome for both sides. Neither the kicker nor the goalie will be able to improve their success by varying from these percentages over the long term. Game theorists say that this is the point at which both sides are “indifferent” to the other’s strategy. The extra cool thing is that professional soccer players actually fall in line with these percentages. Reality mirrors theory.
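One way to convince yourself of the "nobody can improve" claim is to hold the goalie at his equilibrium mix and let the kicker try everything. The sketch below does that with a set of assumed scoring probabilities (illustrative values, not data from this article); `goal_rate` is a hypothetical helper. It also shows the flip side: once the goalie strays from his optimal mix, the kicker does have something to exploit.

```python
# ASSUMED scoring probabilities: SCORE[kick][dive], where 0 = the kicker's
# strong side and 1 = the kicker's weak side. Illustrative values only.
SCORE = [[0.6992, 0.9291],
         [0.9497, 0.5830]]


def goal_rate(kick_strong, dive_strong, score=SCORE):
    """Expected goal rate when the kicker aims strong-side with probability
    kick_strong and the goalie dives strong-side with probability dive_strong."""
    total = 0.0
    for k, pk in enumerate((kick_strong, 1 - kick_strong)):
        for g, pg in enumerate((dive_strong, 1 - dive_strong)):
            total += pk * pg * score[k][g]
    return total


# Equilibrium mixes for this matrix (each side makes the other indifferent).
denom = (SCORE[0][1] - SCORE[0][0]) + (SCORE[1][0] - SCORE[1][1])
KICK_EQ = (SCORE[1][0] - SCORE[1][1]) / denom
DIVE_EQ = (SCORE[0][1] - SCORE[1][1]) / denom

# Try kicker mixes from 0% to 100% strong side in steps of 10 points.
vs_equilibrium = [goal_rate(p / 10, DIVE_EQ) for p in range(11)]
vs_coin_flip = [goal_rate(p / 10, 0.5) for p in range(11)]

print("spread vs equilibrium goalie:", max(vs_equilibrium) - min(vs_equilibrium))
print("best rate vs 50/50 goalie:   ", max(vs_coin_flip))
```

Against the equilibrium goalie, every kicker mix scores at the same rate, so there is nothing to gain by deviating. Against a goalie who simply flips a coin, the kicker's best reply with these numbers is to go strong side every time, and he scores more often than he would at equilibrium.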

Naturally, the percentages will vary by player and goalie, depending on their relative strengths and weaknesses. Most kickers have different underlying strengths and weaknesses, as do most goalies. Good players will adjust their percentages according to the nature of the opposition. Which brings us to pitchers and batters.

In one presentation, Middlebury sophomore Kevin Tenenbaum (subbing for Dave Allen, who is now a professor at Middlebury after several years of publishing PITCHf/x analysis) applied game theory to pitchers and batters and used complicated mathematical models to determine where pitchers should locate pitches in 0-2 counts. I thought the presentation was excellent and the fundamental conclusions made sense, but the math was beyond me. I’m not going to try to explain it here.

There is an easier way to become comfortable with game theory in baseball. Last December at THT, Matt Swartz published a five-part series on baseball and game theory. I considered them the most important sabermetric articles of 2012, though I admit that I’m biased. In the series, Matt laid out an entirely new way of thinking about what pitchers should throw on specific counts.

The basics of game theory
Applying basic game theory to pitch selection and finding that batters should take more often on 2-2 counts than 3-2 counts
Taking it a step further and showing that pitchers should throw their best pitch less often on 2-2 counts
Adding Bayes probability to factor in the batter’s decision process and calling him Willie Bayes
Relaxing some of the assumptions and pointing to future research

At the SABR Analytics Conference, Matt picked right up with that future research thing. He presented data from 2-2 counts and 3-2 counts and found that batters actually do swing more often at 3-2 pitches than 2-2 pitches (on pitches both in and out of the strike zone). He also expanded his analysis and found that pitchers and batters follow predicted behavior (throwing strikes and swinging at pitches) across all counts. Reality reflects theory.

Then Matt really dove into the data and found that baseball players aren’t always maximizing their opportunities. He assigned run values to specific pitches and ball/strike situations by pitch type and compared the run value of fastballs across all counts to the run value of non-fastballs across all counts. According to game theory, the run values of these two types of pitches should be equal in any given count. Subtract one from the other and you should get zero.

Doesn’t happen. Here’s the key table from Matt’s presentation, showing the difference in run value between fastballs and non-fastballs by count:

[Image: table of run-value differences between fastballs and non-fastballs, by count]

Most importantly, Matt found that fastballs are underused with no strikes on the batter and overused with two strikes on the batter. On the other hand, he found that batters swing too often with two strikes on them. Reality is no longer reflecting theory.
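The comparison behind a table like this is simple to sketch, even if the real work is in the data. The snippet below assumes a list of pitch records that already carry a count, a fastball flag and a run value. The records, the tuple layout and the function name are all invented for illustration; they are not Matt's data or his code. It averages run value for fastballs and non-fastballs within each count and reports the gap, which game theory says should sit close to zero.

```python
from collections import defaultdict


def fastball_gap_by_count(pitches):
    """Return {count: mean fastball run value - mean non-fastball run value}.

    pitches is an iterable of (count, is_fastball, run_value) tuples, with
    run_value measured from the batter's point of view. Counts that lack
    either pitch type are skipped.
    """
    # For each count, keep a [sum, n] pair for fastballs and non-fastballs.
    totals = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for count, is_fastball, run_value in pitches:
        bucket = totals[count][bool(is_fastball)]
        bucket[0] += run_value
        bucket[1] += 1

    gaps = {}
    for count, by_type in totals.items():
        (fb_sum, fb_n), (nfb_sum, nfb_n) = by_type[True], by_type[False]
        if fb_n and nfb_n:
            gaps[count] = fb_sum / fb_n - nfb_sum / nfb_n
    return gaps


# Made-up records, purely to show the shape of the input.
SAMPLE = [
    ("2-0", True, 0.02), ("2-0", True, 0.04), ("2-0", False, 0.08),
    ("0-2", True, -0.03), ("0-2", False, -0.07), ("0-2", False, -0.05),
    ("3-0", True, 0.05),
]

gaps = fastball_gap_by_count(SAMPLE)
for count, gap in sorted(gaps.items()):
    print(count, round(gap, 3))
```

Under this sign convention, a negative gap means fastballs are costing the pitcher less than the alternatives in that count, so he could afford to throw more of them; a positive gap means the opposite.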

Matt took it even further by examining different types of pitcher specialists: those with the most effective change-ups, sinkers, sliders and curveballs. I’m not going to review all of his findings because they were basically consistent with the above table. Plus, you can download Matt’s PowerPoint slides from the References section below. The bottom line, however, is that all types of pitchers follow the same type of behavior.

There’s gold in this information. I know I’m purposely overselling things when I say that game theory is the new Moneyball, but given all of the resources and data available to major league teams, how can they ignore this extra level of analysis that can directly yield results? Many teams have three or four (sometimes more) data analysts on staff. How about hiring someone with game theory experience as well?

During one of the panel sessions—the one with Brian Kenny, Bill James and Joe Posnanski—the panel talked about the trend toward breaking up a manager’s job into different roles. One of those roles was a bench/game strategy coach; someone to worry about the next steps in the game, get the bullpen ready, select the pinch hitters, decide when to bunt (if ever!), etc. It’s a good idea, and it screams for someone who can think in terms of game theory.

I don’t know how it will play out, but smart major league teams are going to get on top of this. Count on it.

References & Resources
Here are Matt’s PowerPoint slides. The link is to a Dropbox file, but you shouldn’t need a Dropbox account to view the presentation or download it.


Dave Studeman was called a "national treasure" by Rob Neyer. Seriously. Follow his sporadic tweets @dastudes.
36 Comments
Carl
11 years ago

Awesome article.  Thank you so much.

The game theory explanation is excellent. Makes me want to apply it to batters who refuse to hit against a shift. If only Tex and Dunn (and their managers) understood that by not hitting/bunting the other way almost 40% of the time they’re hurting their results.

Alan Nathan
11 years ago

When Matt’s five articles came out last year, I read them all thoroughly, working out all the details.  All that stuff was new to me but Matt made it all make sense.  I missed Matt’s talk at SABRanalytics due to a conflicting talk.  However, I look forward to working through his slides. 

BTW, I love to tell my John Nash story.  From Wikipedia:  “At Princeton, campus legend Nash became ‘The Phantom of Fine Hall’ (Princeton’s mathematics center), a shadowy figure who would scribble arcane equations on blackboards in the middle of the night.”  I was a grad student in physics there at the time (early 70’s) and Fine Hall was co-joined with Jadwin Hall, which housed the physics department.  I would regularly see his work!  But I didn’t draw the connection with Nash until much later, when I first saw the movie A Beautiful Mind.

William Juliano
11 years ago

Game Theory works “in theory”, but sports situations usually introduce lots of different variables that need to be taken into consideration, not the least of which is player skill. That’s why hitting against the shift isn’t necessarily a good application for game theory.

philosofool
11 years ago

This is a minor technical point, but you cannot actually be indifferent to optimal strategies because in most types of games if you don’t use your optimal strategy and your opponent does, you should expect to lose more than you would if you used an optimal strategy. Oddly, if your opponent uses a suboptimal strategy and you know about it, you may have a strategy that’s better than your optimal strategy. It is also in your opponent’s interest to make you think he’s using a suboptimal strategy….

Think about Rock, Paper, Scissors: if you know your opponent will play rock half the time and paper half the time, but never scissors, you should always play paper. But if your opponent is smart, you should pick rock, paper or scissors with equal probability. If you did that against the dummy who never plays scissors, you would actually be worse off. (RPS is actually one of the lame games where playing optimally provides no advantage over suboptimal strategies. Most games aren’t like that; usually when you play optimally and your opponent doesn’t, you will win more than you would have if he played optimally.)

What makes a pair of strategies optimal is that if you and your opponent are both using optimal strategies, neither can unilaterally improve his results by changing strategy.

@Carl You probably don’t need to bunt more than about 20% of the time to be completely effective. If you show a willingness to bunt into a shift just once, they are going to have to think seriously about whether they are basically just awarding you first base and will probably play straight.

Carl
11 years ago

Guys, consider the shift from the other standpoint. 

For example, Minnesota refuses to shift, while Tampa just loves to. Against a strong pull hitter who refuses to even try to bunt/hit the other way (as far as skill, it is required but refusal to try to hit the other way strikes me as a combo of arrogance/laziness). Dunn/Tex hit much lower against Tampa than against Minnesota (both in theory and practice). Minnesota is foolish for not trying a switch.

Carl
11 years ago

Philosofool,
You are probably right that a bunt (or two) would make a team stop a shift, which would improve the pull hitter’s future chances of a hit.

It would be interesting to see how many hits the other way it would take to get Joe Maddon/Tampa to halt shifts.

PS. On that last post, I typed switch, but meant shift.

Tim
11 years ago

There’s a very big problem with game theory and baseball which I noticed a few years back. (Whatever year it was Torii Hunter invented his “just take off” stolen-base strategy. 2006? He was still with the Twins.)

I ran some numbers on that and discovered that the equilibrium for pickoffs vs that strategy is really a lot of pickoffs. Far beyond what anyone throws now. If players do eventually get somewhere near equilibrium on base-stealing, the game is going to become unwatchably slow. So I’ve been kind of dreading the moment that teams start looking at game theory more closely.

Different Tim
11 years ago

As far as bunting to stop a shift, I don’t think the big bruisers bunting would stop a shift. It’s much more beneficial to let the guy take his chances at bunting, where he could pop out or foul into a two strike count very easily, instead of letting him have his choice of yanking the ball with all his might. At worst, he’s on first and it took less pitches than a walk, not trotting across home plate.

Guy
11 years ago

Studes:  Nice report.  One question I had is what is the evidence that hitters swing too much with 2 strikes?  Perhaps it’s in the presentation, but if so I missed it.

As I mentioned over at Tango’s blog, one potential problem with this analysis is that Matt hasn’t controlled for hitter quality (or base/out).  I suspect that once you controlled for those factors, you would find that pitchers are closer to optimal.  For any given count, pitchers won’t use FB and non-FB in the same proportions against both good and bad hitters.  For example,  breaking pitches will only be thrown at 2-0 and 3-0 to very good hitters, which is likely why the run value is higher.  Once you adjust for hitter ability, my guess is most of the apparent inefficiency will disappear.

studes
11 years ago

Guy, I believe Matt did adjust for batter quality and found it made only a small difference in the results.  Also, the two-strike thing is in the presentation (or else I misread it).

Sabertooth
11 years ago

Redefining gamer.

Guy
11 years ago

Thanks, Studes, now I see it (had to download the PPT to see the notes.)  However, I have to say I’m not convinced.  To know if there is really an inefficiency here, we need to control for hitter quality, pitcher quality, and—most important – platoon advantage.  My strong suspicion is that once all those factors are controlled for—which admittedly, is not easy to do—the run gap will mostly vanish.  One relatively easy test might be to re-run this table based only on RHP facing RHH (and maybe limit sample to 90-110 wRC+ and 90-110 ERA- players.)  My guess is the run differentials will shrink quite a bit.

Matt Swartz
11 years ago

Maybe. I may get a chance to do that, and that is one on my list of new ways to check this. But in the meantime, I’d have a hard time coming up with a compelling reason why there were such huge differences for changeup pitchers and curveball pitchers, and other subgroups with smaller platoon splits. Why wasn’t it bigger for RPs than SPs, when RPs have bigger platoon splits? Why was it similar across pitchers and hitters with similar K%? While I didn’t have handedness in my file, there’s nothing about any of the results I have that suggests platoon splits are playing a major role here. This table held up across many subgroups to varying degrees. The “specialty secondary pitch” subgroups were by far the biggest factor.

studes
11 years ago

Guy, is there a specific reason you’re skeptical here? 

Speaking for myself, I have no problem believing that athletes behave suboptimally in key situations.  A good example is the third base coach’s decision about sending the runner home—you can make a good case that coaches aren’t aggressive enough because they’re “afraid” of having a runner thrown out at home.  There was another good example in this year’s THT Annual, in which Dave Allen found that batters don’t take enough advantage of the hole at first with a runner on—presumably because they’re *too* concerned about the double play and so avoid ground balls.

Guy
11 years ago

Studes:  I’m skeptical in general because among all the studies I’ve seen that identify suboptimal behavior in sports, at least 90% are soon debunked.  A good example is the Kovash/Levitt paper that Matt cites, which ignored the impact of count.  And you mention 3rd base coaches, but the only study I’ve seen on that turned out to be wrong too. (Heck, there isn’t even good evidence that OBP was undervalued pre-Moneyball!) So my default assumption is that this is likely wrong, but there are a few exceptions, and this could be one.  (And now I will have to read Dave Allen’s piece.)

In this particular case, I’m skeptical because Matt is saying that “not enough” FBs are thrown in counts where pitchers already throw a HUGE number of FBs (78% at 2-0, 94% (!) at 3-0).  How likely is that?  More importantly, the results suggest that pitchers only make mistakes at the extremes, when they have a clear advantage or disadvantage.  But those are exactly the times when the pitch distribution is likely to shift based on the pitcher:hitter matchup: with no strikes, I’d expect non-FBs to mainly be thrown to good hitters with a platoon edge; with 2 strikes, I’d expect non-FBs to be thrown disproportionately to weak hitters and when pitcher has platoon edge.  Since Matt’s results exactly match my intuition, I’m skeptical they reflect “irrational” behavior.  I also find it very unlikely that curveball specialists, or slider specialists, behave less optimally than anyone else.

Matt Swartz
11 years ago

They behave like other pitchers. Read through the footnotes carefully. It should clear up your skepticism. There’s good intuitive reason why copying is generally what generates a learning-by-doing kind of optimality, and good reason why pitch type specialists would therefore be inclined to copy other pitchers too much. If you disagree, I’m okay, but if you want to publicly disagree, I’d suggest at least reading the notes I spent months putting together, and hours typing up cleanly for distribution after.

Guy
11 years ago

I understand the argument, Matt, I just don’t find it very plausible. I think pitchers’ pitch distribution is based on their own trial-and-error results more than imitating other pitchers. But I could be wrong. Maybe you can find some other validation of your conclusions. For example, you say that curveball pitchers “really make their mistakes” on 2-0 pitches. So you could check to see if these pitchers have worse results on average at 2-0. I looked at your two highlighted pitchers, and both Wainwright (tOPS+ 172) and Volquez (tOPS+ 154) are slightly better than average (177) at this count. Cole Hamels and Mark Buehrle are two other guys who should perform poorly at 2-0 according to your analysis, but both are better than average. Doesn’t prove anything of course, just the first 4 guys I looked at. But if these specialized pitchers really do mess up at 2-0, there should be other evidence at the scene of the crime.

Matt Swartz
11 years ago

The pitchers you cite have great performances on 2-0 counts BECAUSE they have such great results on fastballs in 2-0 counts. These groups were not taking advantage of strengths- they were listed as irrational because they behaved like other pitchers in 2-0 counts despite very different results. Same thing with slider pitchers in 2-strike counts. Exceptional performance despite pedestrian fastball performances. When they did throw sliders, they crushed. Same with curveball and changeup pitchers with 2-0 counts.

Guy
11 years ago

Sorry, I don’t understand your argument, Matt. If what distinguishes these pitchers is that they make poor (suboptimal) pitch selections at certain counts, then it seems to me we should expect to see worse performance at these counts (relative to the pitchers’ overall talent level).  How can great performance at a given count possibly be evidence of suboptimal pitch selection?

BTW, this old article by Max Marchi shows the large differences in platoon differential by pitch type: http://www.hardballtimes.com/main/article/platoon-splits-2.0/. Seems likely that pitchers are often deploying different pitches for any given count, based on whether they have a platoon advantage/disadvantage.

Matt Swartz
11 years ago

No, suboptimality is just different values for different pitches. If you have a 200mph fastball, you should throw it more often than other pitchers. But you would still have better performances in 2-0 counts overall.

Guy
11 years ago

Matt: in every case I was comparing these pitchers to themselves. They pitch better at 2-0, relative to their overall OPS, than the average pitcher. If they are suboptimal in pitch selection at 2-0, but not at other counts, that should be apparent in their relative performance.

Mike
11 years ago

You are missing a minus sign in the 2 strikes, 1 ball box.

studes
11 years ago

Guy, the comment about batters swinging too much with two strikes and about Matt controlling for batter quality (in the notes) is on slide 27.

Matt Swartz
11 years ago

No, you want to measure performance on fastballs vs non-fastballs. Just doing well in a given count is not what to measure. It misses the obviously suboptimal behavior of failing to take advantage of a relative strength. Also I fail to see why you’d argue from anecdotes of individual pitchers.

Guy
11 years ago

C’mon Matt, I’m not arguing from anecdotes.  I was just suggesting that one way to validate your results from a different perspective was to “check to see if these [specialist] pitchers have worse results ON AVERAGE at 2-0.”  I even added that my examples “doesn’t prove anything of course—just the first 4 guys I looked at.”  I obviously don’t expect that the specialists actually do perform worse at 2-0, but if your theory is correct that’s what you should find.

I’m thinking you may not have seen the discussion of the Kovash/Levitt paper over at The Book Blog: http://www.insidethebook.com/ee/index.php/site/comments/game_theory_on_pitch_selection/.  It’s very long (because there is so much wrong with the paper), but you should look for Mike Fast’s data on FB and non-FB, based on 3 seasons, at comments 30, 33, 42, 50 and 64.  Basically, he finds similar run differentials to yours, but once the results are adjusted for hitter quality much of the gaps vanish.  The table at 42 gives the difference in hitter quality.  For example, at 2-0 the hitters seeing non-FB are better hitters (+.009 wOBA) than those seeing FBs.  I think once you account for hitter quality, and also for handedness—which will greatly magnify these differences in hitter quality—almost all your apparent FB/NFB differences will be accounted for. 

The biggest exceptions are the 2-0 and 3-0 counts, where your data indicates a much larger run gap than Mike shows, and much larger than hitter quality can explain.  I think a full analysis of these two counts would really need to account for RE factors, as well as platoon, hitter and pitcher quality.  The decision to throw breaking balls in these counts (which happens infrequently) is effectively a decision that you don’t mind walking this hitter, and we can’t evaluate that decision without knowing the context.

The other anomalous count is 3-2, where hitters do better on FBs even though the pool of hitters seeing FB is actually weaker.  Very interesting result, and maybe this is an inefficiency.  But I still wonder whether handedness and base/out factors can explain the gap.

Matt Swartz
11 years ago

Looking through the posts you mention, it seems that (a) my two-strike results held up (too many fastballs), and (b) the 0-strike results seemed to be smaller and evaporate when adjusted for hitter quality. But that was a higher scoring run environment and probably could explain why fastballs might have hurt more in 2-0 counts in 2009 than 2012. That fits my theory nicely, which is that optimization comes from learning by doing and copying (directly or just by listening to your coaches and catchers). Maybe some pitchers still have 2009 strategies in 2012.

My theory doesn’t say that pitchers who have suboptimal strategies should do worse in counts than themselves or other pitchers. It should say the opposite- failing to capitalize on a unique strength makes a pitcher’s relative performance (to himself or others) in a given count good but still fail to be as good as it could be.

I will try to bring handedness and eventually base/out stuff into my data, but based on what you showed me in that thread, there’s already a good estimate of the magnitude of these adjustments and it won’t explain the results anyway.

studes
11 years ago

Guy, thanks a lot for posting that link.  I had forgotten about it.  For those interested in this subject, I suggest you read it as well as Matt’s presentation here.

Guy
11 years ago

“I will try to bring handedness and eventually base/out stuff into my data, but based on what you showed me in that thread, there’s already a good estimate of the magnitude of these adjustments and it won’t explain the results anyway.”

Not at all.  Nothing in that thread adjusts for pitcher and hitter handedness.  Once you incorporate that, I think it will explain ALL of your results except the 2-0 and 3-0 data (assuming 2012 isn’t anomalous and your results are correct as reported), which really need to be viewed through a WPA or RE lens, and maybe the 3-2 count. 

And yes, your theory very much does require the offspeed specialists to do worse in certain counts than we would expect. That is, if the average pitcher allows a wRC+ of 130 at 2-0, and these specialist pitchers are choosing a suboptimal pitch selection at 2-0 while most pitchers do not make this mistake, then the specialists should have a wRC+ > 130. The only alternative is to believe these pitchers should for some reason outperform specifically at 2-0, which is magically offset by their poor pitch selection at this very same count (how convenient!). And you seem to be saying that they should perform BETTER than average at 2-0 despite poor selection. How does that make sense? This isn’t complicated, Matt, and I know from reading your work that your math skills dwarf my own, so it seems you are being willfully obtuse here.

(Note on the PPT:  on slide 24 you report the percent in K-zone as virtually the same on all 0-strike counts.  Yet the % of FB varies hugely over these 4 counts.  Perhaps you aren’t accounting for umpires’ larger zone as # of balls increases?)

Guy
11 years ago

To clarify, when I say a wRC+ of 130 at 2-0 I mean relative to a pitcher’s own personal RC level. So if the average pitcher allows 30% more runs at 2-0 than he allows overall, we should expect the offspeed specialists to do worse than that.

Matt Swartz
11 years ago

Really don’t think I’m being willfully obtuse, though I’ll gladly take the compliment on my mathematics. Here’s the hole in your corollary, though: batters take a lot of pitches in 2-0 counts versus some pitchers (e.g. Changeup specialists, whose fastballs batters fear are really changeups) but not others. This makes 2-0 fastballs very potent for these types of pitchers, but they don’t capitalize. It doesn’t make their changeups less effective. But they do better than pitchers who batters swing at more in 2-0 counts.

Matt Swartz
11 years ago

Guy, I did check my data many times, and I didn’t rig a theory to fit results. If you want to be sure of the theory predating the results, read my articles from December where I said that people with specialized out pitches would need to vary their pitch selection and were probably a source of inefficiency. I’ll try one other way of explaining why this is true: the reason copying other pitchers won’t work is when the relative strength of your pitches differs from other pitchers, not the absolute strength of all your pitches. I could have easily lumped by fastball frequency too.

Also, you’re still completely ignoring the fact that 2-strike counts can’t be explained by hitter quality because the run values on non-fastballs are lower. That’s what Mike had in that thread, too.

I’m glad to hear more research ideas, but I can’t just hit a button and splice the data another way when that’s not in my data set. I don’t have the 2011 data handy, just because you doubt my results. I appreciate the extra questions that let me explain things further or highlight work I did but didn’t make the talk, but at this point, you’ve just started making accusations about rigging theories to fit data and latching on to broadly similar results that Mike got in 2009, which seem to validate my 2-strike results and don’t invalidate my 0-strike results. I’m happy to correct myself when I find mistakes. Read through the slides and you’ll see a couple places where I mention my hypotheses from my December series were wrong. But implying I’m being disingenuous or sloppy is unwarranted and pointless.

BobDD
11 years ago

I wonder if the pros have similar attitudes (confidence issues) as amateurs. I was a good hitter but did not play past high school. What I remember about a 2-0 count was that there was no fear at all of a strike because it was still a hitter’s count after that. I was more fully aggressive about getting a ball to hit hard rather than “defending” the strike zone thus taking less decision time about swinging. Are MLB hitters too good to have those issues influence this?

Guy
11 years ago

Matt:  I take you at your word on when you developed your theories.  I was commenting not on the theories’ timing, but on whether they seem logically plausible (to me, they don’t) and select the most likely inefficient pitchers (clearly not).  But it’s probably not worthwhile to debate theories that seek to explain an inefficiency I don’t think actually exists. (I do wonder, though, whether these apparent inefficiencies are associated mainly with low FB%—which would also be consistent with the results you report.)

On 2-strike counts, at 0-2 and 1-2 the hitters who see non-fastballs are weaker, which does help explain your findings. Furthermore, the run differences at 2 strikes are small enough that platoon effects could easily explain them (except perhaps at 3-2, as I mentioned).

I’m not saying your 2-0 and 3-0 data is wrong (unless of course you forgot to exclude IBB—actually, it would probably be a good idea to just exclude all 4-pitch BBs). But at this point I’m obviously going to give more weight to 3 years of data (Fast) than 1 (yours), so I’m going to assume his estimates of the run differential are closer to correct. It’s obviously entirely up to you whether you incorporate more years of data in your future work. But 1 year of data just isn’t very much when you consider how infrequently pitchers throw breaking balls on these counts. Don’t get fooled by your superficially large Ns—a small number of pitchers likely account for a large share of these NFB samples. And as we’ve discussed, the 2-0 and 3-0 counts aren’t really a question of pitch selection per se. Throwing a breaking ball at 2-0 or 3-0 is largely a decision to concede a BB. The choice of FB vs NFB is just a reflection of that prior decision, which is what you need to evaluate.

At some point, you or another researcher will do a full, proper study of this, using several seasons of data, employing an RE framework, and controlling for hitter quality, pitcher quality, and handedness. And then we will all know the answer. If large inefficiencies are found, driven by your “specialists,” I’ll happily pop by here, prostrate myself, and admit error. But I’m not too worried…
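
The sample-size point above (that a large N of non-fastballs may come mostly from a handful of pitchers) can be checked with a short calculation. This is only a sketch: the per-pitcher counts below are made up for illustration and are not from Matt’s or Mike Fast’s data, and the function name `concentration` is hypothetical. It reports the share of pitches thrown by the top few pitchers and an “effective number” of pitchers (the inverse of the sum of squared shares), which shrinks as the sample gets more concentrated.

```python
def concentration(counts, top_k=10):
    """Summarize how concentrated a pitch sample is across pitchers.

    counts: number of non-fastballs thrown by each pitcher at a given count.
    Returns (share of all pitches thrown by the top_k pitchers,
             effective number of pitchers = 1 / sum of squared shares).
    """
    total = sum(counts)
    shares = [c / total for c in counts]
    top_share = sum(sorted(shares, reverse=True)[:top_k])
    effective_n = 1.0 / sum(s * s for s in shares)
    return top_share, effective_n


# HYPOTHETICAL data: 5 pitchers throw 100 non-fastballs each at 2-0,
# while 95 other pitchers throw only 5 each.
counts = [100] * 5 + [5] * 95

top_share, effective_n = concentration(counts, top_k=5)
print(f"pitchers in sample:        {len(counts)}")
print(f"share from top 5 pitchers: {top_share:.1%}")
print(f"effective no. of pitchers: {effective_n:.1f}")
```

If the effective number of pitchers is far below the number of pitchers in the sample, the run values for that cell mostly describe a few individuals rather than pitchers in aggregate.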

Guy
11 years ago

Matt: Let’s drop the issue of validating your theory by examining pitchers’ performance at their alleged suboptimal counts, since you clearly aren’t interested in it. In any case, it’s only a second method for checking your theory.  Simply doing the FB/NFB values correctly will eliminate most or all of the run differences anyway.

FYI, I noticed that Mike Fast’s data is very similar to yours in most counts.  However, his FB/NFB difference is much, much smaller at 2-0 (-0.5) and 3-0 (-0.6).  I obviously don’t know whether Mike made a mistake, you made a mistake, or 2012 is just anomalous.  But since those counts are by far your two most extreme results, and the n for NFBs is fairly small in both cells, I’d suggest you review those 2 results and perhaps look at other seasons. If Mike’s data are close to right, then the run gap in every cell but one is small enough to plausibly be explained by pitcher/hitter quality, platoon, and RE factors.

The one exception is 3-2, which is a very interesting count.  Perhaps pitchers throw FB much more to opposite-handed hitters at this count.  If not, then it does appear too many FB are thrown.

*

A final point: I don’t see any good a priori basis for focusing on the top-10 CU, SL, CH pitchers, except for the fact that they turned out to have large apparent differentials. If “rare skill” is what allows them to be inefficient, then you want to look at dominant pitchers, guys who are so good they can afford to be inefficient and still keep a job (e.g. the pitchers with highest FB velocity). Or if “specialization” is your theory, these are actually among the pitchers with the most diverse (i.e. least specialized) repertoire in MLB. The most specialized are the guys who throw 70% FB, but you don’t mention them.
In the end, you aren’t providing a plausible theory for why these pitchers would be less efficient than others—more of a jerry-rigged theory that apparently fits your data.

Guy
11 years ago

Studes: You asked earlier why I’m so skeptical. I think this research fails, ironically, to apply a Bayesian perspective. What is our prior for MLB pitchers, after years of trial-and-error experience with pitch selection, making systematic errors? (Not an individual pitcher, which surely happens, but pitchers in aggregate.) It has to be pretty low. For a theory that offspeed specialists rely too much on their specialty pitch, we might say “that seems possible,” let’s say p=.10. For the theory that specialists use their best pitch too rarely (even though it’s what made them successful and rich), we would be less generous, maybe p=.02? But now what is our prior for Matt’s theory, that specialists use their best pitches too much in some counts, but too little in other counts (but about the right amount overall)? I would say maybe p=.001. In other words, we need a LOT more evidence before we should believe this.
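
To put rough numbers on “a LOT more evidence,” here is a minimal sketch of the Bayesian update using the three priors named above (.10, .02, .001). Everything else is an assumption for illustration: the function names are hypothetical, and the strength of evidence is expressed as a Bayes factor (how much more likely the observed data are if the theory is true than if it is false), which nobody in this thread has actually estimated.

```python
def posterior(prior, bayes_factor):
    """Posterior probability after evidence with the given Bayes factor."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1.0 + post_odds)


def bayes_factor_needed(prior, target=0.5):
    """Bayes factor required to move `prior` up to the `target` posterior."""
    prior_odds = prior / (1.0 - prior)
    target_odds = target / (1.0 - target)
    return target_odds / prior_odds


# Priors taken from the comment above; labels are shorthand for the three theories.
priors = {
    "specialty pitch used too much": 0.10,
    "specialty pitch used too little": 0.02,
    "too much in some counts, too little in others": 0.001,
}

for label, p in priors.items():
    needed = bayes_factor_needed(p)
    print(f"{label}: prior={p}, Bayes factor needed for 50% = {needed:.0f}")
```

Under these priors, evidence strong enough to make the first theory a coin flip would still leave the third far below even odds, which is the sense in which the bar is higher for the more complicated theory.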

Matt Swartz
11 years ago

Rest assured I’ve factored in tons of things, but there will always be more to unravel. There is no “final study” on something as fundamental and evolutionary as pitch selection.

If anybody still has any of Guy’s concerns, I’m happy to engage. Feel free to email me (my gmail is in the last slide) or tweet at me, etc. This exchange is going nowhere and I’m not sure if anybody is benefiting from this.