Does it pay to play the match-ups with your SPs?

Jamie Moyer may be well past his prime and an afterthought in most leagues, but he can make for a great start in certain situations.(Icon/SMI)

Over the past week or so, there has been a lot of talk around the fantasy baseball world about whether or not it makes sense to play the match-ups with our starting pitchers. Does it do more harm than good? Is micromanaging worth the effort? Does it pay to sit a decent starter against the Yankees and play a poor one against the Astros? While the answer to this last question may seem like an obvious “yes” to some, others don’t seem to be convinced.

Here’s an excerpt from a recent Tim Dierkes post at RotoAuthority:

I have waffled over the years as to whether it makes sense to bench your starting pitchers occasionally if they’re facing tough offenses. I always seem to guess wrong. Tom Gorzelanny against the Pirates, that’s a must-start. But Gorzelanny in Citizens Bank against the best offense in the NL, especially against lefties—I’ll sit him. The result: I’ve danced around Gorzelanny’s best starts.

The philosophy I hope to abide by: If he’s good enough to be a permanent part of your roster, he should be active for all starts.

In his most recent newsletter, Baseball HQ’s Ron Shandler said this:

As much as we hate to admit it, doing match-up analyses has about the same rate of accuracy as tossing spaghetti. No matter what you do, you are going to have to weather occasional meltdowns and sterling performances from the bench.

Maybe the answer is simple. When it comes to aggravation in this game, we just have to live with it.

On the other side of the coin is RotoWire’s Chris Liss, who benched Ricky Romero in our CardRunners AL league this week due to match-ups:

Eric [Kesselman (co-commissioner of the league)] thought this was a curious decision and asked me to write about it. The short answer is that it was a gut call.

The longer answer is that [it was] based on the matchups…

Despite the fact that Chris and I landed on opposite sides of the Quants vs. Intuition debate, I’m on his side here. Read on.

The study

Tim Dierkes conducted a mini-study shortly after the above-quoted post and concluded that “if you are able to identify the ‘marginal’ starters correctly, as well as the offenses that will be the best all year, there is a small gain to be had over the long run. Season to season, with probably no more than two marginal guys on your regular roster, you’d probably have a lot of years where you wished you hadn’t benched any starters.”

I’m going to run a study that goes a little more in-depth and comes to a different conclusion. The main question I wanted to answer was “Do pitchers perform better against poor offenses and worse against good offenses?”

My study took data from 2004 to 2009 and compared how pitchers performed against good offenses and bad offenses. For the purposes of this study, “good offenses” are defined as the top four teams in year-end runs scored in each league (AL and NL). “Bad offenses” are the four lowest-scoring teams in each league.*

I then looked at all pitchers who faced at least one good offense and one bad offense and compared their starts against these teams in our four standard roto categories (W, ERA, WHIP, and K), weighting each pitcher by the lesser of his starts vs. good offenses and his starts vs. bad offenses.

*There are certainly problems here, as year-end stats don’t perfectly reflect our in-season opinions about teams, but I think it will be close enough to let us examine this match-up dilemma. Last year, for example, saw the Yankees, Angels, Red Sox, Twins, Phillies, Rockies, Brewers, and Dodgers make this list. Most of these are teams that were expected to be pretty darn good offensively.
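To make the weighting concrete, here is a minimal sketch of how such a comparison could be computed for one category (ERA). It is an illustration under stated assumptions, not the code behind the study: the `Split` class, the `weighted_era_gap` function, and the sample numbers are all hypothetical, and the only detail taken from the description above is that each pitcher is weighted by the lesser of his two start counts.

```python
from dataclasses import dataclass


@dataclass
class Split:
    """One pitcher's combined line against one group of offenses (hypothetical structure)."""
    starts: int
    innings: float      # true decimal innings (6 2/3 innings is 6.667, not box-score 6.2)
    earned_runs: int

    @property
    def era(self) -> float:
        return 9.0 * self.earned_runs / self.innings


def weighted_era_gap(pitchers):
    """Average (ERA vs. bad offenses - ERA vs. good offenses) across pitchers.

    `pitchers` is a list of (vs_good, vs_bad) Split pairs. Each pitcher is
    weighted by the lesser of his starts vs. good and starts vs. bad offenses.
    """
    weighted_sum = 0.0
    total_weight = 0
    for vs_good, vs_bad in pitchers:
        weight = min(vs_good.starts, vs_bad.starts)
        # Skip pitchers who lack a start on one side or recorded no outs.
        if weight == 0 or vs_good.innings == 0 or vs_bad.innings == 0:
            continue
        weighted_sum += weight * (vs_bad.era - vs_good.era)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no pitcher faced both a good and a bad offense")
    return weighted_sum / total_weight


# Made-up sample data, for illustration only.
sample = [
    (Split(starts=4, innings=24.0, earned_runs=12), Split(starts=2, innings=12.0, earned_runs=4)),
    (Split(starts=1, innings=6.0, earned_runs=4), Split(starts=3, innings=18.0, earned_runs=10)),
]
gap = weighted_era_gap(sample)
print(f"{gap:+.2f}")
```

A negative result means pitchers posted a lower ERA against the bad offenses. The same pattern would apply to WHIP, strikeouts, or wins per start by swapping the per-split statistic.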


Results

The results of the study are below, showing the advantage of facing a bad offense over a good one.

+--------+---------+--------+--------+--------+
| IP     | Win/GS  | ERA    | WHIP   | K      |
+--------+---------+--------+--------+--------+
| + 0.34 | + 9.66% | - 1.10 | - 0.17 | + 1.12 |
+--------+---------+--------+--------+--------+

As you can see, there is a significant advantage to facing a poor offense over a good one. All else equal, if your starter gets to face the Astros instead of the Phillies, he’ll stay in the game for an extra out, strike out an extra batter, win an extra game every 10 starts, and have an ERA a full run lower. That’s a highly significant difference. It means that if you’re starting Ross Ohlendorf against the Indians, you might as well be starting Cole Hamels against an average opponent.

Now, of course, we must consider that this study knows who the good and bad offenses will end up being in any given year. In June, we don’t know with that kind of certainty who the best and worst offenses are. While we might not gain that full run ERA difference by playing match-ups (or streaming), I do think we’ll be close. After all, there’s a very high probability that teams like the Yankees, Red Sox and Rays are actually good offenses and the Pirates, Astros and Orioles are actually poor offenses. We just need to be selective. And the deeper we go into the season, the more certain we can be about playing match-ups.

Finally, we must realize that this study deals with the extremes. We’re not always going to be faced with the decision of Ian Snell against the Rays or Joe Saunders against the Mariners (where we’d obviously take Saunders). Mediocre teams will be in the mix, and decisions will be made a little tougher. The important thing to remember is that everything should be taken within proper context and all situations analyzed individually.

Wrapping up

There are, of course, other things to consider when deciding whether to insert a pitcher into your active lineup (ballpark, weather, home/away, opposing pitcher, etc), but there should no longer be any question whether there’s an advantage to playing match-ups. There is. And honestly, isn’t that the logical answer? Shouldn’t we expect that pitchers perform better against poor teams?

Sure, occasionally you’ll end up with Dallas Braden perfect-gaming the Rays or Brett Anderson giving up six runs to the Orioles. That’s the nature of small sample sizes. It’s no different than Albert Pujols going a week or two without a home run. And if that happens, we’re not suddenly going to declare chasing power a fool’s errand, are we? As with all small samples, extreme random variation is a possibility, but in the long run, things even out. In the long run, you’re far better off playing the match-ups. They won’t all work out as expected, but when you add them all up at the end of the year, you’ll come out ahead.

Concluding thoughts

My main point can be summed up very easily: play the match-ups! While some may be convinced that it’s a crap shoot, it’s not. In the movie Rounders, Matt Damon’s character muses about a similar phenomenon:

In Confessions of a Winning Poker Player, Jack King said, “Few players recall big pots they have won, strange as it seems, but every player can remember with remarkable accuracy the outstanding tough beats of his career.” It seems true to me, cause walking in here, I can hardly remember how I built my bankroll, but I can’t stop thinking of how I lost it.

It’s easy to recall the time a spot start blew up in your face, but the marginally good match-ups are easily forgotten. They all count, though, and in the long run, playing the odds is the way to go.

Finally, I ask that you please not comment to tell me that it’s obvious that a pitcher does better facing a poor offense. To me (and likely to many others), it is obvious, but when other analysts—particularly ones as well-known as Ron Shandler—are doubting it, I thought it best to put some concrete numbers out there.


19 Comments
Oscar
13 years ago

What are your thoughts on the following scenario:

Josh Johnson @ Philly tomorrow vs Halladay

Of course, Johnson is a beast but is it worth the risk to start him against a good offense when the win probability is low?  I started him against Philly his last outing since Philly was slumping and it was in Florida but even though he only gave up the unearned run it was a loss thanks to the perfect game.  Do you think it’s foolish to even consider benching him at Philly vs Halladay?  Even with his last great outing against Philly, Johnson is a career 3.77 ERA 1.44 WHIP against them.

JB (the original)
13 years ago

You don’t mention it, so can I assume that your league does not use Quality Starts as a category?  If your league did, how would that alter (if at all) your methodology?  Would you still play the same match-ups, maybe play more great pitcher against good teams?  Only go with high “K” guys in more marginal match-up choices to at least better the odds in other categories?

buck turgidson
13 years ago

I play the matchups when it comes to borderline starters.  But elite pitchers need to be in the lineup every start.  Even if they have been faltering.

D.Diaz
13 years ago

Love the site, loooove this topic.

I totally fall into and advocate for the match-up camp.  If you are truly on top of all relevant stats and related factors, you can have a fairly good success ratio with matchups.  Three years ago I took over an expansion team in a loooong running keeper league.  Team had absolutely zero pitching, and good hitting.  I spot started my way to top three ERAs and WHIPs and made it to the finals (lost – dammit).  It was without a doubt the most enjoyable fantasy baseball experience I’ve ever had, more so than when I’ve won leagues.

To add a bit to the above, a strategy I utilized is to limit the spot starts in the early part of the season, and try not to focus too much on worst hitting team statistics during April/May.  You don’t really know who is truly going to be the worst hitting teams and match ups during the early going.  I would focus more so on pitcher/ballpark/home-away factors.  After a month or so, then you can make some solid guesses on the team hitting factors. 

You can really stat crunch if you are willing to put forth the time, but either way can have decent success just focusing on the main components of poor hitting ballclubs/home-road/ballpark/pitcher. 

Based on the above experience three years ago, I have completely adapted my draft strategy to reflect this, in that I will draft extremely heavy on hitting, draft 4 excellent level SPs, then work the last one or two starting pitching spots (dependent on the league) on a per week basis, based on matchups.  I’ve had nothing but success and positive results.

With all the above said, this all requires time.  You have to love this stuff and spend the time to do the research, if you want to have success.

Chris
13 years ago

I sat Masterson vs the Yankees figuring that the Yankees are lefty heavy (with lots of switch hitters) and in New York with that short porch in RF Masterson would be forced to commit ritual suicide after the game. He pitched 6.2 IP, 7 H, 1 BB, 3 ER, 8 K’s and got the ND. A 4.05 ERA isn’t stellar, but man I could have used those K’s and that 1.20 WHIP. Even now, given his splits, I just can’t seem to wrap my head around the fact that he was effective and got a QS vs the Yankees in New York.

To add further insult, I started him vs the White Sox where he went 4.0 IP 9 H, 2 BB, 5 ER, 0 K’s.

I now resolve to deal with the bad starts to get the occasional gem that I can’t seem to predict from him. The rest of my rotation with Verlander, Lester, Buchholz, Gio Gonzalez and Medlen should hopefully help ease the suffering when he gets lit up.

Alex
13 years ago

I’m totally in the ‘play the matchups’ camp, but there’s something in your results that doesn’t pass the ‘smell test’.  If our guys are getting an extra 3 innings pitched against a worse offense, how the heck are they only striking out one additional batter?  Even aside from that, the extra 2.8 innings sounds fishy.  I doubt the average start averages less than 5 innings pitched against the good offenses.  That would mean that they’re averaging almost 8 innings pitched against the bad offenses.  That sounds implausible.  I suspect something is wrong with that IP number…maybe it should be a percentage, or maybe it’s just plain wrong.

Klatz
13 years ago

You need to put in the confidence intervals.  Even if on average you get an advantage, if there’s so much variation, it’s not going to matter with the sample size you’re talking about.  In a single season, it’s going to be about 10-15 starts that are against poor offenses.  Even then you might limit the benching to the poorest of the poor.

Derek Carty
13 years ago

Don’t I feel silly.  Thanks, Alex, for pointing that out.  I forgot to use a denominator for IP.  It should be roughly 1/3 of an inning extra for starts against bad opponents.  Everything else stays the same.  It’s been fixed in the article.  I originally assumed that maybe you only got 4 IP vs good opponents and 7 IP vs bad opponents or something like that.

Klatz,
I’ll look into getting some confidence intervals later tonight.  I’m not sure we’re dealing with just 10-15 starts, though.  If you designate 6 spots on your team for SPs and get 32 starts out of each, you’re looking at over 50 total starts against poor opponents and over 50 total starts against good opponents.  Still, I will put together the confidence intervals because I should have them.

Derek Carty
13 years ago

Oscar,
Unless you’re in a really shallow league, I’d play him.

JB (the original),
Quality Start per Game Started percentage would get a + 10.8% on the chart in the article.

Chris,
That kind of thinking is exactly what I was trying to discourage with this article.  Sure, over a two start sample you can see weird results, but if you are making 100 start/sit calls in a year, in the long run you’re going to be better off making them than simply riding it out.

phil
13 years ago

i sat cliff lee in texas last nite based on historical numbers, and that one bit me in the behind…I guess play all your top tier guys in any situation, and spot start everyone else, that’s my strategery

Toffer
13 years ago

I think a simplified log5 calculation would work decently to figure out whether to start or sit a player. If you divide an opponent’s wOBA by league average and then multiply by your starter’s ERA I think you should get a rough approximation of the starter’s expected ERA.

For example, James Shields’ ZiPS RoS projection is 3.95, the Yankees current wOBA is .359 (ideally we would use a projected wOBA rather than season to date but this will do)  and league average is ~.333. (359/333)*3.95 = 4.26. I haven’t done all of the math but I would be surprised if this approximation is too far off real results. Once you have this projection now you just need to decide if you like a 4.26 ERA.

You should be able to do the same thing to figure out K/9, WHIP, opponent’s expected runs and thus W% using Pythagorean record, etc.
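The approximation above is easy to check in a few lines. This is only a sketch of the ratio scaling as the comment describes it (a linear scaling by relative wOBA, which is simpler than a full log5 calculation); the function name `matchup_adjusted_era` is hypothetical, and the inputs are the figures quoted in the comment.

```python
def matchup_adjusted_era(projected_era, opponent_woba, league_woba):
    """Scale a pitcher's projected ERA by the opponent's wOBA relative to league average.

    This is the rough ratio approximation from the comment above, not a
    validated projection method.
    """
    if league_woba <= 0:
        raise ValueError("league_woba must be positive")
    return projected_era * opponent_woba / league_woba


# Figures quoted in the comment: Shields' projected ERA, the Yankees' wOBA,
# and the approximate league-average wOBA.
shields_vs_yankees = matchup_adjusted_era(3.95, 0.359, 0.333)
print(f"{shields_vs_yankees:.2f}")  # 4.26
```

The start/sit call then comes down to comparing the adjusted figure with whatever ERA threshold makes a pitcher worth a lineup spot in your league.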

RotoScoop.com
13 years ago

A real quick and dirty way to help make these decisions is to simply use the Vegas odds on a given game (it’s also helpful to look at the O/U too). But yeah, Dierkes’ stance on this sort of baffled me when I read it. Nice to see some real data to back up my innate thoughts (and actions).

3FingersBrown
13 years ago

Great article Derek. I play matchups more often in h2h rather than roto, since the week’s scoring situation often dictates my strategy. I have played matchups more in my roto league this season however, perhaps because I feel I’ve gotten better at identifying ‘good’ and ‘bad’ starts.

Like others have said, I usually will throw my top starters against just about anyone, but keep my last two pitching spots for streaming spot starters based on matchups.

I picked up Hammel this week in hopes of getting a win against Houston but figuring I’d probably bench him at Toronto. Worked out well so far, but as you said we tend to remember the ones that don’t – Takahashi @ SD, sat Pelfrey at home against Yanks. I don’t just own Mets pitchers, those are just the ones I remember off the top of my head.

@RotoScoop: Good point about the Vegas odds.

Eric Kesselman
13 years ago

For the record, my point with Chris Liss’s benching of Romero wasn’t that it was wrong to play match ups, but that it surprised me that he didn’t heavily shop Romero around before doing so. I don’t like to see value wasted on the bench.

Also, the Cr league uses two week long periods largely to discourage excessive decision making based on match ups.

Derek Ambrosino
13 years ago

Nice piece, Derek.

I assume handedness could be a relevant consideration too. For example, Philly may be a worse match-up for a B-level righty than a C-level lefty.

Derek Carty
13 years ago

Alright, Klatz, I’ve got some intervals here, based on this methodology: http://en.wikipedia.org/wiki/Weighted_mean#Weighted_sample_variance

+------+-------+---------+-------------------------+
| Stat | StDev | Mean    | 95% Confidence Interval |
+------+-------+---------+-------------------------+
| IP   | 0.12  | + 0.34  | + 0.23 to + 0.46        |
| W%   | 3.7%  | + 9.66% | + 5.94% to + 13.38%     |
| ERA  | 0.61  | - 1.10  | - 1.71 to - 0.49        |
| WHIP | 0.08  | - 0.17  | - 0.24 to - 0.09        |
| K    | 2.19  | + 1.12  | - 1.07 to + 3.31        |
| QS%  | 3.8%  | + 10.8% | + 6.99% to + 14.63%     |
+------+-------+---------+-------------------------+
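For readers who want to reproduce this kind of summary, here is a minimal sketch of a weighted mean and a weighted sample variance. It assumes the weights are treated as frequency counts (each pitcher's difference counted once per weighted start), which is one of the variants described on the linked page; whether that is the exact variant used for the table is an assumption, and the function names and sample values are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for x, w in zip(values, weights)) / total


def weighted_sample_variance(values, weights):
    """Weighted sample variance, treating the weights as frequency counts.

    Uses the denominator sum(w) - 1, the frequency-weight analogue of the
    usual n - 1 correction.
    """
    total = sum(weights)
    if total <= 1:
        raise ValueError("weights must sum to more than 1")
    mean = weighted_mean(values, weights)
    return sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / (total - 1)


# Made-up per-pitcher ERA differences (bad minus good) and their weights.
diffs = [-1.5, -1.0, 0.5]
weights = [2, 1, 1]

mean = weighted_mean(diffs, weights)
stdev = math.sqrt(weighted_sample_variance(diffs, weights))
print(f"mean {mean:+.3f}, stdev {stdev:.3f}")
```

With integer weights this gives the same answer as expanding each value into `w` repeated observations and taking the ordinary sample variance.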

Ron Shandler
13 years ago

Derek… Interesting analysis, but I have a few comments. First, could you validate the following two assessments and comment on them?

1. According to your study, 47% of the time, our starters will be facing teams that are not among the top 4 or bottom 4 in offense. So nearly half the time, the percentage play for playing matchups would be far more dubious.

2. At the time when we make these match-up decisions (particularly early in the season, but really any time works), we do not know which teams are among the top or bottom offensively, and those rankings may be fluid all season long.

FWIW, I would never advocate sitting Ubaldo Jimenez, even against a top offense like the Phils (oh, wait…) or playing a struggling Wandy Rodriguez against Texas (he says with his fingers crossed for this week), so it’s always a matter of context. But the vast middle ground is where we all play most of our games.

And just so you know, I do occasionally use hyperbole to make a point. Reason being, I see so many people obsess over decisions like this and curse the gods (and us analysts) when a slam-dunk matchup fails. But in a season when Carlos Silva and Fausto Carmona emerge as fantasy forces, I think it is important to keep the level of control we think we have in proper perspective.

Be well,
Ron

Derek Carty
13 years ago

Thanks for commenting, Ron.  You’re absolutely right on both points.

1) I made note of this in the final paragraph of the “Results” section:

“Finally, we must realize that this study deals with the extremes. We’re not always going to be faced with the decision of Ian Snell against the Rays or Joe Saunders against the Mariners (where we’d obviously take Saunders). Mediocre teams will be in the mix, and decisions will be made a little tougher. The important thing to remember is that everything should be taken within proper context and all situations analyzed individually.”

In many cases, playing the matchups will only provide a marginal edge at best.  But in competitive leagues where every little bit counts, I think it’s worth taking into consideration.

As an example, let’s say we’re in a 12-team mixed league and we have a guy like Dallas Braden.  Let’s assume we see him as a 4.15 ERA guy the rest of the way and next week he’s facing a team like the Twins – not an elite offense but an above average one.  I don’t know exactly how much facing the Twins would raise his ERA, but let’s say it’s 0.15 points.

In a 12-team mixed league, Braden is ownable but not someone you’d love to have, and pushing his ERA up to 4.30 makes him worth sitting.  And then when he plays the A’s the next week, maybe we’ll expect a 4.00 ERA and play him.  I think making a lot of decisions like these over the course of a season – even if they aren’t huge on their own – adds up.  My opinion.  Of course, this can be confounded a bit by your second point…

2) Absolutely right here too, Ron.  I made note of this also in the second to last paragraph of the “Results” section:

“Now, of course, we must consider that this study knows who the good and bad offenses will end up being in any given year. In June, we don’t know with that kind of certainty who the best and worst offenses are. While we might not gain that full run ERA difference by playing match-ups (or streaming), I do think we’ll be close. After all, there’s a very high probability that teams like the Yankees, Red Sox and Rays are actually good offenses and the Pirates, Astros and Orioles are actually poor offenses. We just need to be selective. And the deeper we go into the season, the more certain we can be about playing match-ups.”

Early in the season I might only seek to play the matchups against teams that we can be relatively certain are good/bad – the Yankees, Red Sox, Rays, Astros, Pirates, Giants, Orioles, etc.  But as the season goes on, we can be more certain about playing matchups, especially once we get into the second half of the year.

It’s also important to note that this study doesn’t presume to know with absolute certainty who the best teams truly are.  It doesn’t use projections or true talent levels or anything like that (which may cause us to observe an increase in the effect), simply end-of-year numbers.  So everything that actually happens directly contributes to those end of season numbers, and the closer we get to the end of the season, the more certain we can be about teams we might not be completely certain about (maybe the Reds or Braves or Phillies).  In the mean-time, it would also be useful to take in-season projections into consideration.

Absolutely agree that context is everything (“The important thing to remember is that everything should be taken within proper context and all situations analyzed individually.”).  Also agree that the middle ground is where we play most of our games.  But when things are hazy, I think this can be a useful tool to help us make sense of things and gain an edge.  In some instances it will be moot, but in the instances where it isn’t, it should be utilized.

Thanks again for commenting, Ron, and it’s good to hear where you’re coming from and your approach to writing (in regard to hyperbole and sure-things backfiring).  I agree that we analysts can find ourselves being blamed for what comes down to the vagaries of chance.  I think it’s important that readers understand that chance does play a large role in this game and that the best we can do is make informed decisions with the information we have, to play the percentages.  Over the long run, playing matchups is one way in which we can play the percentages to gain an edge.  Sometimes we’ll have a sure-thing backfire while what should have been a disaster turns out to be a gem, but we must take solace in knowing that we made the right decision with the information available and that things will even out when all is said and done.

Finally, I know I’ve commented on things you’ve written twice in the past couple weeks.  I don’t mean this as a shot at you in any way, and I hope it hasn’t been taken that way.  I have the utmost respect for you, Ron, and for what you’ve done for the fantasy industry.  It’s that I don’t like to use strawman arguments, making it necessary to quote someone.  In this instance, it was actually Tim Dierkes’s article that got me thinking about running a study, and then you commented on the same topic not long after.

Jason B
13 years ago

Derek—

I’ve been trying to play the same matchup game with my OF/util slots in a mixed league; we only play 3 OF and 1 util, so I’ve kept Ethier and Justin Upton in two of the OF slots and have been rotating Jay Bruce, Brad Hawpe, Garrett Jones, and/or Adam LaRoche among the other OF and UT slot.  Keeping a careful eye on home/road and L/R splits in trying to set those slots, as well as the quality of the opposing pitcher.

After a month of toying and tinkering, it’s been painfully disastrous and is giving me (even more) gray hair before my time.  On the plus side, my bench leads the known universe in HR/RBI/OPS.  So I’ve got that going for me, which is nice.