Compatibility matching

The baseball statistics available freely online have advanced greatly over the past few years. As fantasy players, it is important to keep up with the research going on in the “real” aspect of baseball and continually try to apply it to fantasy baseball.

One of the leaders of the movement, FanGraphs, now has Pitch Type Linear Weights that show which pitches individual pitchers are better at throwing and which pitches individual batters are better at hitting. (Dave Allen wrote the accompanying explanation article.) Taking a quick glance at Mark Teixeira’s FanGraphs player page shows the following information:

+-------------------------------------------------------------------------------------+
| Mark Teixeira Pitch Type Values                                                     |
+--------+----------------+------+------+------+-----++-------+-------+-------+-------+
| Season | Team           | wFB  | wSL  | wCB  | wCH || wFB/C | wSL/C | wCB/C | wCH/C |
+--------+----------------+------+------+------+-----++-------+-------+-------+-------+
| 2006   | Rangers        | 20.8 | -0.7 | 0.8  | 5.6 || 1.24  | -0.36 | 0.32  | 1.05  |
| 2007   | Rangers/Braves | 31   | 4.5  | 0.9  | 0.4 || 2.43  | 2.36  | 0.48  | 0.1   |
| 2008   | Braves/Angels  | 39   | 1.3  | 3.1  | 5.3 || 2.46  | 0.43  | 1.54  | 1.49  |
| 2009   | Yankees        | 24.3 | -1.5 | -1.1 | 0   || 2.5   | -1.02 | -0.63 | -0.01 |
+--------+----------------+------+------+------+-----++-------+-------+-------+-------+

The numbers on the left of the divide show in total how many runs Teixeira has earned hitting fastballs, which for 2009 has been 24.3 runs. On the right side, where the column headers have a “/C” after them, the numbers show how many runs a player earns on a certain pitch per 100 pitches. For Mark Teixeira in 2009, that number is 2.5 runs per 100 fastballs.
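To make the relationship between the two halves of the table concrete, here is a minimal Python sketch. It assumes the "/C" rate is simply the season total divided by the number of pitches of that type seen, times 100. The function names are my own for illustration, and because the published figures are rounded, the pitch count it backs out is only approximate.

```python
def runs_per_100(total_runs: float, pitches_seen: float) -> float:
    """Runs above average per 100 pitches of one type (the "/C" columns)."""
    return 100.0 * total_runs / pitches_seen


def implied_pitch_count(total_runs: float, rate_per_100: float) -> float:
    """Back out roughly how many pitches a total/rate pair implies.

    Assumes rate = 100 * total / pitches; rounding in the published
    numbers makes the result approximate.
    """
    return 100.0 * total_runs / rate_per_100


# Teixeira, 2009 fastballs, from the table above: wFB = 24.3, wFB/C = 2.5.
fastballs_2009 = implied_pitch_count(24.3, 2.5)
print(round(fastballs_2009))  # roughly how many fastballs those numbers imply
```

In other words, a total of 24.3 runs at 2.5 runs per 100 fastballs implies Teixeira has seen somewhere in the neighborhood of 970 fastballs so far in 2009.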

Casey Blake connecting for a home run on what was most likely either a fastball or change-up. (Icon/SMI)

For these stats I believe 0 is average, positive means a player is good at hitting that pitch, and negative means the batter struggles with the pitch. Messing around with the leaderboards will help give you a relative context of how good a 2.5 wFB/C is.

Although extremely interesting, keep in mind the numbers are not perfect. The pitch classifications FanGraphs uses are not manually adjusted, and to my knowledge no tests have been done on the stability of the pitch value numbers. The methodology behind the values is also still somewhat of a work in progress, but they can nevertheless be used for fantasy purposes.

Most simply, when deciding which of two batters to start, you can look beyond the skill and handedness of the opposing pitchers and also check which pitches those pitchers throw well and which pitches your batters hit well.

For example, let’s say you own Cody Ross, who has pretty consistently hit change-ups well throughout his career. Some nights he starts for your team and other nights he sits one out. Let’s say tonight the Marlins are playing the Astros and Wandy Rodriguez is pitching. Taking a look at Rodriguez’s Pitch Value numbers, he has historically had a below-average change-up and still throws it fairly often, about 10 percent of the time.

Tonight, then, would be a time to make sure Ross is in your starting lineup because of the increased possibility of him pounding one of Wandy’s change-ups out of the yard, or at least what FanGraphs—as provided by BIS—is classifying as a change-up.
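As a rough illustration of that check, here is a small Python sketch. It flags a pitch type when the batter's per-100 value is above average, the pitcher's per-100 value is below average, and the pitcher throws it often enough to matter. The function name, the 5 percent usage cutoff, and every number in the example except the 10 percent change-up usage are placeholders I made up, not actual values for Ross or Rodriguez.

```python
def favorable_pitches(batter_rates, pitcher_rates, pitcher_usage, min_usage=0.05):
    """Return pitch types the batter hits well and the pitcher throws poorly.

    batter_rates / pitcher_rates: runs per 100 pitches, where 0 is average
    and positive is good for that player.
    pitcher_usage: share of the pitcher's pitches of each type (0 to 1).
    min_usage: hypothetical cutoff below which a pitch is ignored.
    """
    flagged = []
    for pitch, usage in pitcher_usage.items():
        batter = batter_rates.get(pitch)
        pitcher = pitcher_rates.get(pitch)
        if batter is None or pitcher is None:
            continue  # no data for this pitch type on one side
        if batter > 0 and pitcher < 0 and usage >= min_usage:
            flagged.append(pitch)
    return sorted(flagged)


# Placeholder values for illustration only, not real FanGraphs numbers.
batter = {"FB": -0.2, "CH": 1.5}
pitcher = {"FB": 0.4, "CH": -0.8, "CB": 1.2}
usage = {"FB": 0.60, "CH": 0.10, "CB": 0.30}

print(favorable_pitches(batter, pitcher, usage))
```

A check like this only says a favorable pitch exists. It says nothing about how much extra production to expect from the matchup.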

In terms of importance, I would rank this below matching up handedness and skill of the opposing pitcher, simply because I do not really know how effective mixing and matching batters to pitchers by individual pitch is. Unfortunately, I have not yet invested the time to find out, so for today that question will be left unanswered.

Instead, I’ll leave you with what is possibly a new idea, and if it’s not new, feel free to tell me in the comments how you have been using it.


11 Comments
Paul Singman
14 years ago

Yeah, I tried to stress in the article not to weigh the matchups too heavily. This article just scratched the surface, and I would like to look into exactly how much better you can expect a fastball hitter to hit off a pitcher with a worse fastball compared to one with a better one.

Troy Patterson
14 years ago

I would worry about using this too much, but it is a good idea. How many teams might use data like this and, as part of the scouting report, tell their pitchers not to use certain pitches as much versus certain hitters?

Jeremy Boyd
14 years ago

It’s possible to do this with pitchers too, by looking at the pitch type linear weights on FanGraphs for the opposing team overall (batters), and comparing them to the pitch type linear weights for your pitcher.

What I’m trying to figure out is how to weight (a) pitch type linear weights, relative to (b) park factors, (c) pitcher skill, (d) opposing team skill.

A good case study is last night.  My league has two SP spots and I had to choose two of J. Vazquez, Y. Gallardo, M. Garza, and J. Santana. 

Vazquez was a no-brainer pitching against San Diego at Petco.  I had a hard time choosing between the remaining three, and unfortunately weighted park factors too high and went with Gallardo vs. LAD at Dodger Stadium over Garza vs. Boston at the Trop or Santana vs. St. Louis at Citi.

Linear weights would have selected Garza, who has a dominant slider and curve ball, and Boston hitters have not fared too well against sliders and curve balls, particularly in the past month.  Gallardo did not match up well against LAD in linear weights.  In retrospect, Vazquez and Garza were the best choices, and Gallardo was the worst choice, as linear weight matching would have predicted.

Troy Patterson
14 years ago

Jeremy,

I think you are looking at much too small a sample size. One month of pitch values is not enough to judge a team. Park factors and better pitching should always come first. As long as Santana has a K/BB over 3 and is pitching in the pitcher’s park of Citi Field, he is the best choice by far.

You can say I was wrong since he gave up 5 runs and 2 homers, but that’s the danger of trying to project single game results.  At this stage there is no way you can’t pick Vazquez and Santana every time they throw.

Jeremy Boyd
14 years ago

Troy,

Good point about K/BB, but the evidence shows Citi is a neutral park, not a pitcher’s park. 

K/BB selects Santana over Garza or Gallardo, but Park Factors selects Dodger Stadium, a pitcher’s park, over Tropicana Field and Citi Field, which are both neutral parks.

Which brings me back to my original question about which factors to weight highest.  Using K/BB and Park Factors, I eliminate Garza, but I still can’t decide between Santana and Gallardo based on K/BB and Park Factors alone, unless I weight one higher than the other.

digglahhh
14 years ago

Perhaps this is an example of paralysis by analysis, or of people overthinking things. These are very real dangers of the virtually infinite tool kit we have at our fingertips nowadays. We must always remember that we have to use these tools for our benefit and not to simply create new, more esoteric, dilemmas. (Don’t mean to sound like Joe Morgan here, but sometimes I draw the line.)

Quite simply, when you draft or buy Johan Santana you are paying a premium. Part of the reason for that cost is that you are going to start him every start. You’re paying top dollar for each time he pitches, so get your money’s worth.

Without knowing too much about your league, if you can only start 2 SPs, perhaps you shouldn’t invest as highly in your pitching staff. You’re going to consistently leave quality innings on the table because you’re not going to get all the production you pay for, as your pitchers won’t make all of their starts due to scheduling issues. So, in essence, you are paying for 35 starts from Matt Garza, but maybe only using 29 of them. That league structure probably tends toward rewarding the “stars and scrubs” approach.

As a more general comment, I strongly oppose SP and RP distinctions in leagues. These are artificial distinctions. There are no restrictions as to how a major league team must construct its pitching staff, or use its pitchers, so why should there be in fantasy? (A team can throw nine guys one inning each if it chooses.) There’s no such position as “starting pitcher,” it’s just pitcher. I mean, it would be rather preposterous to cap the number of lead-off or clean-up hitters your offense starts, so why do so with pitching?

digglahhh
14 years ago

Sorry for the horrific grammar/typos in the post above – sometimes a casualty of posting from work…

Jeremy Boyd
14 years ago

Thanks digglahhh,

You and Troy both made good points, IMO. I’m a statistical programmer by trade, so my interest in this is partially motivated by trying to develop a statistical model to make these decisions for me.

Because you are right, I did hit the point of analysis paralysis regarding which of the 4 SP to start yesterday.  Statistical models are great panaceas for analysis paralysis, because of their ability to objectively sift through a lot of information.  And I oppose SP RP distinctions too.

On a different note, I don’t want to focus on price, investment, premiums, etc., exclusively, or set my lineup by them, because considering investment without also considering returns could lead down a wrong path. True, I did pay a premium for Santana, but that said, my ROI for Santana is lower than for Gallardo, who I picked up as a FA, or Garza, who I acquired as a throw-in with Beckett in a trade.

Thanks again,

Paul Singman
14 years ago

I’ll jump in here, seeing as it is my article.

Jeremy, picking between players of that caliber should not happen often. I envisioned it being used more for fringe-ish players like Cody Ross (who is getting hot right now, btw).

As I say in the article, I would weigh these pitch type values the least because no one has really tested how much better you can expect a slider pitcher to do against a team that ranks one of the worst at hitting sliders. That’s not even mentioning the pitch classification issue with this data.

In the future, though, this could become a valuable mix-n-match tool.

digglahhh
14 years ago

Jeremy,

Yes, but once the investment is made (the roster is filled) ROI becomes meaningless. You are accruing actual production, not production per dollar, or per ADP. So, you want to put in the best guys you have. The more mid-level assets you have, the more difficult decisions you will have to make, and the greater proportion of your initial investment you likely leave on the table over the course of the season (regardless of whether that investment was above or below fair market value).

In other words, this didn’t become a tough decision because you have Johan Santana, a tier 1A pitcher, who on a given night here or there may actually (at least seem to) be comparable to a specific tier 2 (or 2B) pitcher because of respective context. These decisions arise because you have a number of mid-level asset pitchers. And, if you are sitting a guy like Garza with any regularity (or sitting Johan so that you can start Garza) then you are leaving a lot of quality production on the bench, regardless of what you “paid” for it. Also, you are cruising for frustration because nobody makes the correct decision all the time, and you will inevitably have nights where you make poor decisions and see gems pitched on your bench and clunkers in your line-up. So, my point is that, to avoid this, it may behoove you to consolidate your assets, maybe in the form of foreign currency: bats.

One of the reasons why a lot of “experts” don’t blow high picks on pitchers isn’t necessarily that pitchers aren’t worth their price tag, but that you are often better able to identify and scoop high ROI pitchers later in the draft. If you spent that pick/money on Ryan Braun instead of Johan Santana, you wouldn’t be asking yourself if you should bench him to start [fill in batter analogous to Gallardo]. Or, conversely, if you packaged Vazquez and one of the other two for Lincecum, you wouldn’t have many tough decisions to make either. It’s easier to maximize the value of marginal pitchers than it is to do so for bats.

Depth is great for versatility and to protect against injuries, but anything not in your starting line-up doesn’t count.

Again, that rule would drive me crazy. In one of my leagues, I got into this rut where it seems like my whole staff pitches on two days. Because of rainouts, injuries, etc., you can’t control that, so it would be frustrating to have to make that decision over and over. Not to mention, as I said, you’d be leaving production on your bench regularly.

Most of this discussion is academic, but I will say one more thing. As a general rule of thumb, when faced with a difficult decision I try to make the choice I’d be most comfortable failing with. That means I usually pick the better overall player. It’s easier to stomach Johan having an off night than getting burned by getting too cute and betting against him in favor of a lesser pitcher. In some cases, I can achieve peace of mind by making the sound, statistical play, but not when it comes to benching arguably the best pitcher in baseball.

Jeremy Boyd
14 years ago

digglahhh,

Thanks for the great advice.  In retrospect, Gallardo vs. the best team in baseball was a bad bet.  The pitch type linear weights appear to have corroborated this.

Garza vs. a team he’s pitched well against all year was probably a good bet. It looks like pitch type linear weight matching gave a similar answer.

I agree that Johan was the best pitcher in baseball for the first two months of the season, but I exhausted my failure comfort with Santana back in June. His earned runs per game have been highly variable since June: he pitches either shutouts or blowouts. It’s hard for me to argue that he should not have been benched on the blowout days, no matter how much his rookie card is worth. Perhaps pitch type linear weights can help me improve prediction of Johan shutouts vs. Johan blowouts, and act accordingly.

Thanks again…,