Monday, January 25, 2010
Introducing: xW, xBABIP, xLOB%, xHR/FB, and more
Posted by Derek Carty at 2:00am
Alright guys, prepare yourselves for stat overload. I'm about to introduce 12 new stats that will help us better understand pitchers. Now I know that 12 sounds like a lot, but don't worry too much — most of them are related to stats you're already familiar with and/or come in pairs. I'll explain everything as plainly as I can (while leaving enough of the guts in there for people who care), and if you have any questions, I'll be happy to answer them.
Where we're at
As it stands now, many fantasy analysts are starting to make use of stats like BABIP and HR/FB, but the analysis often goes something like, "Player A has a .315 BABIP and 15% HR/FB. He is getting unlucky and both will regress toward league average." There's nothing wrong with this — it will usually be correct — but I think we can take things a little further. After all, not all pitchers should be expected to post an exactly league average BABIP or HR/FB or LOB%.
For example, a pitcher's home park will affect his HR/FB, so a guy throwing half his games in Coors should be expected to have a higher HR/FB than a guy who plays in PETCO. We also know that ground balls become hits at a higher rate than fly balls, so groundball pitchers should be expected to have a higher BABIP than flyball pitchers. These are the kinds of things that my new stats will try to account for. So, without further ado, here they are:
12 new pitching stats
xBABIP: While we often say that a pitcher has little control over his BABIP (and this is true), he does not relinquish all control. Most importantly, we know that a pitcher has a lot of control over his groundballs and flyballs, a good amount of control over his pop-ups, and little control over his line drives. To calculate xBABIP, we first neutralize line-drive rate and adjust the other three rates accordingly (like we do to calculate xGB%). Then we assume a league average rate of hits on each type of batted ball. Add up those hits, and we can calculate an expected BABIP.
What we'll see is that extreme GB pitchers have higher xBABIPs and extreme FB pitchers have lower xBABIPs (while also realizing that guys who induce a lot of pop-ups will have low xBABIPs too). This past season, for example, GB'er Aaron Cook had a .314 xBABIP while FB'er Jered Weaver had a .291 xBABIP.
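For anyone who wants to see the moving parts, here is a rough Python sketch of the xBABIP steps described above. It is an illustration under stated assumptions, not the exact formula behind the numbers in this article: the league line-drive rate and the per-type hit rates are placeholder values made up for the example, "neutralizing" is assumed to mean setting line drives to the league rate and scaling the other three shares proportionally, and the function name `expected_babip` is hypothetical.

```python
# Placeholder league inputs: illustrative only, NOT real league data.
LEAGUE_LD_RATE = 0.19
LEAGUE_HIT_RATE = {"GB": 0.24, "FB": 0.14, "PU": 0.02, "LD": 0.72}


def expected_babip(gb, fb, pu, league_ld=LEAGUE_LD_RATE, hit_rates=LEAGUE_HIT_RATE):
    """Estimate xBABIP from a pitcher's batted-ball shares.

    gb, fb, pu are the pitcher's observed shares of ground balls, outfield
    flies, and pop-ups. His own line-drive share is ignored on purpose,
    since it is replaced by the league rate.
    """
    non_ld = gb + fb + pu
    if non_ld <= 0:
        raise ValueError("batted-ball shares must sum to a positive number")

    # Assumption: rescale GB/FB/PU proportionally so that, together with a
    # league-average line-drive share, the mix sums to 1.
    scale = (1.0 - league_ld) / non_ld
    mix = {
        "GB": gb * scale,
        "FB": fb * scale,
        "PU": pu * scale,
        "LD": league_ld,
    }

    # Apply the league-average hit rate to each batted-ball type and add up.
    return sum(share * hit_rates[kind] for kind, share in mix.items())


if __name__ == "__main__":
    groundballer = expected_babip(gb=0.55, fb=0.22, pu=0.03)
    flyballer = expected_babip(gb=0.32, fb=0.40, pu=0.10)
    print(f"groundball-heavy: {groundballer:.3f}")
    print(f"flyball-heavy:    {flyballer:.3f}")
```

With any sensible set of hit rates (ground balls falling in more often than flies, pop-ups almost never), the groundball-heavy profile comes out with the higher xBABIP, which is the pattern described above. One simplification to keep in mind: BABIP excludes home runs, so the fly-ball hit rate here is meant to cover only flies that stay in the park.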
xHR/FB: This is calculated very simply by using park factors. We assume a 50/50 home/road split for the pitcher, a neutral road schedule (HR/FB park factor of 1.00), and account for the pitcher's home ballpark's HR tendencies. It is very important to note, as I have in the past, that even if a pitcher calls an extreme HR park home, his expected HR/FB will still remain pretty close to neutral. The xHR/FB for Rockies pitchers, for example, was just 12.39 percent in 2009 (with a league average of 11.18 percent).
Analysts often like to credit deviation further from the mean than this to a pitcher's home park, but that simply is not the case (unless the pitcher has thrown a disproportionate number of games at home, and even if he has, that shouldn't be expected to continue going forward). Simply put, HR park factors are not quite as extreme as most seem to believe.
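The xHR/FB description above boils down to a weighted average of park factors. Here is a minimal Python sketch, assuming the adjustment is a simple multiplication of the league rate by the blended park factor. The function name `expected_hr_fb` and the sample home factor of 1.20 are hypothetical, chosen only to show the arithmetic.

```python
def expected_hr_fb(league_hr_fb, home_park_factor,
                   road_park_factor=1.0, home_share=0.5):
    """Park-adjusted expected HR/FB.

    Assumes a 50/50 home/road split and a neutral road schedule by default,
    as described in the text. Park factors are multipliers (1.00 = neutral).
    """
    blended_factor = (home_share * home_park_factor
                      + (1.0 - home_share) * road_park_factor)
    return league_hr_fb * blended_factor


if __name__ == "__main__":
    league = 0.1118  # 2009 league-average HR/FB quoted in the article
    # Hypothetical home park that inflates HR/FB by 20 percent.
    print(f"{expected_hr_fb(league, 1.20):.4f}")
```

Because only half of the innings are thrown at home, a 20 percent home park boost turns into a 10 percent boost overall. Read backward under this same assumption, the 12.39 percent figure for Rockies pitchers would correspond to a home factor of roughly 1.22, which shows why even an extreme park moves xHR/FB only a point or so from league average.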
xLOB%: Of the three main 'luck indicators,' LOB% has the most room for skill-based variation. This is because LOB% is actually an exponential function. To put it simply, if Pitcher A allows hits at a 24 percent rate and Pitcher B allows hits at a 30 percent rate, once men reach base, more of them will score on Pitcher B because he is more likely to give up hits to begin with. His hits will be clumped closer together. As such, LOB% has a fairly strong relationship with the rate at which batters reach base.
xLOB% is calculated using a regression formula derived from BAA and BB%. Now, of course, BAA is subject to extreme variation since it is driven largely by BABIP. So instead of using actual BAA, we use xBAA, which accounts for the pitcher's actual K rate (as with hitters, the more Ks, the fewer opportunities for hits) and his xBABIP. What we end up seeing is that good pitchers end up leaving more runners on base (Tim Lincecum: 75.6 percent) while bad pitchers let more score (Jeremy Sowers: 68.1 percent) than league average (71.9 percent).
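To make the xBAA step a little more concrete, here is one way it could be sketched in Python. This is an assumption-heavy illustration, not the formula used for the numbers above: it treats every at-bat as a strikeout, a home run, or a ball in play, ignores sacrifice flies, and brings in the pitcher's home run rate (which the description above doesn't mention) because home runs are hits that BABIP doesn't count. The regression from xBAA and BB% to xLOB% is left out entirely, since its coefficients aren't given here. The name `expected_baa` is hypothetical.

```python
def expected_baa(xbabip, k_per_ab, hr_per_ab):
    """Rough expected batting average against.

    xbabip    -- expected BABIP for the pitcher
    k_per_ab  -- strikeouts per at-bat (actual)
    hr_per_ab -- home runs per at-bat (an extra input assumed for this sketch)
    """
    if k_per_ab < 0 or hr_per_ab < 0 or k_per_ab + hr_per_ab > 1:
        raise ValueError("rates must be non-negative and sum to at most 1")

    # Share of at-bats that end with a ball in play (sacrifice flies ignored).
    in_play_share = 1.0 - k_per_ab - hr_per_ab

    # Hits on balls in play, plus home runs, per at-bat.
    return xbabip * in_play_share + hr_per_ab


if __name__ == "__main__":
    # Same xBABIP and HR rate, different strikeout rates.
    print(f"high-K pitcher: {expected_baa(0.300, 0.28, 0.03):.3f}")
    print(f"low-K pitcher:  {expected_baa(0.300, 0.15, 0.03):.3f}")
```

Holding xBABIP fixed, the pitcher with more strikeouts ends up with the lower xBAA simply because fewer balls are put in play, which is the effect the paragraph above describes.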
R/HR and xR/HR: HR/FB has become a common stat for measuring a pitcher's luck with home runs, but it doesn't tell us everything. For example, a pitcher can have a seemingly lucky 4 percent HR/FB but could actually have experienced bad luck with HRs if he was unfortunate enough to give up all of them with the bases loaded. On average, about 1.4 runs score per HR, but not all pitchers allow them at this rate (some justifiably, some as a result of luck). R/HR tells us how many runs actually scored per home run allowed while xR/HR tells us how many runs should have scored (the process for this is a little complicated, but I'd be happy to explain for anyone interested).
Home Run Runs per Fly Ball (HRR/FB) and expected Home Run Runs per Fly Ball (xHRR/FB): Absolutely my favorite of this new crop of stats. A mixture of HR/FB and R/HR, HRR/FB tells us how many runs scored on home runs per outfield fly. xHRR/FB, naturally, tells us how many should have scored. You can consider this a super-powered HR/FB since it not only accounts for how many HRs are allowed but also the total damage done by the HRs, which is what truly matters. Ten solo home runs do just as much damage as five two-run homers, which is something HR/FB doesn't capture on its own.
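The relationship between HR/FB, R/HR, and HRR/FB is plain arithmetic, so a short Python sketch may help. It works through the ten-solo-homers versus five-two-run-homers comparison from the paragraph above. The function names and the 100 outfield flies given to each pitcher are made up for the illustration.

```python
def hr_per_fb(home_runs, outfield_flies):
    """Home runs allowed per outfield fly (HR/FB)."""
    return home_runs / outfield_flies


def runs_per_hr(hr_runs, home_runs):
    """Runs scored on home runs, per home run allowed (R/HR)."""
    return hr_runs / home_runs


def hr_runs_per_fb(hr_runs, outfield_flies):
    """Runs scored on home runs per outfield fly (HRR/FB)."""
    return hr_runs / outfield_flies


if __name__ == "__main__":
    flies = 100  # hypothetical: same number of outfield flies for both pitchers

    # Pitcher A: ten solo home runs (10 runs). Pitcher B: five two-run homers (10 runs).
    for name, hrs, runs in [("A", 10, 10), ("B", 5, 10)]:
        print(name,
              f"HR/FB={hr_per_fb(hrs, flies):.3f}",
              f"R/HR={runs_per_hr(runs, hrs):.2f}",
              f"HRR/FB={hr_runs_per_fb(runs, flies):.3f}")
```

HR/FB makes Pitcher B look twice as fortunate as Pitcher A, but their HRR/FB is identical because the same number of runs crossed the plate. Note that HRR/FB is just HR/FB multiplied by R/HR, which is why it can be described as a mixture of the two.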
Run Support (RS) and xRun Support (xRS): These two stats are just what they sound like. Run Support is the number of runs that a starting pitcher's offense scores in games that he pitches. xRun Support is the number of runs per game the pitcher's team scores in all games during a season. Since pitchers have little influence over how well their offense performs in games that they pitch, we should expect the offense to perform at its usual level each time the pitcher takes the mound.
Bullpen Support (BS) and xBullpen Support (xBS): Very similar to the Run Support stats. BS measures how well the pitcher's bullpen performs in games he pitches and xBS measures the bullpen's performance during all games.
xWins (xW): While many fantasy analysts call Wins a fickle stat — and they're right — they aren't wholly unpredictable. Axioms like "don't chase wins" or "draft skills" are thrown around often, and while one can be successful by simply following this advice, I feel as though we can do a little bit better. And if we can do better, why shouldn't we?
Essentially, xW uses Bill James's Pythagorean formula (his "Pythagorean Theorem" of baseball, which estimates winning percentage from runs scored and runs allowed) to estimate the number of games a pitcher should have won. Using this formula, I plug in the pitcher's LIPS RA (weighted by his IP per game), his xBS (weighted by the IP the starter doesn't pitch per game), and his xRS.
This gives us the number of games the SP's team will win on days he pitches. From there, we calculate the percentage of those games for which he should be credited with the Win, based upon how deep into games he goes. Pitchers who last into the eighth inning are far more likely to receive a win than those who only last four or five innings, since there's more time for their offense to score runs. The small problem here is that unlucky pitchers won't go as deep into games as they should (and vice versa for lucky pitchers), but I haven't accounted for this yet.
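The xW pipeline described above can be sketched in Python roughly as follows. Treat this as a back-of-the-envelope version under several assumptions rather than the actual xW calculation: it uses the classic exponent of 2 in the Pythagorean formula, treats LIPS RA and xBS as runs allowed per nine innings, assumes nine-inning games, and takes the share of team wins credited to the starter as a plain input (`credit_share`), because the mapping from innings per start to that share isn't spelled out here. All function and parameter names are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Bill James's Pythagorean estimate of winning percentage."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def blended_runs_allowed(starter_ra9, bullpen_ra9, ip_per_start, game_innings=9.0):
    """Weight the starter's and bullpen's RA/9 by the innings each covers."""
    starter_share = ip_per_start / game_innings
    return starter_share * starter_ra9 + (1.0 - starter_share) * bullpen_ra9


def expected_wins(starter_ra9, bullpen_ra9, run_support,
                  ip_per_start, starts, credit_share):
    """Expected pitcher wins.

    credit_share is the assumed fraction of team wins in his starts that get
    credited to the starter; it should rise with ip_per_start, but the exact
    relationship is not specified, so it is supplied by the caller.
    """
    runs_allowed = blended_runs_allowed(starter_ra9, bullpen_ra9, ip_per_start)
    team_wins = starts * pythagorean_win_pct(run_support, runs_allowed)
    return team_wins * credit_share


if __name__ == "__main__":
    # Hypothetical pitcher: 3.50 LIPS RA, 4.50 bullpen, 4.8 runs of support,
    # 6.5 innings per start over 32 starts, credited with 70% of team wins.
    xw = expected_wins(3.50, 4.50, 4.8, 6.5, 32, 0.70)
    print(f"xW = {xw:.1f}")
```

Splitting the calculation into three small functions mirrors the three steps in the text: blend the starter and bullpen into one runs-allowed figure, convert runs scored and allowed into team wins, and then hand the starter his share.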
Now I'm not saying that all of these stats are perfect, and they all assume randomly sequenced events (which may or may not be a 100 percent fair assumption), but I do think that they largely serve our purposes and are certainly better than making mental estimations (as we all currently do) or simply assuming everyone will be league average. Again, if you have any questions, absolutely feel free to let me know. Tomorrow, be on the lookout for an article centered around Ricky Nolasco that will make use of these stats, so you can see them in action.
Prior work done on the subject
EDIT: Thanks to Will Larson for bringing to my attention that prior work has been done on some of these topics. Will created his own versions of xW, xBABIP, and xLOB% that can be found here.
THTF's own Paul Singman also did work on the link between BAA and LOB% here.
David Appleman also created a basic xBABIP formula here.
Derek Carty, 23, has also been published by NBC's Rotoworld, Sports Illustrated, FOX Sports, and USA Today. This season, he'll be contributing to FanDuel and will be linking to all of his work at DerekCarty.com. In his three years competing in expert leagues, he has won 2 titles with 4 top three finishes, including a LABR NL title in 2009, making him the youngest person to ever win a major expert league title. Derek is a proud graduate of the MLB Scouting Bureau's Scout Development Program and is a firm believer in the importance of combining stats and scouting. He welcomes questions via e-mail, Facebook, or Twitter.