November 20, 2009
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.
Tuesday, November 10, 2009

Criminals of WAR

Posted by Jeremy Greenhouse

I’m not going to single anyone out, since we’re all guilty of abusing FanGraphs’ Wins Above Replacement metric. But I’ve been seeing cases pop up where it’s getting out of hand. So I’ve set up a few guidelines for how to go about using WAR responsibly. Do not break these rules, or I may call you out.

1. Do not exclude baserunning from a position player’s WAR. I’m sure David Appelman will include baserunning in the next edition of WAR, since it’s so easy to calculate, but the numbers are already out there, so please take the time to go to BP, B-Ref, or BJOL to look up the numbers and tack them on.

2. Do not place undue trust in WAR for catchers. How much of a catcher’s value do you think is in his defense? I’ll give you a hint: it’s a lot. FanGraphs has unfortunately yet to give an effort to quantifying this vital aspect of the game, other than with the positional adjustment. In fact, catchers should possibly be considered a separate group of players with a separate replacement level and therefore be treated as different from all other position players.

3. Do not place undue trust in WAR for pitchers. First off, pitcher defense and hitting aren’t included. This should be righted ASAP. Then there are the more nuanced issues like how leverage is accounted for and the conversion of FIP to runs. Personally, I’d trust the calculations of David Gassko’s pitching runs created or StatCorner’s WAR well before I would FanGraphs’ WAR.

4. Do not cite WAR as a measure of skill. WAR measures production. FanGraphs has a lot more granular data if you’re trying to assess skill. And if you’re going to try to make a projection of WAR, regress each component individually. Also, players with negative WAR still may have value if they excel at a certain skill that can be leveraged.

5. Do not use the linear conversion of WAR to salary to determine what a team should be willing to pay a free agent. Every team has a different scale, depending on that team’s market and where the team is on the win curve. Few teams should pay $5 million for a single win.

I'm sure there are other commandments I'm missing, so feel free to add your own.

Any questions? Feel free to email me.

Comments
Dave Studeman said...
But FIP is production too: strikeouts, walks and home runs allowed. It’s a subset of ERA and it’s more constant, or more predictable, than ERA, but it’s still production. #5 is my personal favorite. I think the salary figures on Fangraphs are good guidelines, but that’s all they are. They’re the beginning of a good discussion about what a team should pay a player, not the end. Posted 11/10 at 05:18 PM
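For anyone who hasn't seen the stat written out, here is a minimal sketch of FIP built only from the events named above (strikeouts, walks, home runs), plus hit batters. It assumes the commonly cited weights of 13, 3 and -2 and a league constant of about 3.10; that constant is a placeholder, since it is reset every season so that league FIP lines up with league ERA. The stat line in the example is made up and does not belong to any real pitcher.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching using the commonly cited weights.

    `constant` is an assumed placeholder; the real value changes by season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, for illustration only.
example_fip = fip(hr=20, bb=50, hbp=5, k=180, ip=200.0)
print(round(example_fip, 3))
```

Nothing that happens on a ball in play enters the calculation, which is why the number can sit far from a pitcher's ERA in a given year.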
Think Blue Crew said...
Regarding #5, Vince Gennaro’s Diamond Dollars explains brilliantly why some teams would pay different $ depending on their current and expected win total. People tend to forget (I’m a culprit as well) that the marginal win changes for teams, even for the same team in different years. A middle-rotation starter is not worth much to the Pirates, as it will not mean much in terms of increasing playoff odds, but a wildcard team may pay much more for the same pitcher, depending how many (or few) wins the team needs in order to make the playoffs. Posted 11/10 at 05:32 PM
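The win-curve idea in the comment above can be sketched with a toy model. Everything here is an assumption made for illustration: the logistic shape of the playoff-odds curve, its midpoint of 89 wins, the $30M of playoff revenue and the $1M flat value of a win. None of it comes from Diamond Dollars or from FanGraphs.

```python
import math


def playoff_odds(wins, midpoint=89.0, spread=3.0):
    """Hypothetical logistic curve for the chance of making the playoffs."""
    return 1.0 / (1.0 + math.exp(-(wins - midpoint) / spread))


def marginal_win_value(wins, playoff_revenue=30.0, base_value=1.0):
    """Value in $M of one extra win: an assumed flat amount plus the
    gain in playoff odds times an assumed playoff revenue figure."""
    gain = playoff_odds(wins + 1) - playoff_odds(wins)
    return base_value + gain * playoff_revenue


# Compare a team far from contention with one on the bubble.
for team_wins in (65, 88):
    print(team_wins, round(marginal_win_value(team_wins), 2))
```

Under these made-up numbers the same extra win is worth far more to the bubble team than to the 65-win team, so a single dollars-per-win figure cannot describe both.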
dkappelman said...
I don’t mean to get snippy, but this was a completely unsupported statement. tRA has been shown to be barely better than FIP. Neither PRAA nor tRA WAR on StatCorner adjusts relievers for leverage, to my knowledge. They don’t have dynamic run converters for starting pitchers as far as I know either. Also, none of the pitcher WAR include defense for pitchers, which is going to be pretty minimal, and you can always look up a pitcher’s WAR on offense (which we do calculate) and add it to their defense. What exactly is there not to “trust” about pitcher WAR on FanGraphs again? We have a really long series about how WAR is calculated exactly for pitchers. If you don’t like FIP used in a WAR calculation, then you shouldn’t like tRA either. All in all they’re pretty comparable, and if you’re not going to trust one of them, you may as well not trust any of the others either, because their methods of calculation are generally similar. Posted 11/11 at 11:37 AM
dkappelman said...
“Personally, I’d trust the calculations of David Gassko’s pitching runs created or StatCorner’s WAR well before I would FanGraphs’ WAR.” was the statement I was trying to quote for the post above. Posted 11/11 at 11:38 AM
MikeS said...
Thank you ever so much for point number 5. Beyond the obvious that the Yankees will pay more for one WAR than the Pirates, there is so much more to the economics. A decent shortstop or a middle of the rotation starter is worth much more to the Twins who may see it as the one piece they are missing than it is to the Royals who need so much more to even be worth noticing. It makes sense for the Twins to overpay for that, but not the Royals. Posted 11/11 at 12:54 PM
Adam W. said...
@dkappelman: I think he’s referring to issues like the one raised in this article: http://mobile.beyondtheboxscore.com/2009/10/28/1104776/ricky-nolasco-4-war-or-1-war Posted 11/11 at 01:03 PM
Jeremy Greenhouse said...
Dave, thanks for commenting. The entire way that we’re thinking about adjusting relievers for leverage might be flawed. The purpose is to find out the value of the pitcher, isolated from the context in which he pitches. The two best ways to do this are to either not account for leverage or to assign every single pitcher a “deserved” leverage index, including starters, based on the optimal average LI he should pitch in, independent of his actual LI. StatCorner doesn’t account for leverage, which I’m fine with, and PRC does pretty much what you need by adjusting the pitcher’s run environment. I don’t see why you’d say that defense is pretty minimal for pitchers. I’d guess a good fielding pitcher is worth five runs a year and a bad one worth negative five. It all adds up. I might be wrong about StatCorner’s WAR. I’ve never seen them write up their methodology to it, but I’ve been under the assumption they use the regressed version of tRA, and not tRA. If my assumption is false, I stand corrected. I understand you guys calculate a pitcher’s WAR on offense, and I should have mentioned that it’s available on FanGraphs. This was an indictment on people who cite WAR for pitchers without including offense, not on FanGraphs, which does have the data available. Posted 11/11 at 01:29 PM
Hecubot said...
The number one argument for the value of catcher’s defense is probably just the defensive spectrum. It’s clear that we haven’t got a handle on how to measure a catcher’s defensive production, but there are a lot of little clues coming forward. I particularly liked the study that showed that Piazza was a plus defender at blocking pitches in the dirt and was able to reclaim some of his defensive worth for his poor throwing. Stuff like that helps make clear why he was kept at catcher for so long. Posted 11/11 at 02:10 PM
dkappelman said...
Jeremy, thanks for clarifying. Starting pitchers are not leverage adjusted on FanGraphs, because except for some strange cases, they’re all going to have an average leverage of 1 anyway. So there’s really nothing to complain about here, they are completely context neutral. Relief pitchers use a regressed gmLI leverage adjustment. So it only accounts for the situations they were used in. I see what you’re saying about optimal average LI, but WAR, like you said, is not really predictive (with FIP maybe a little more so), so I’d say adjusting for leverage in the situation they actually pitched in does make sense. For what it’s worth, the adjustment is not huge. I think at most we’re applying a 1.5 LI adjustment once it’s regressed, because you’re not going to find any gmLI a whole lot greater than 2. Or, on the other side, nothing really lower than .75. This could certainly slightly devalue relievers who are good but not optimally used. On the defense: these guys are really only out there for 200 innings. I agree with the -5 to +5 range, but this still makes up a relatively small part of a pitcher’s value. It’s not like position players where how a player plays defense is going to drastically change his perceived value. I’m not sure I see how the regressed version of tRA vs the non-regressed version of tRA really changes the comparison to FIP; they’re both going to be similar. I think the regressed version of tRA is going to water down the home run impact a little more than FIP will, but I may be wrong about that. We also use FIP in the WAR calculations because it completely takes out the defense, which then makes adding numbers up across entire teams work nicer so we’re not double counting defense somewhere. Even in that Nolasco case, we say he’s at 4.2, StatCorner says he’s at 3.5. If you just take it on runs (leave out the dynamic win conversion) we think he’s at just about 3.6. Better pitchers in FanGraphs WAR will be even better because of the dynamic run to win converter, because of the way they inherently lower the run environment. You can always just look at the runs and divide by 10-ish if you want to see what things would look like without it. Posted 11/11 at 02:26 PM
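The two adjustments discussed in the comment above can be sketched roughly as follows. Two assumptions are doing the work. First, "regressed" is taken to mean pulling gmLI halfway toward the average of 1.0; that matches the 2 to 1.5 figure quoted above (and implies a raw gmLI of about 0.5 behind the .75 floor), but the comment does not spell out the formula. Second, the flat 10 runs per win stands in for the "divide by 10-ish" shortcut and is not the dynamic converter. The 25-run input is hypothetical.

```python
def regressed_leverage(gmli):
    """Pull a reliever's entry leverage (gmLI) halfway toward 1.0.

    The halfway regression is an assumption consistent with the
    figures quoted in the comment, not a documented formula.
    """
    return (gmli + 1.0) / 2.0


def runs_to_wins(runs_above_replacement, runs_per_win=10.0):
    """Flat conversion of runs to wins (the 'divide by 10-ish' shortcut)."""
    return runs_above_replacement / runs_per_win


print(regressed_leverage(2.0))  # high-leverage closer
print(regressed_leverage(0.5))  # low-leverage mop-up man
print(runs_to_wins(25.0))       # hypothetical 25 runs above replacement
```

A dynamic converter would replace the fixed `runs_per_win` with a value that depends on the run environment the pitcher creates, which is why the flat version gives a lower figure for very good pitchers.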
Jeremy Greenhouse said...
I understand starters have the same average LI. My point is that all pitchers are from the same group of players. They (starters and relievers) shouldn’t be treated differently. I’m not smart enough to come up with the correct metric, but I’d imagine the replacement level (or whatever you want to call it) would be fluid, based on the expected outs per outing, and the deserved leverage index would be fluid, based on the expected outs per outing as well as the pitcher’s run environment. The regressed version of tRA tries to account for everything the pitcher controls, and dismiss everything he can’t control. That’s the purpose of WAR, no? Posted 11/11 at 02:44 PM
Nick Steiner said...
Statcorner calculates WAR using the straight park adjusted tRA. The regressed tRA is “just for show”. I agree that PRC is probably the best, mainly because it uses a dynamic run estimator I believe.
Regressed tRA just regresses tRA to league average based off of the sample size. So a 6.00 tRA will be around a 5.2 tRA*, or something. Posted 11/11 at 02:57 PM
Jeremy Greenhouse said...
Nick, I don’t know where you’re getting any of that. If you have an explanation of StatCorner’s pWAR, please pass that along. You’re off on your assessment of tRA*. It’s park adjusted and every component that goes into tRA is regressed individually. Homer Bailey had a tRA of 7.64 and tRA* of 5.03 while Miguel Batista had a tRA of 7.93 and tRA* of 6.58. Posted 11/11 at 03:20 PM
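One way to see why two pitchers with similar tRA can end up with very different tRA* is that component-by-component regression shrinks each rate by a different amount. The sketch below is a generic regression-to-the-mean formula, not StatCorner's actual method; the rates, the sample of 400 batters faced and the "stabilizer" values are all hypothetical.

```python
def regress(observed, league_avg, sample, stabilizer):
    """Shrink an observed rate toward the league average.

    The weight on the observed rate is sample / (sample + stabilizer),
    so noisier components (larger stabilizer) are pulled in harder.
    """
    weight = sample / (sample + stabilizer)
    return weight * observed + (1.0 - weight) * league_avg


batters_faced = 400  # hypothetical sample

# Hypothetical components: strikeout rate is assumed to stabilize quickly,
# home run rate slowly, so the same sample regresses them differently.
k_rate = regress(0.25, 0.18, batters_faced, stabilizer=150)
hr_rate = regress(0.05, 0.03, batters_faced, stabilizer=1200)

print(round(k_rate, 4), round(hr_rate, 4))
```

A pitcher whose bad tRA comes mostly from a heavily regressed component will see a much bigger drop than one whose bad tRA comes from a component that is trusted at face value.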
dkappelman said...
“My point is that all pitchers are from the same group of players. They (starters and relievers) shouldn’t be treated differently. I’m not smart enough to come up with the correct metric, but I’d imagine the replacement level (or whatever you want to call it) would be fluid, based on the expected outs per outing, and the deserved leverage index would be fluid, based on the expected outs per outing as well as the pitcher’s run environment.” Well, back to the original point, which is your complaint about leverage not being applied properly or on a sliding scale: if you’re making the same argument about replacement level in general, then it seems like we both agree there should be a leverage adjustment for relievers, but you’re just skeptical of the way it’s being applied in FanGraphs WAR? At least we’re making some adjustment for leverage, and relievers are, leverage-wise and skill-wise, used somewhat properly, so it’s not like the system we use is going to be really out of whack. Sure, some relievers may be docked ever so slightly because they’re in the setup role instead of the closer role, but I don’t think those differences are going to be material in the vast majority of cases. I don’t see why you wouldn’t treat relievers and starters differently. They’re two different roles, and then leverage is essentially applied to further define the role of the reliever. I think all these systems have their merits and potential drawbacks, but I just thought it was particularly unfair of you to single out FanGraphs WAR for pitchers and say, more or less, it’s untrustworthy. Otherwise, I agree with everything you’re saying in the article. Like all these stats, none are perfect or should be used in a vacuum, but I guess if you’re going to pick one, you could do a lot worse than WAR. Posted 11/11 at 04:23 PM
CH said...
“Like all these stats, none are perfect or should be used in a vacuum, but I guess if you’re going to pick one, you could do a lot worse than WAR.”
Posted 11/11 at 04:58 PM
Firpo said...
Not to be a noob, but what is BJOL? And where are the baserunning stats at B-Ref? Posted 11/12 at 10:08 AM
Nick Steiner said...
Jeremy - I’m saying the version of WAR that’s shown at StatCorner is calculated using tRA, which is park adjusted but NOT regressed. tRA* is the regressed version, and that’s not used for anything in particular. That’s why the statement I quoted from you above is confusing. Posted 11/12 at 10:37 AM
Jeremy Greenhouse said...
Firpo, Bill James Online, and if you go to a hitter’s page on Baseball Reference and scroll down, there’s a section with baserunning stats. Posted 11/12 at 11:26 AM
Jeremy Greenhouse said...
Dave, I appreciate you taking the time. I’m realizing I overstated my case when it came to pitcher’s WAR. The other points were all pretty much fact, and the non-defense/fielding arguments against pitcher WAR are based on theory. However, I still don’t think you should treat relievers and starters differently because they’re from the same group of players. Think of it in terms of positional adjustments. A one inning pitcher (think of innings in terms of expected outs) gets a negative positional adjustment because of the lack of scarcity and lack of difficulty at the position. A six inning pitcher gets a higher positional adjustment. And it’s all fluid in between. Same with leverage. A one inning pitcher with a great FIP has a high deserved leverage index, but a six inning pitcher with a great FIP should have a higher LI too, since theoretically he could be brought in the fourth inning. Value should be independent of how a manager uses his players, and only be based on what the player was able to control on the field. WAR is the best out there, and we all know that. But some people are missing its limitations, like baserunning, catcher defense, pitcher hitting/fielding, quality of opposition, and these aspects of the game need to be mentioned. Posted 11/12 at 02:03 PM
WY said...
Thanks for writing this. A lot of this needed to be said. It drives me nuts to see people say “So-and-so was worth $13.6M last year” as if that is an etched-in-stone fact. That is really sloppy thinking. Also, to Dave Appelman: I read and appreciate Fangraphs, but I didn’t interpret the main thrust of this article as anti-Fangraphs as much as I read it as a plea for people to think more critically about and be more careful with what they read there. Posted 11/14 at 01:22 AM
On #4 you say WAR is a production measure, but isn’t Fangraphs using FIP which is not necessarily “production” - see Javy Vasquez 2006 and 2008 - or maybe I’m not clear what they are doing with their WAR number.