Should voters give extra credit?

We’ve all been thinking about the Hall of Fame a lot lately. In fact, this is my second such post in as many days, but I have an idea that I think is worth considering. It grows out of a Rob Neyer post where he mentions not wanting to support a player who would not have been a Hall of Famer without steroids. This pushes someone like Mark McGwire off the list, for instance.

But here’s a question: If we’re docking players who weren’t clean, shouldn’t we give extra credit to those who were?

Think about it for a minute. The stat that gets tossed around the most in Hall of Fame discussions is WAR, but PEDs changed where the bar was set. The definition of what constituted a replacement player was different than it would have been without all those players getting extra help.

The best example I can think of is Fred McGriff. He is defined by his consistency. While playing for the Blue Jays in 1988, McGriff was, according to FanGraphs, worth 7.2 WAR. In 2001, putting up almost identical numbers, he was worth only 3.8 WAR. That’s still a good season, but the difference between 4 WAR and 7 WAR is the difference between an all-star and an MVP contender. How much of that perceived drop in value is the result of other players artificially raising the bar?

McGriff is a very marginal candidate now, but I’ve never heard him tied to PEDs. There’s no reason to think he wasn’t totally clean. And if he was clean and had played in a clean league, wouldn’t he have been worth more? Maybe 70 WAR instead of 61?

There are others. Kenny Lofton, Craig Biggio, Jeff Bagwell. Pick a player who has a decent case and hasn’t been tied to steroids at all and ask yourself how he might have been affected. From 1987 to 2002, Fred McGriff was almost exactly the same player every year. His value declined not because his performance changed (other than normal yearly fluctuations of course), but because offensive numbers around the league changed.

This is all getting very complicated. It’s a debate we’re going to be having for years and I wonder if, in a few years, we might regret that some players—like Lofton—fell off the ballot so quickly.


Jason teaches high school English, writes fiction, runs a small writing program and writes about education and literature. He also writes for Redleg Nation and both writes and edits for The Hardball Times. Follow him on Twitter @JasonLinden, visit his website or email him here.
5 Comments
Nate
11 years ago

It’s an interesting point, for sure. There are some mitigating issues with your archetypal case, though. You’ve pointed out that McGriff’s offensive numbers are similar in 1988 and 2001. But WAR is not just based on offensive numbers, of course, and even those offensive numbers are park-adjusted. It’s also a cumulative stat. So the important problems:

Per fangraphs, his position / fielding were worth 1.0 and -11.6 runs in ‘88 and ‘01, respectively. So that’s ~1.2 WAR right there.

His baserunning was 0.3 in ‘88 and -0.7 in ‘01. He also played 8 more games in ‘88, meaning his replacement runs were 20.8 v. 19.5. Not huge, but that’s another 0.2 WAR.

Park adjustment is a little trickier to track in fangraphs, since it’s embedded in the very thing you’re talking about (WAR). But if you look at the “neutralized batting” at baseball reference, you’ll see that the difference, park-wise, in playing in Toronto in ‘88 v. TB and CHC in ‘01 is substantial. His neutralized run production on the offensive side, at least according to the neutralized RC stat at BR, is 120 v. 107. The scales are different, etc., but it looks like maybe 1.0 of WAR was due to hitting in TOR instead of TB/Wrigley.

So, that cursory look explains about 2.4 WAR between ‘88 and ‘01. This is obviously back of the envelope calculating, and the park adjustment aspect is tricky, sure. But given that the difference between 7.2 and 3.8 is just 3.4 WAR, and that maybe 2.4 of this is explained by the above, I think the effect you’re talking about is there but perhaps overstated above. I didn’t check, but I’m assuming you picked those two years as the most egregious example of similar stats disparate WAR. But that huge gap in WAR looks a bit smaller after scrutiny.
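The back-of-the-envelope math above can be sketched in a few lines of Python. This is only an illustration: the run values are the ones quoted in this comment, the 1.0 WAR park effect is the rough guess made above, and the 10-runs-per-win conversion is an assumed rule of thumb rather than anything FanGraphs is quoted as using here.

```python
# Rough decomposition of the 1988 vs. 2001 WAR gap for McGriff.
# ASSUMPTION: about 10 runs per win, a common rule of thumb (not from the post).
RUNS_PER_WIN = 10.0

war_1988, war_2001 = 7.2, 3.8

# Run values as quoted in the comment, in (1988, 2001) order.
fielding_and_position = (1.0, -11.6)
baserunning = (0.3, -0.7)
replacement_runs = (20.8, 19.5)

def gap_in_wins(pair):
    """Convert a (1988, 2001) pair of run values into a gap measured in wins."""
    return (pair[0] - pair[1]) / RUNS_PER_WIN

fielding_war = gap_in_wins(fielding_and_position)
small_stuff_war = gap_in_wins(baserunning) + gap_in_wins(replacement_runs)
park_war = 1.0  # the comment's rough estimate, not a computed value

explained = fielding_war + small_stuff_war + park_war
total_gap = war_1988 - war_2001
unexplained = total_gap - explained

print(f"total gap:   {total_gap:.1f} WAR")
print(f"explained:   {explained:.1f} WAR")
print(f"unexplained: {unexplained:.1f} WAR")
```

With these inputs the explained portion lands in the same neighborhood as the ~2.4 WAR estimated above, leaving roughly 1 WAR that could be attributed to run environment, of which PEDs would be only one possible cause.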

And obviously chalking up ALL of the difference in run environment / replacement level to PEDs is probably overstating things. The two years have different scoring environments, and it certainly looks like a lot of that was due to rampant PED use, but that certainly doesn’t mean all of it was – run-scoring environments fluctuate sometimes sans explanation (see, e.g., 1987).

All of which is to say that I think this is a very interesting point, but the difference in WAR is probably not as large as 70 to 61 if the unexplained gap in one of the big example years you’re pointing at is only 1.0 WAR (and we have no idea what portion of that is due to PED use and what portion is due to weather, smaller ballparks, etc.).

Not trying to be a jerk, because this is the first time I’ve seen this particular issue raised (that non-cheaters need positive credit, not just neutral credit), and I think it’s a smart observation. And it could still very much be true, it’s just that this example in particular has some problems (most overtly the defensive difference between a 24- and a 37-year-old first baseman). :)

Jason Linden
11 years ago

I think that’s totally fair, and I’ll admit to cherry-picking a bit to raise the issue. I did note the defensive difference, but didn’t think about some of the other things you mention.

I don’t think you’re being a jerk at all. Where it all gets tricky is in trying to figure out how much of the offensive change is PEDs and how much is something else. It’s all so blurry.

Nate
11 years ago

Cool. I just wanted to be clear that I’m not just tearing down the idea for the sake of saying, “nuh-uh.”

Another thing that has confused me about this issue entirely is that it’s not obvious why rampant PED use on both sides (pitching and hitting) automatically leads to higher run scoring. Unless it’s just the nature of pitching that increases in brute strength have a much smaller impact than on the hitting side, i.e., the limiting factor is what ligaments can withstand, not muscles. But, you know, “Clemens.” Still, I guess what’s obvious is that in the PED competition, pitchers overall lost the arms race. So to speak. :)

Anyways, thanks for the thought-provoking article!

Jason Linden
11 years ago

I’ve been thinking about the fielding issue and one of the things about PEDs that people don’t talk about much is that they help players stay on the field. Since those players are presumably better all around than the players who would be replacing them, and since fielding numbers are measured against the average instead of against replacement, then the PED era has to call fielding numbers into question too. Maybe not as much as offensive numbers, but you still have to give some extra credit there, I think.

Nate
11 years ago

Good point. It does seem, though, like PEDs would keep sub-par fielders on the field, too, and given the era’s emphasis on offense, I’d think that there would be plenty of instances of good-hitter-bad-fielder being kept on the field. It would probably differ a lot by position.

It seems like it’s straightforward enough to argue that PEDs will indeed “enhance performance” and raise league averages and replacement values. So I would buy an argument, as with hitting, that fielding is better because players are faster, less fatigued, stronger throwers and such. But without knowing which “player-games lost due to injury” are being reduced – whether it’s systematically more above or below average fielders – it’s going to be really hard to know how much extra credit to give. And, of course, the fielding effects are likely to be subtler than the obvious soaring HR totals and such.

It also depends just how rampant the PED usage was, how many player-games were actually saved, etc. Because if it’s 10% of players and they’re preventing, say, five games of injury on average, that fielding effect will probably not be huge. But if it’s 90%, then it will … but if it’s 90%, or anything approaching that, it’s probably not safe to assume that anyone was clean. Even if they are, you know, crime dogs.
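To put rough numbers on that hypothetical, here is a minimal Python sketch. The 10% and 90% usage rates and the five games saved come from the paragraph above; treating every player as a 162-game regular, and applying the same five games to the 90% case, are simplifying assumptions made purely for illustration.

```python
# Share of league player-games that would change hands under the hypothetical above.
# ASSUMPTIONS: every player is a full-time regular over a 162-game season,
# and each PED user avoids the same number of injury games.
GAMES_PER_SEASON = 162

def share_of_player_games_saved(user_fraction, games_saved_per_user):
    """Fraction of all player-games played by a PED user instead of his replacement."""
    return user_fraction * games_saved_per_user / GAMES_PER_SEASON

low_usage = share_of_player_games_saved(0.10, 5)
high_usage = share_of_player_games_saved(0.90, 5)

print(f"10% usage: {low_usage:.2%} of player-games")
print(f"90% usage: {high_usage:.2%} of player-games")
```

Under these assumptions the low-usage case shifts well under 1% of player-games, which is why the fielding effect would probably be small, while the high-usage case is nine times larger.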