10 things I didn’t know about superstats
by Chris Jaffe
April 01, 2013
Superstats. The formulas that swing for the fences and try to boil down a team’s entire value into one number. Hardcore sabermetricians can debate the finer points of how they assess baseball talent. For the rest of us, superstats help frame debates and ideas for how good a player or team was.
I’ve had plenty of chances to examine sabermetric superstats over the years. To put that more accurately, I’ve had plenty of chances to examine the results of sabermetric superstats over the years. I’m not very good at getting into their finer points, but it’s a funny thing. Even if you can’t really fully explain the math (and that’s certainly true of me), you can learn something about the stats just by viewing the results.
Let’s look backwards at these stats to see what we can learn about them. By “look backwards,” I mean start with the results to see what that tells us of the process. My own research over the years has caused me to compile team-level superstats whenever possible. In that spirit, the results we’re looking at are the full team numbers—hitting, pitching, and fielding—for 21st-century superstats Win Shares, Runs Above Replacement, and—most especially—the current default stat, Wins Above Replacement.
(In the case of WAR, we’re using the most recent version that came out a few days ago as a result of the Fangraphs-Baseball Reference agreement on where to set replacement level.)
1. Dividing fielding and pitching with Win Shares
Remember when Win Shares came out? My golly, what a big deal that was. When it first was released, the intriguing ideas were less in how Bill James tried to rate batting. James and others had already figured tons of that out. The intriguing parts were how it was going to try to separate pitching and fielding, a largely undiscovered country.
James had written about this in the New Historical Abstract that predated (and largely previewed) his Win Shares book. He spoke about needing to start at the team level with fielding and work your way from there. He was influenced by the work of Voros McCracken. This looked like it might be a big step forward in separating the difficult-to-disentangle pair of fielding and pitching.
The first thing I did when looking at the book was figure out the worst-fielding and worst-pitching teams of each decade. (No, I don’t know why I went that route. Must be the Cubs fan in me.) Anyhow, here they are. Tell me if you can spot a trend (we’ll ignore the strike-riddled 1981, 1994, and 1995 campaigns).

Decade  Worst PWS  PWS   Worst FWS  FWS
1900s   1904 WAS   21.3  1904 WAS   14.2
1910s   1915 PHA   17    1915 PHA   11.4
1920s   1928 PHI   36.3  1928 PHI   24.2
1930s   1935 BOB   30.4  1935 BOB   20.3
1940s   1942 WAS   34.2  1942 WAS   22.8
1950s   1953 DET   37.3  1953 DET   24.9
1960s   1962 NYM   37.6  1962 NYM   25.1
1970s   1974 SDP   40.2  1972 TEX   25.2
1980s   1984 SFG   39.8  1984 SFG   26.5
1990s   1998 FLA   22.9  1998 FLA   22.9

(PWS = Pitching Win Shares; FWS = Fielding Win Shares.)
Nine times, the worst pitching team was also the worst fielding team. (Okay, fine, the 1985 Giants were actually tied with the 1985 Indians for worst fielders.) The only exception came in the 1970s when the 1974 Padres were the second-worst fielding team, behind the 1972 Rangers.
Actually, it’s even more ridiculous. Those 1972 Rangers lost eight games to a strike, and if you adjust for that, their 25.2 FWS becomes 26.5, tying them with the 1974 Padres for last. So the worst-pitching team was always also the worst-fielding team.
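The strike adjustment above is easy to check. Here is a minimal sketch, assuming the adjustment is a straight-line proration from the games actually played up to a 162-game schedule (the article does not spell out the method, and the function name is made up for illustration):

```python
FULL_SEASON_GAMES = 162  # assumed full schedule length


def prorate_to_full_season(value, games_played, full_season=FULL_SEASON_GAMES):
    """Scale a counting stat from a shortened season up to a full schedule."""
    return value * full_season / games_played


# 1972 Rangers: 25.2 Fielding Win Shares, with eight games lost to the strike.
rangers_fws = prorate_to_full_season(25.2, games_played=FULL_SEASON_GAMES - 8)
print(round(rangers_fws, 1))
```

Under that assumption, the prorated figure rounds to the 26.5 quoted above.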
That strike anyone else as just a tad questionable? While Win Shares may have opened new doors in figuring out how to separate gloves from arms, the downside of being the first post-Voros attempt is that it is likely to be the one most in need of refining.
2. Win Shares: the last big moment of pre-internet sabermetrics
These days, it’s almost impossible to find Win Shares anywhere. If you want to know how many Fielding Win Shares the 2011 Rockies had, I don’t know where you’d find it. I assume some Win Shares info is at Bill James Online, but I’m just too cheap to plunk money down. This is a stat that got a lot of fanfare when it came out. It was, after all, an attempt at a Super Stat by Super Stat-Man Bill James himself, but it’s turned into an also-ran.
Ultimately, it’s turned into the last great event of pre-internet sabermetrics. Obviously, that’s not literally true, as the internet predates it and sabermetric books still come out. But a clear shift has occurred.
Back in the day, for a stat to be taken seriously, you had to kill trees to present it. If a stat existed solely on that newfangled, ephemeral internet, did it really matter? If it was in tangible form like a book, it was taken seriously.
Times, they’ve changed. Now a stat needs to be on the internet (and usually freely available on the internet) in order to matter. People rely on WAR all the time because it’s so readily available. BOOM, it’s there for you. Is it the best stat? Hell, I dunno. If it’s good enough for Sean Forman, it’s good enough for me, I guess. But ultimately, even if it isn’t the best, it’s still the stat du jour.
Win Shares was the last big sabermetric thing to come out in book format. Man, bad timing. Nowadays, I can’t even find team-level Win Shares, and I haven’t been able to for years.
3. Runs Above Replacement Player
There’s less to say about this superstat. Like Win Shares and WAR, it has three parts—RARP, PRAR (pitching runs above replacement), and FRAR—for hitting, pitching, and fielding. Thus, it makes a nice contrast. RARP never had quite the heft of Bill James’ Win Shares nor the current prominence of WAR, but it was a big stat by Baseball Prospectus, and that gives it serious credibility.
After my experience with Win Shares, I checked this to see if it constantly had the worst-fielding team of the decade double up as the worst-pitching team. To RAR’s credit, that never happened.
That said, at least the strange parts of Win Shares could be understood. That wasn’t always the case with RAR. Here’s the best example, ranking the 1930 NL teams by PRAR:
Team  PRAR
STL   206
PHI   187
CIN   163
BOB   163
CHC   149
PIT   136
BRK   120
NYG   94
The 1930 Phillies really shouldn’t be second. They’re arguably the worst pitching staff of all time. They set records for most hits allowed, worst ERA, etc. You name it, this staff was awful at it.
Sure, the Phillies played in a major hitters' park, their defense was bad, and 1930 was The Year of the Hitter, but that only goes so far to explain things. They were worst in walks allowed, worst in home runs, and second-worst in strikeouts. Tom Ruane at Retrosheet gave a nice overview of their awfulness. Putting the Phils sixth out of eight, and nearly in fifth, would be like creating a stat that says the 1962 Mets were pretty good. Everything else, including everything else sabermetric, puts that staff in the toilet.
That’s the problem with RAR: when it goes wonky, it’s hard to understand.
Ultimately, this is a past tense issue. Team-level RAR info doesn’t seem to be available at B-Pro anymore. WAR has the field to itself these days. So let’s look at it.
4. Fielding and WAR
Now, for the great overlooked fact about WAR. As far as it’s concerned, there’s pretty much no difference between average fielding and replacement-level fielding. In fact, you can cut out the words “pretty much” from that last sentence. An average fielding team scores 0.0 dWAR.
Replacement level and average traditionally have two different meanings. In fact, Bill James pioneered work on replacement level in response to Pete Palmer’s linear weights, which were designed to give an average player no value. James countered that the 50th percentile players have considerable value. After all, an average team will have an average record, not a bad one.
But the 2,600-plus teams on file combine to have 1.3 total defensive WAR. That’s not 1.3 dWAR per team, it’s 1.3 for all 2,653 squads. And if you toss out the 19th century, it’s actually a little worse. Teams from 1900 onward score a total –0.6 dWAR.
Now, this isn’t actually a secret, and no one tries to hide it. It’s just that people often aren’t aware of it. The logic behind this approach is that there is no such thing as replacement-level fielding (or even replacement-level hitting) in and of itself. You have replacement-level players, not parts of players, and of course the glove has to go with the bat. Hitting matters more than fielding, so it’s a bigger part of determining who gets to play.
Ultimately, a great hitter will find his way into the lineup. It’s just that he’ll play at the shallow end of the defensive spectrum. But if a great glove truly can’t hit, he might last a while, but not for too long.
It doesn’t mean that WAR’s decision to center overall league fielding value at zero wins is a problem, but it helps at least to be aware of it.
5. Divvying up WAR
OK, if WAR gives all fielding virtually no value, then what does it do for pitching and hitting?
Well, here are the results for all 2,653 teams since 1876 (not counting the Union Association): 48,104.9 batting WAR, 33,432.0 pitching WAR, and the 1.3 fielding WAR.
That works out to 59 percent of all wins going to hitters, 41 percent to pitchers, and 0.0016 percent to gloves (that’s 16 millionths).
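For anyone who wants to check the split, here is a small sketch using only the three totals quoted above. The variable names are just for illustration.

```python
# WAR totals for all 2,653 teams, as quoted in the text.
war_totals = {"batting": 48104.9, "pitching": 33432.0, "fielding": 1.3}

total_war = sum(war_totals.values())

# Each component's share of all WAR, as a fraction of the total.
shares = {part: value / total_war for part, value in war_totals.items()}

for part, share in shares.items():
    print(f"{part}: {share * 100:.4f} percent")
```

The fielding share comes out to roughly 0.0016 percent, or about 16 parts per million.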
For comparison, Win Shares is 48 percent batters. Of the remaining 52 percent, pitchers get two-thirds, fielders the other third.
With RARP, it’s … different. For all teams, hitters make up 31 percent, pitchers 29 percent, and fielders 40 percent. Huh. To be fair, RARP varies widely from era to era. Cut out the 19th century, for instance, and fielding falls all the way down from 40 to 29 percent.
6. Replacement level and WAR
As noted up top, all WAR numbers are the newly updated ones, with Baseball-Reference and Fangraphs agreeing where to set replacement level. Want to know what that level is?
It’s a .297 winning percentage. If you take the actual win-loss records of all the non-UA teams since 1876 (which, of course, add up to .500: 200,893-200,893) and subtract their WAR from the wins and add it to the losses, the adjusted records have a .297 winning percentage (119,355-282,431). Over a 162-game season, a .297 percentage is a 48-114 record. That’s replacement level.
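The arithmetic behind that .297 can be sketched as follows. It assumes the total WAR is simply the sum of the batting, pitching, and fielding totals quoted in the previous section (48,104.9, 33,432.0, and 1.3), and that every win above replacement moves one game from the win column to the loss column.

```python
# Actual combined record of all non-UA teams since 1876 (from the text).
total_wins = total_losses = 200_893

# Batting + pitching + fielding WAR totals quoted in section 5.
total_war = 48_104.9 + 33_432.0 + 1.3

# Remove the WAR from the wins and hand it to the losses.
replacement_wins = total_wins - total_war
replacement_losses = total_losses + total_war
replacement_pct = replacement_wins / (replacement_wins + replacement_losses)

print(f"{replacement_wins:.0f}-{replacement_losses:.0f} ({replacement_pct:.3f})")
print(f"Wins over 162 games: {replacement_pct * 162:.1f}")
```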
If you throw out the nineteenth century, it doesn’t change things much. From 1900 onward, replacement level is .294, which still rounds off to 48-114.
It’s around .295 in general, not for all teams. The 20th-century teams with the most extreme WAR-adjusted records are the 1909 Pirates and the 1962 Mets. The Pirates went 110-42 with 49.1 WAR, giving them an adjusted record of 60.9-91.1, which is .401. The 1962 Mets had 9.6 WAR, which transforms their dismal 40-120 record to an even worse 30.4-129.6 record (.190).
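The same adjustment works for a single team. A minimal sketch, with a helper name invented for illustration and the inputs taken from the two examples above:

```python
def war_adjusted_record(wins, losses, team_war):
    """Strip a team's WAR out of its record: fewer wins, more losses."""
    adj_wins = wins - team_war
    adj_losses = losses + team_war
    return adj_wins, adj_losses, adj_wins / (adj_wins + adj_losses)


examples = [
    ("1909 Pirates", 110, 42, 49.1),
    ("1962 Mets", 40, 120, 9.6),
]

for label, wins, losses, war in examples:
    adj_w, adj_l, pct = war_adjusted_record(wins, losses, war)
    print(f"{label}: {adj_w:.1f}-{adj_l:.1f} ({pct:.3f})")
```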
7. WAR weirdness with the 1991 Braves
Now for the strangest finding in all WAR-dom: it lists the 1991 Braves as the worst-fielding team in the 1991 NL.
Why is that so strange, you ask? Well, Win Shares claims they were the best-fielding team of the year. FRAR agrees. That’s the only time this happens. There are plenty of times two stats agree that a club has the best fielders or pitchers or batters in the league, but the 1991 Braves fielders are the only time the third stat puts them dead last.
And WAR puts those Braves well back, with –9.6 dWAR. That’s one of the worst scores of the decade.
It gets stranger still. This 1991 Braves defense has gotten attention previously in sabermetric circles. When Voros McCracken’s DIPS notion erupted, there was a big, well-received presentation at a SABR convention on the subject by Diamond Mind founder Tom Tippett. In his talk, Tippett looked at how defenses handled balls in play. He found that one of the greatest improvements in defense any team ever had was … the 1991 Braves.
Sure enough, they had an NL-best defensive efficiency mark of .714 just a year after a league-worst showing of .676. Based on McCracken’s work, Tippett argued that the massive improvement in defense helped the 65-97 1990 Braves become the 94-68 division winner the next year. Yet there’s WAR, calling this 1991 Braves team a historically bad defense.
Wait, then how can WAR explain their 29-game improvement? Simple: according to WAR, the Braves had the single biggest one-year improvement in pitching in baseball history: from 4.2 pWAR in 1990 to 30.6 in 1991, an increase of 26.4.
It gets a little stranger still. You see, WAR agrees with everyone else that the Braves defense improved dramatically; it just has the improvement occurring one year later. In 1992, their dWAR went up to 5.9, a rise of 15.5, the second-highest showing in history.
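The year-over-year swings quoted for these Braves teams are plain subtraction. A quick sketch using only the figures given above (the dictionary layout and function name are made up for illustration):

```python
# WAR components for the Braves, as quoted in the text.
braves = {
    1990: {"pWAR": 4.2},
    1991: {"pWAR": 30.6, "dWAR": -9.6},
    1992: {"dWAR": 5.9},
}


def one_year_change(table, stat, year):
    """Change in a stat from the previous season to the given season."""
    return table[year][stat] - table[year - 1][stat]


pitching_jump = one_year_change(braves, "pWAR", 1991)
fielding_jump = one_year_change(braves, "dWAR", 1992)
print(round(pitching_jump, 1), round(fielding_jump, 1))
```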
Does this make no sense to anyone else? By all accounts except for WAR, the 1991 Braves were one of the best defenses ever, and the team improved dramatically. WAR agrees that after 1991 the Braves defense was fine, but 1991 was historically horrible. Really? Something is up.
8. Most extreme improvements and collapses
Actually, the above can lead to a new question. What are the biggest improvements or declines a team ever had in pitching, fielding, or hitting according to WAR?
Well, remember how the 1992 Braves had the second-biggest fielding improvement (15.5)? That’s nowhere near the top slot. The 1980 A’s own this one, and how. After the terrible (54-108) 1979 A’s had a historically dreadful score of –11.5, the massively improved (83-79) 1980 A’s of Billy Martin had a league-best dWAR of 10.8. That’s an improvement of 22.3, nearly 50 percent higher than the runner-up. In fact, the 1979 club is the seventh-worst fielding unit in baseball history and the 1980 club third-best. Yowza!
The biggest defensive decline goes to the 1993 Padres. They were a good-fielding team in 1992, but their mark dropped by 17.4 in 1993. Their dWAR that year was –11.5, one of the worst scores of the decade.
We already covered how the 1991 Braves had the biggest pitching improvement (runner up: 2004 Rangers), but what’s the biggest decline? The 1968 Reds, who flopped by 26.4 pWAR from 1967. The real story here is the 1967 Reds, which WAR calls the third-best staff in history, at 32.0 pWAR.
In 1967, their fifth starter had an ERA+ of 99. A year later, that would’ve been second-best. Last year’s Phillies have the second-biggest decline by a staff. Then again, the story is the 2011 Phillies, who WAR calls easily the best pitching staff in baseball history. With 37.4 pWAR, they’re four-and-a-half wins over anyone else.
As it happens, a 1960s Reds team also has the biggest one-year decline ever in offense. The 1966 Reds were 23.1 oWAR off the pace of their 1965 club. Two things happened. First, the 1965 Reds had one of the all-time great (and overlooked) lineups. All eight starters and the three top bench players had an OPS+ in triple digits, and they were almost all 120 or higher. While the talent was great, they were also over their heads. That was the first problem.
Second, the team figured they had batting to burn, and so Cincinnati traded a bat to Baltimore for a starting pitcher. That bat was Frank Robinson, who won the Triple Crown for the 1966 Orioles. As it happens, the Reds’ offense crashed completely, with only one starter posting an OPS+ over 103. Fortunately their pitching took off in 1967 ... before crashing in 1968. Lots of Reds fans probably had whiplash after all that.
The biggest batting improvement belongs to the 1978 Milwaukee Brewers, which is right when the Harvey’s Wallbangers bunch began taking off. (The second-best improvement, incredibly, comes from the 1968 Reds. I told you 1960s Reds fans got a bad case of whiplash.)
All this leads to the real question: what does WAR consider to be the best and worst pitching, fielding, and hitting teams of all time?
Well, we already noted that the 2011 Phillies came in first on the mound with 37.4 pWAR. The worst staff ever is the 1890 Pirates, whose –17.5 pWAR narrowly edges the infamous 20-134 1899 Cleveland Spiders' –17.2 pWAR mark. Aside from them, the worst mark ever belongs to the 1995 Giants at –10.4. Gee, imagine how those Giants would’ve scored if 18 games hadn’t been wiped out by the strike.
The worst-fielding club is a 20th-century squad: the 1974 Cubs, at –14.4 dWAR. If you ever wonder why Rick Reuschel scores as well as he does in WAR (68.3 WAR, better than Jim Palmer, Bob Feller, or Juan Marichal), there you go. The 1991 Braves score as the 25th-worst fielding team of all time. Yeah, something is really wrong with that.
The best fielding team of all time, in a landslide, is Earl Weaver’s 1973 Orioles with Mark Belanger, Bobby Grich, and Brooks Robinson manning the infield. Their 13.6 dWAR is two wins better than the runner up, the 1969 Orioles. Man, those early Weaver clubs sure could field. (This also explains why Palmer rates lower than Reuschel in WAR). Those are the only clubs atop the 1980 A’s.
Only 10 clubs have negative offensive WAR. The worst is the 1914 St. Louis Federal League squad at –3.3 oWAR. The only team since 1920 with negative offensive WAR is the 1981 Blue Jays.
The top three offenses of all time are Babe Ruth-Lou Gehrig Yankee teams. Here they are:
Team      oWAR
1931 NYY  46.0
1927 NYY  44.2
1930 NYY  43.1

Then comes the 1976 Big Red Machine.
10. Most extreme teams
One last bit to get us to 10. What are the most extreme teams ever, those that rely most on one facet of the game?
For this, we have to limit it to teams with winning records. Otherwise, you get really horrible teams dominating the list. After all, if a club has negative WAR overall but is positive in one category, that club relies on that one item in a way no good team possibly can. Also, we’ll toss the 19th century because it was a very different game, and that era would end up dominating the list otherwise.
Among the 1,232 teams from 1900 onward with a record of .500 or better, the club most reliant upon pitching was the 1977 Cubs. Their arms accounted for 27.1 of the team’s 31.5 WAR. Bruce Sutter became a star for this team. Just three years removed from the worst defense ever, their fielding still sucked.
One other fun fact about this team. If you adjust run support for park and league run environment, Cubs ace Reuschel had the worst run support for any 20-game winner in the last 100 years. The Cubs scored 131 runs in his 37 starts, and while 3.54 runs/game doesn’t sound bad, Wrigley Field was an extreme hitters' park, and 1977 was a huge year for hitters. Reuschel's 3.54 runs/game was just 72 percent of what he should’ve gotten. No hitting, no fielding ... no wonder the 1977 Cubs score so pitcher-rific.
At the other end are the 1964 Braves with just 1.3 of their 38.8 WAR coming from pitchers. Somehow they won 88 games anyway.
The .500 or better club most reliant on fielding was the 1984 Twins, as one-third of their WAR came from gloves. It breaks down at the other end, though, because nearly a third of the winning 20th-century teams have negative fielding WAR. The 1981 Indians come in last, with –6.9 dWAR on a team with 21.7 overall WAR. The 2004 Yankees are the only 100-win team with negative fielding WAR.
Lastly, there’s hitting. Twice a team had a winning record despite its pitchers and fielders combining for negative WAR: the 1940 Pirates and 1922 Cardinals were carried to winning records by their bats. At the other end, the 1914 Federal League Buffalo team had a winning record despite negative offensive WAR. Since WWI, the least bat-reliant winning team was the 2003 Dodgers: 4.9 of their 33.0 WAR came from hitters.
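To put numbers on "reliance," here is a small sketch that takes the share of a team's total WAR coming from one component. That definition is an assumption based on how the figures above are presented, and the function name is invented for illustration.

```python
def component_share(component_war, team_war):
    """Fraction of a team's total WAR that comes from one component."""
    return component_war / team_war


# (team, component, component WAR, total WAR), all as quoted in the text.
examples = [
    ("1977 Cubs", "pitching", 27.1, 31.5),
    ("1964 Braves", "pitching", 1.3, 38.8),
    ("2003 Dodgers", "hitting", 4.9, 33.0),
]

for team, component, part, total in examples:
    share = component_share(part, total)
    print(f"{team}: {share * 100:.1f} percent of WAR from {component}")
```

By this measure the 1977 Cubs got about 86 percent of their WAR from the mound, while the 1964 Braves got a little over 3 percent.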
History instructor by day, statnerd by night, Chris Jaffe leads one of the most exciting double lives imaginable, with the exception of every other double life possible to imagine. Despite his lack of comic-book-hero-worthiness, Chris enjoys farting around with this stuff. His new book, Evaluating Baseball's Managers, is available for order. Chris welcomes responses to his articles via e-mail. Oh, and now he's on Twitter.