10 things I didn’t know about superstats

Superstats. The formulas that swing for the fences and try to boil down a team’s entire value into one number. Hardcore sabermetricians can debate the finer points of how they assess baseball talent. For the rest of us, superstats help frame debates and ideas for how good a player or team was.

I’ve had plenty of chances to examine sabermetric superstats over the years. To put that more accurately, I’ve had plenty of chances to examine the results of sabermetric superstats over the years. I’m not very good at getting into their finer points, but it’s a funny thing. Even if you can’t really fully explain the math (and that’s certainly true of me), you can learn something about the stats just by viewing the results.

Let’s look backwards at these stats to see what we can learn about them. By “look backwards,” I mean start with the results to see what that tells us of the process. My own research over the years has caused me to compile team-level superstats whenever possible. In that spirit, the results we’re looking at are the full team numbers—hitting, pitching, and fielding—for 21st-century superstats Win Shares, Runs Above Replacement, and—most especially—the current default stat, Wins Above Replacement.

(In the case of WAR, we’re using the most recent version that came out a few days ago as a result of the Fangraphs-Baseball Reference agreement on where to set replacement level.)

1. Dividing fielding and pitching with Win Shares

Remember when Win Shares came out? My golly, what a big deal that was. When it first was released, the intriguing ideas were less in how Bill James tried to rate batting. James and others had already figured tons of that out. The intriguing parts were how it was going to try to separate pitching and fielding, a largely undiscovered country.

James had written about this in the New Historical Abstract that predated (and largely previewed) his Win Shares book. He spoke about needing to start at the team level with fielding and work your way from there. He was influenced by the work of Voros McCracken. This looked like it might be a big step forward in separating the difficult-to-disentangle pair of fielding and pitching.

The first thing I did when looking at the book was figure out the worst-fielding and worst-pitching teams of each decade. (No, I don’t know why I went that route. Must be the Cubs fan in me.) Anyhow, here they are. Tell me if you can spot a trend (we’ll ignore the strike-riddled 1981, 1994, and 1995 campaigns).

Decade	Worst pitching team	PWS	Worst fielding team	FWS
1900s	1904 WAS	21.3	1904 WAS	14.2
1910s	1915 PHA	17	1915 PHA	11.4
1920s	1928 PHI	36.3	1928 PHI	24.2
1930s	1935 BOB	30.4	1935 BOB	20.3
1940s	1942 WAS	34.2	1942 WAS	22.8
1950s	1953 DET	37.3	1953 DET	24.9
1960s	1962 NYM	37.6	1962 NYM	25.1
1970s	1974 SDP	40.2	1972 TEX	25.2
1980s	1984 SFG	39.8	1984 SFG	26.5
1990s	1998 FLA	22.9	1998 FLA	22.9

Nine times, the worst pitching team was also the worst fielding team. (Okay, fine, the 1985 Giants were actually tied with the 1985 Indians for worst fielders.) The only exception came in the 1970s when the 1974 Padres were the second-worst fielding team, behind the 1972 Rangers.

Actually, it’s even more ridiculous. Those 1972 Rangers lost eight games to a strike, and if you adjust for that, their 25.2 FWS becomes 26.5, tying them with the 1974 Padres for last. So the worst-pitching team was always also the worst-fielding team.
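The strike adjustment above amounts to pro-rating a shortened season to a full schedule. Here is a minimal sketch in Python, assuming a simple linear scale-up from the 154 games played to a 162-game schedule. The `prorate` helper is a name made up for illustration, not anything from the Win Shares book.

```python
def prorate(value, games_played, full_schedule=162):
    """Scale a season total linearly to a full schedule (an assumed, simple adjustment)."""
    return value * full_schedule / games_played

# 1972 Rangers: 25.2 Fielding Win Shares in a season shortened by eight games
rangers_fws = prorate(25.2, games_played=162 - 8)
print(round(rangers_fws, 1))
```

Rounded to one decimal, the scaled figure lands on the 26.5 quoted above.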

That strike anyone else as just a tad questionable? While Win Shares may have opened new doors in figuring out how to separate gloves from arms, the downside of being the first post-Voros attempt is that it is likely to be the one most in need of refining.

2. Win Shares: the last big moment of pre-internet sabermetrics

These days, it’s almost impossible to find Win Shares anywhere. If you want to know how many Fielding Win Shares the 2011 Rockies had, I don’t know where you’d find it. I assume some Win Shares info is at Bill James Online, but I’m just too cheap to plunk money down. This is a stat that got a lot of fanfare when it came out. It was, after all, an attempt at a Super Stat by Super Stat-Man Bill James himself, but it’s turned into an also-ran.

Ultimately, it’s turned into the last great event of pre-internet sabermetrics. Obviously, that’s not literally true, as the internet predates it and sabermetric books still come out. But a clear shift has occurred.

Back in the day, for a stat to be taken seriously, you had to kill trees to present it. If a stat existed solely on that newfangled, ephemeral internet, did it really matter? If it was in tangible form like a book, it was taken seriously.

Times, they’ve changed. Now a stat needs to be on the internet (and usually freely available there) in order to matter. People rely on WAR all the time because it’s so readily available. BOOM, it’s there for you. Is it the best stat? Hell, I dunno. If it’s good enough for Sean Forman, it’s good enough for me, I guess. But ultimately, even if it isn’t the best, it’s still the stat du jour.

Win Shares was the last big sabermetric thing to come out in book format. Man, bad timing. Nowadays, I can’t even find team-level Win Shares, and I haven’t been able to for years.

3. Runs Above Replacement Player

There’s less to say about this superstat. Like Win Shares and WAR, it has three parts—RARP, PRAR (pitching runs above replacement), and FRAR—for hitting, pitching, and fielding. Thus, it makes a nice contrast. RARP never had quite the heft of Bill James’ Win Shares nor the current prominence of WAR, but it was a big stat by Baseball Prospectus, and that gives it serious credibility.

After my experience with Win Shares, I checked this to see if it constantly had the worst-fielding team of the decade double up as the worst-pitching team. To RAR’s credit, that never happened.

That said, at least the strange parts of Win Shares could be understood. That wasn’t always the case with RAR. Here’s the best example, ranking the 1930 NL teams by PRAR:

Team	PRAR
STL	206
PHI	187
CIN	163
BOB	163
CHC	149
PIT	136
BRK	120
NYG	 94

The 1930 Phillies really shouldn’t be second. They’re arguably the worst pitching staff of all time. They set records for most hits allowed, worst ERA, etc. You name it, this staff was awful at it.

Sure, the Phillies played in a major hitters’ park, their defense was bad, and 1930 was The Year of the Hitter, but that only goes so far to explain things. They were worst in walks allowed, worst in home runs, and second-worst in strikeouts. Tom Ruane at Retrosheet gave a nice overview of their awfulness. Putting the Phils second out of eight would be like creating a stat that says the 1962 Mets were pretty good. Everything else, including everything else sabermetric, puts that staff in the toilet.

That’s the problem with RAR: when it goes wonky, it’s hard to understand.

Ultimately, this is a past tense issue. Team-level RAR info doesn’t seem to be available at B-Pro anymore. WAR has the field to itself these days. So let’s look at it.

4. Fielding and WAR

Now, for the great overlooked fact about WAR. As far as it’s concerned, there’s pretty much no difference between average fielding and replacement-level fielding. In fact, you can cut out the words “pretty much” from that last sentence. An average fielding team scores 0.0 dWAR.

Replacement level and average traditionally have two different meanings. In fact, Bill James pioneered work on replacement level in response to Pete Palmer’s linear weights, which were designed to give an average player no value. James countered that the 50th percentile players have considerable value. After all, an average team will have an average record, not a bad one.

But the 2,600-plus teams on file combine to have 1.3 total defensive WAR. That’s not 1.3 dWAR per team, it’s 1.3 for all 2,653 squads. And if you toss out the 19th century, it’s actually a little worse. Teams from 1900 onward score a total –0.6 dWAR.

Now, this isn’t actually a secret, and no one tries to hide it. It’s just that people often aren’t aware of it. The logic behind this approach is that there is no such thing as replacement-level fielding (or even replacement-level hitting) in and of itself. You have replacement-level players, not parts of players, and of course the glove has to go with the bat. Hitting matters more than fielding, so it’s a bigger part of determining who gets to play.

Ultimately, a great hitter will find his way into the lineup. It’s just that he’ll play at the shallow end of the defensive spectrum. But if a great glove truly can’t hit, he might last a while, but not for too long.

It doesn’t mean that WAR’s decision to center overall league fielding value at zero wins is a problem, but it helps at least to be aware of it.

5. Divvying up WAR

OK, if WAR gives all fielding virtually no value, then what does it do for pitching and hitting?

Well, here are the results for all 2,653 teams since 1876 (not counting the Union Association): 48,104.9 batting WAR, 33,432.0 pitching WAR, and the 1.3 fielding WAR.

That works out to 59 percent of all wins going to hitters, 41 percent to pitchers, and 0.0016 percent to gloves (that’s about 16 millionths).
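The split is easy to check from the three totals quoted above. This is a small Python sketch that uses only those figures. The dictionary layout and variable names are just for illustration.

```python
# Team WAR totals quoted above for the 2,653 non-UA teams since 1876
war = {"batting": 48104.9, "pitching": 33432.0, "fielding": 1.3}

total = sum(war.values())
shares = {part: value / total for part, value in war.items()}

for part, share in shares.items():
    # Print each share as a percentage with enough decimals to see the fielding sliver
    print(f"{part:8s} {share:.4%}")
```

Batting and pitching come out at roughly 59 and 41 percent, and fielding at roughly 0.0016 percent of the total.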

For comparison, Win Shares is 48 percent batters. Of the remaining 52 percent, pitchers get two-thirds, fielders the other third.

With RARP, it’s … different. For all teams, hitters make up 31 percent, pitchers 29 percent, and fielders 40 percent. Huh. To be fair, RARP varies widely from era to era. Cut out the 19th century, for instance, and fielding falls all the way down from 40 to 29 percent.

6. Replacement level and WAR

As noted up top, all WAR numbers are the newly updated ones, with Baseball-Reference and Fangraphs agreeing where to set replacement level. Want to know what that level is?

It’s a .297 winning percentage. If you take the actual win-loss records of all the non-UA teams since 1876 (which, of course, add up to .500: 200,893-200,893) and adjust them by WAR, the result is a .297 winning percentage (119,355-282,431). Over a 162-game season, .297 is a 48-114 record. That’s replacement level.

If you throw out the nineteenth century, it doesn’t change things much. From 1900 onward, replacement level is .294, which still rounds off to 48-114.

It’s around .295 in general, not for all teams. The 20th-century teams with the most extreme WAR-adjusted records are the 1909 Pirates and the 1962 Mets. The Pirates went 110-42 with 49.1 WAR, giving them an adjusted record of 60.9-91.1, which is .401. The 1962 Mets had 9.6 WAR, which transforms their dismal 40-120 record to an even worse 30.4-129.6 record (.190).
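The adjustment used in this section is just subtraction: take WAR away from the wins and add it to the losses. Here is a minimal Python sketch of that arithmetic. The `replacement_record` function name is made up for illustration, and the all-teams WAR figure of 81,538.2 is an assumption on my part: it is the sum of the batting, pitching, and fielding totals quoted in item 5.

```python
def replacement_record(wins, losses, war):
    """Subtract WAR from wins, add it to losses, and return the adjusted record and percentage."""
    adj_wins = wins - war
    adj_losses = losses + war
    return adj_wins, adj_losses, adj_wins / (adj_wins + adj_losses)

teams = [
    ("All non-UA teams", 200893, 200893, 81538.2),  # WAR total assumed from item 5's figures
    ("1909 Pirates", 110, 42, 49.1),
    ("1962 Mets", 40, 120, 9.6),
]
for name, wins, losses, war in teams:
    adj_w, adj_l, pct = replacement_record(wins, losses, war)
    print(f"{name}: {adj_w:.1f}-{adj_l:.1f} ({pct:.3f})")
```

A team with positive WAR always ends up with a worse adjusted record than its real one, which is why the 40-120 Mets drop to roughly 30-130.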

7. WAR weirdness with the 1991 Braves

Now for the strangest finding in all WAR-dom: it lists the 1991 Braves as the worst-fielding team in the 1991 NL.

Why is that so strange, you ask? Well, Win Shares claims they were the best-fielding team of the year. FRAR agrees. That’s the only time this happens. There are plenty of times two stats agree that a club has the best fielders or pitchers or batters in the league, but the 1991 Braves fielders are the only time the third stat puts them dead last.

And WAR puts those Braves well back, with –9.6 dWAR. That’s one of the worst scores of the decade.

It gets stranger still. This 1991 Braves defense has gotten attention previously in sabermetric circles. When Voros McCracken’s DIPS notion erupted, there was a big, well-received presentation at a SABR convention on the subject by Diamond Mind founder Tom Tippett. In his talk, Tippett looked at how defenses handled balls in play. He found that one of the greatest improvements in defense any team ever had was … the 1991 Braves.

Sure enough, they had an NL-best defensive efficiency mark of .714 just a year after a league-worst showing of .676. Based on McCracken’s work, Tippett argued that the massive improvement in defense helped the 65-97 1990 Braves become the 94-68 division winner the next year. Yet there’s WAR, calling this 1991 Braves team a historically bad defense.

Wait, then how can WAR explain their 29-game improvement? Simple: according to WAR, the Braves had the single biggest one-year improvement in pitching in baseball history: from 4.2 pWAR in 1990 to 30.6 in 1991, an increase of 26.4.

It gets a little stranger still. You see, WAR agrees with everyone else that the Braves defense improved dramatically; it just has the improvement occurring one year later. In 1992, their dWAR went up to 5.9, a rise of 15.5, the second-highest showing in history.

Does this make no sense to anyone else? By all accounts except for WAR, the 1991 Braves were one of the best defenses ever, and the team improved dramatically. WAR agrees that after 1991 the Braves defense was fine, but 1991 was historically horrible. Really? Something is up.

8. Most extreme improvements and collapses

Actually, the above can lead to a new question. What are the biggest improvements or declines a team ever had in pitching, fielding, or hitting according to WAR?

Well, remember how the 1992 Braves had the second-biggest fielding improvement (15.5)? That’s nowhere near the top slot. The 1980 A’s own this one, and how. After the terrible (54-108) 1979 A’s had a historically dreadful score of –11.5, the massively improved (83-79) 1980 A’s of Billy Martin had a league-best dWAR of 10.8. That’s an improvement of 22.3, roughly 44 percent higher than the runner-up. In fact, the 1979 club is the seventh-worst fielding unit in baseball history and the 1980 club third-best. Yowza!

The biggest defensive decline goes to the 1993 Padres. They were a good-fielding team in 1992, but their mark dropped by 17.4 in 1993. Their dWAR that year was –11.5, one of the worst scores of the decade.

We already covered how the 1991 Braves had the biggest pitching improvement (runner up: 2004 Rangers), but what’s the biggest decline? The 1968 Reds, who flopped by 26.4 pWAR from 1967. The real story here is the 1967 Reds, which WAR calls the third-best staff in history, at 32.0 pWAR.

In 1967, their fifth starter had an ERA+ of 99. A year later, that would’ve been second-best. Last year’s Phillies have the second-biggest decline by a staff. Then again, the story is the 2011 Phillies, who WAR calls easily the best pitching staff in baseball history. With 37.4 pWAR, they’re four-and-a-half wins over anyone else.

As it happens, a 1960s Reds team also has the biggest one-year decline ever in offense. The 1966 Reds were 23.1 oWAR off the pace of their 1965 club. Two things happened. First, the 1965 Reds had one of the great (and overlooked) lineups in history. All eight starters and the three top bench players had an OPS+ in triple digits, and they were almost all 120 or higher. While the talent was great, they were also playing over their heads. That was the first problem.

Second, the team figured it had batting to burn, and so Cincinnati traded a bat to Baltimore for a starting pitcher. That bat was Frank Robinson, who won the Triple Crown for the 1966 Orioles. Meanwhile, the Reds’ offense crashed completely, with only one starter posting an OPS+ over 103. Fortunately their pitching took off in 1967 … before crashing in 1968. Lots of Reds fans probably had whiplash after all that.

The biggest batting improvement belongs to the 1978 Milwaukee Brewers, which is right when the Harvey’s Wallbangers bunch began taking off. (The second-best improvement, incredibly, comes from the 1968 Reds. I told you 1960s Reds fans got a bad case of whiplash.)

9. Best/worst

All this leads to the real question: what does WAR consider to be the best and worst pitching, fielding, and hitting teams of all time?

Well, we already noted that the 2011 Phillies came in first on the mound with 37.4 pWAR. The worst club ever is the 1890 Pirates, whose –17.5 pWAR narrowly edges the infamous 20-134 1899 Cleveland Spiders’ –17.2 pWAR mark. Aside from them, the worst mark ever belongs to the 1995 Giants at –10.4. Gee, imagine how those Giants would’ve scored if 18 games hadn’t been wiped out by the strike.

The worst-fielding club is a 20th-century squad: the 1974 Cubs, at –14.4 dWAR. If you ever wonder why Rick Reuschel scores as well as he does in WAR (68.3 WAR, better than Jim Palmer, Bob Feller, or Juan Marichal), there you go. The 1991 Braves score as the 25th-worst fielding team of all time. Yeah, something is really wrong with that.

The best fielding team of all time, in a landslide, is Earl Weaver’s 1973 Orioles with Mark Belanger, Bobby Grich, and Brooks Robinson manning the infield. Their 13.6 dWAR is two wins better than the runner up, the 1969 Orioles. Man, those early Weaver clubs sure could field. (This also explains why Palmer rates lower than Reuschel in WAR). Those are the only clubs atop the 1980 A’s.

Only 10 clubs have negative offensive WAR. The worst is the 1914 St. Louis Federal League squad at –3.3 oWAR. The only team since 1920 with negative offensive WAR is the 1981 Blue Jays.

The top three offenses of all time are Babe Ruth/Lou Gehrig Yankee teams. Here they are:

Team	oWAR
1931 NYY	46.0
1927 NYY	44.2
1930 NYY	43.1

Then comes the 1976 Big Red Machine.

10. Most extreme teams

One last bit to get us to 10. What are the most extreme teams ever, those that rely most on one facet of the game?

For this, we have to limit it to teams with winning records. Otherwise, you get really horrible teams dominating the list. After all, if a club has negative WAR overall but is positive in one category, that club relies on that one item in a way no good team possibly can. Also, we’ll toss the 19th century because it was a very different game, and that era would end up dominating the list otherwise.

Among the 1,232 teams from 1900 onward with a record of .500 or better, the club most reliant upon pitching was the 1977 Cubs. Their arms accounted for 27.1 of the team’s 31.5 WAR. Bruce Sutter became a star for this team. Just three years removed from the worst defense ever, their fielding still sucked.

One other fun fact about this team. If you adjust run support for park and league run environment, Cubs ace Reuschel had the worst run support for any 20-game winner in the last 100 years. The Cubs scored 131 runs in his 37 starts, and while 3.54 runs/game doesn’t sound bad, Wrigley Field was an extreme hitters’ park, and 1977 was a huge year for hitters. Reuschel’s 3.54 runs/game was just 72 percent of what he should’ve gotten. No hitting, no fielding … no wonder the 1977 Cubs score so pitcher-rific.

At the other end are the 1964 Braves with just 1.3 of their 38.8 WAR coming from pitchers. Somehow they won 88 games anyway.

The .500 or better club most reliant on fielding was the 1984 Twins, as one-third of their WAR came from gloves. It breaks down at the other end, though, because nearly a third of the winning 20th-century teams have negative fielding WAR. The 1981 Indians come in last, –6.9 dWAR on a team with 21.7 overall WAR. The 2004 Yankees are the only 100-win team with negative fielding WAR.

Lastly, there’s hitting. Twice a team had a winning record despite its pitchers and fielders combining for negative WAR: the 1940 Pirates and 1922 Cardinals were carried to winning records by their bats. At the other end, the 1914 Federal League Buffalo team had a winning record despite negative offensive WAR. Since WWI, the least bat-reliant winning team was the 2003 Dodgers: 4.9 of their 33.0 WAR came from hitters.
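To make “reliance” concrete, here is one way to express it in Python: the fraction of a team’s total WAR that comes from a single facet. Treat this as a sketch under that assumption; the rankings above may have been built from a slightly different measure, and the `facet_share` name is made up for illustration. The numbers are the ones quoted in this section.

```python
def facet_share(facet_war, team_war):
    """Fraction of a team's total WAR supplied by one facet (pitching, fielding, or hitting)."""
    return facet_war / team_war

examples = [
    ("1977 Cubs pitching", 27.1, 31.5),
    ("1964 Braves pitching", 1.3, 38.8),
    ("1981 Indians fielding", -6.9, 21.7),
    ("2003 Dodgers hitting", 4.9, 33.0),
]
for label, facet, team in examples:
    print(f"{label}: {facet_share(facet, team):.1%}")
```

A share like this only behaves sensibly when total WAR is comfortably positive, which is one reason to restrict the list to teams at .500 or better.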


13 Comments
Dave Cornutt
10 years ago

That’s interesting about the 1991 Braves, considering that over the past year and a half before that season, they had made a number of moves specifically to improve the defense behind their young pitching staff.  I’ll have to look up some more numbers later, but they acquired guys like Terry Pendleton and Jeff Treadway specifically to improve the infield.  And at the time Ron Gant was a very good CF.

KJOK
10 years ago

Maybe I’m missing it, but are you using Fangraphs WAR or B-Ref War?  I’m guessing B-Ref WAR might give a lot of the run prevention credit for the 1991 Braves to the pitchers, while Fangraphs WAR will give more of it to the defense.

Chris J.
10 years ago

KJOK – There’s no such thing any more as Fangraphs WAR or B-ref WAR.  As of last week, they came to an agreement resolving their differences. B-ref updated their website’s numbers in response to this agreement & those are the numbers I used here.

KJOK
10 years ago

Chris:

you’re going to have to make that ELEVEN things you didn’t know…

They only came to an agreement ON THE REPLACEMENT LEVEL.  They still calculate pitching and defensive value totally differently….

Chris Jaffe
10 years ago

KJOK.  Huh.  Thanks for the info.  That truly is something I didn’t know, obviously.

Paul G.
10 years ago

Win Shares rating the worst pitching team as also the worst fielding team is not especially odd considering how it is calculated.  The first step is to divide the total WS between offense and defense, followed by the defense being divided between pitching and fielding.  Fielding’s percentage of total defense has hard limits of (roughly) 16% and 32% of the defensive share.  With those relatively narrow limits it would be difficult for another bad defensive team to swing ahead on either end unless it was roughly as bad.  But truly terrible teams tend to be unique in their horribleness.

Keep in mind that Win Shares fielding calculations are limited to mostly basic statistics: putouts, assists, errors, double plays, passed balls, and some somewhat more advanced things like catcher CS and the like when available.  He manipulates them to get better data than you might think was possible, but on the whole this is still a relatively crude assessment.  Being able to distinguish between a very good pitching staff with lousy fielders and a lousy pitching staff with great fielders (and everywhere in between) was a bit too much to ask.  Of course, it does ask the question if WAR is any better at the fielding analysis.

David P Stokes
10 years ago

Bill James Online apparently does still have Win Shares in their subscriber-only section.  They have a feature that lets you see a sample of their subscriber-only content, and it includes a Win Share chart for Jose Reyes.  It only goes through 2011.  I don’t know if that means that they quit doing Win Shares after that, haven’t posted the 2012 Win Shares yet, or just haven’t updated the samples available to non-subscribers for a while (I’m guessing the last one).

It only costs $3 a month to subscribe, so I’m considering it.

TR
10 years ago

When you think of WAR think about when we realized Alan Greenspan didn’t have a clue about where the economy was going.

Bob B.
10 years ago

Nice article. Some interesting stuff… And, although I’m not sure if you can get much on the TEAM level but you can access Win Shares (as well as Win Shares Above Bench and Baseball Reference WAR) at thebaseballgauge.com.

KJOK
10 years ago

Chris – you’re almost certainly not the only one…

The 1991 Braves question is a very interesting question.  My guess is that B-R WAR defense has some of the same issues that Win Shares does, in that it starts with the premise that the 1991 Atlanta great run prevention is pitching, then adjusts credit for defense.

Fangraphs WAR probably has 1991 Atlanta as a great fielding team, since its pitching calc is based on FIP, with everything else (LOB%, Other Luck, etc.) credited to the defense.

Atlanta had a 3.49 ERA but only a 3.63 FIP.  BABIP was .266 vs. league average of .281.

Rally
10 years ago

Defense WAR for the 1991 Braves is flat out wrong.  Usually, at the team level, it lines up pretty well with team DER.  But not in this case.

The problem in this case is the use of the project scoresheet hit location and batted ball type data.  This was one of the earlier years for project scoresheet, they got better and more consistent as the 90’s went on.

If I used the same Total Zone formulas used from the 50’s to the late 80’s, or if I had the better data from 2003 on, the 1991 Braves would look better.

If I had the time I’d fix it, which would require, for the sake of consistency, to redo the league.  Somebody else will have to do it though.

Detroit Michael
10 years ago

Yes, the Bill James Online website does maintain Win Shares data for subscribers.  For example, Jose Reyes’ page shows 2012 Win Shares data separated by hitting, fielding and pitching (with zeros for his pitching Win Shares obviously).

Another reason why Win Shares is not widely cited is that Bill James has been in the process of revising that superstat for years and years but got stalled in the middle of the project.  So even James essentially is saying that the old version of Win Shares on his website is not his best estimate of players’ value.

I don’t understand the last sentence in item 6 in Chris Jaffe’s article.  If the 1962 Mets have positive WAR as a team, that would translate their record to something better than a .297 winning percentage.

Chris Jaffe
10 years ago

Detroit: The Mets went 40-120.  If they had positive WAR (which they did) that would mean replacement level for them would be under 40 wins.  You subtract WAR from the wins and add it to the losses.