Friday, February 22, 2013
Too many stats spoil the broth
Posted by Derek Ambrosino at 3:06am
I recently received an email from a reader whose shallow home league uses a 12 x 12 roto scoring system. The reader is frustrated and suggests that this system doesn’t produce a champion that reflects the best overall team or owner because the mish-mash of categories undermines the league’s integrity. Not surprisingly, some of the previous league champs disagree and think this system is fantastic. So, the reader contacted us to help settle this debate.
Here are the categories used for batters and pitchers in this league.
Batters—R, H, 1B, 2B, 3B, HR, RBI, SB, BB, K, E, AVG
Pitchers—W, L, CG, SHO, SV, BB, K, HLD, ERA, WHIP, QS, BSV
Let me start off by top-lining my conclusion here: unequivocally, this is a bad scoring system, if you ask me. It looks more like a Bingo card than a consciously developed system attempting to address a set of underlying principles.
Yes, there should be a set of principles underlying a scoring system. Here are a couple of ideals that you should strive for:
- Stat categories should measure actual underlying skills.
- Skills that are important to helping teams win baseball games should be credited (for batters, at least the offensive skills).
- Stat categories should not result in further skewing positional values; modifications from standard scoring should strive to mitigate imbalance, and not exacerbate it.
- Rostering a single player should not unduly tilt a team’s odds of winning a category or the league.
- Avoid inefficiently counting the same achievement multiple times. For example, don’t have OBP, SLG, and then also have OPS.
Knowing that these are not always fully achievable, and that some folks decidedly prefer non-cookie-cutter settings, let me offer a few thoughts on how I would modify this scoring system.
This league is essentially counting every type of hit, as well as walks. I’m not sure what the specific rationale is for offering doubles and triples as separate categories, or, further, for counting total hits and then counting each type of hit as its own category. Why should we specifically reward a player who hits a ton of singles? So, for the combination of H, 1B, 2B, 3B, BB, and E I think there are a few possible consolidations that make sense.
Combine hits and walks either as on-base percentage or as times reached base. Likewise, combine doubles and triples (along with homers, which will also remain a standalone category) into either slugging percentage or extra-base hits.
This is also a shallow league, so finding mere playing time isn’t something worth rewarding. Therefore I’d lean toward rate stats and go OBP and SLG. If this were a deeper or NL- or AL-only league, it might be interesting to go with times reached base and extra-base hits instead. Honestly, either way would qualify as an improvement.
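For anyone who wants to see how those two rate stats fall out of the counting stats this league already tracks, here is a minimal Python sketch. It uses the standard definitions of OBP and SLG. The function names and the season line are made up for illustration, and hit-by-pitches and sacrifice flies, which this league doesn't track, are assumed to default to zero.

```python
def on_base_percentage(hits, walks, at_bats, hit_by_pitch=0, sac_flies=0):
    """Standard OBP: times on base divided by the plate appearances that count."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / denominator


def slugging_percentage(singles, doubles, triples, homers, at_bats):
    """Standard SLG: total bases divided by at-bats."""
    if at_bats == 0:
        return 0.0
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


# Hypothetical season line, not a real player.
line = {"AB": 500, "1B": 90, "2B": 30, "3B": 5, "HR": 25, "BB": 60}
hits = line["1B"] + line["2B"] + line["3B"] + line["HR"]

obp = on_base_percentage(hits, line["BB"], line["AB"])
slg = slugging_percentage(line["1B"], line["2B"], line["3B"], line["HR"], line["AB"])
print(f"OBP {obp:.3f}, SLG {slg:.3f}")  # prints "OBP 0.375, SLG 0.530"
```

Notice that six of the league's original categories (H, 1B, 2B, 3B, HR, BB) feed into just these two numbers, which is the whole point of the consolidation.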
Strikeouts are only marginally worse than any other out (and better than a double play), so I’m not sure they should be uniquely penalized; remember, outs are already penalized in any offensive rate stat, and in any counting stat as a wasted opportunity. However, strikeouts and walks do represent defense-independent at-bat outcomes, so I can see an argument for keeping them in some capacity. I’d suggest BB/K ratio.
Why are errors included? Middle infielders will make more errors and first basemen and outfielders fewer. Is there some sort of bias inflating the value of middle infielders that the introduction of errors is an attempt to correct? I don’t think so. Therefore, I see no objective rationale for including it among the categories.
While the homer is guilty of double-counting (it’s a hit, an RBI, and a run), it is the most important play in baseball and it is defense-independent. Let’s leave it in. Runs is flawed, but it too can stay. RBIs can stay too, but I have a feeling the league likes to have some interesting wrinkles. Since this is a shallow league, separating good players from great players is something to strive for. So, if possible, an interesting modification might be to go for RBI percentage: the percentage of runners in scoring position driven in by a player. This mitigates mere opportunity a bit and rewards those who come through most often when in such a position.
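To make the RBI percentage idea concrete, here is a rough Python sketch of the definition given above: runners in scoring position driven in, divided by runners in scoring position the hitter came up with. The function name and both hitters are hypothetical, and whether your league host can actually track this category is something you would need to check.

```python
def rbi_percentage(risp_driven_in, risp_total):
    """Share (in percent) of runners in scoring position that a hitter drove in."""
    if risp_total == 0:
        return 0.0  # no opportunities, so nothing to credit
    return 100.0 * risp_driven_in / risp_total


# Two made-up hitters: one with more raw production, one with fewer chances.
cleanup_hitter = rbi_percentage(risp_driven_in=60, risp_total=300)
seventh_hitter = rbi_percentage(risp_driven_in=45, risp_total=180)

print(cleanup_hitter, seventh_hitter)  # prints "20.0 25.0"
```

The hitter with fewer runners driven in grades out better here because he converted a larger share of his chances, which is the opportunity effect the category is meant to dampen.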
Also, to cater to the crowd that likes to be a bit off the beaten path, we can use net steals instead of raw steals.
So, that leaves our offense with seven categories—OBP, SLG, BB/K, R, HR, RBI%, SB-CS.
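The two simplest derived categories in that list, BB/K and SB-CS, can be sketched in a few lines of Python. The names are my own, the players are invented, and the handling of a hitter with zero strikeouts is just one possible convention that a league would have to agree on.

```python
import math


def walk_to_strikeout_ratio(walks, strikeouts):
    """BB/K. With zero strikeouts the ratio is undefined, so this sketch
    treats any walks as an infinite ratio and no walks as zero (an assumption)."""
    if strikeouts == 0:
        return math.inf if walks > 0 else 0.0
    return walks / strikeouts


def net_steals(stolen_bases, caught_stealing):
    """SB-CS: steals minus times caught."""
    return stolen_bases - caught_stealing


# Made-up runners: the reckless one steals more bags but nets fewer.
reckless = net_steals(stolen_bases=30, caught_stealing=12)
selective = net_steals(stolen_bases=20, caught_stealing=1)

print(walk_to_strikeout_ratio(60, 120))  # prints "0.5"
print(reckless, selective)               # prints "18 19"
```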
That's not exactly standard, but pretty fundamentally sound.
This means we have to cut five categories from the pitching side of the equation.
The first thing that strikes me is the inclusion of wins, losses, and quality starts. The inclusion of quality starts is itself recognition that wins are highly flawed, so there is no need to count both. I’d recommend merging these three categories into one: either quality starts or net wins (W-L).
On the relief pitcher side, I’d look to either eliminate blown saves (really one of the stats that tells us the least) or once again go the net route and combine two categories in saves minus blown saves. I’d also get rid of holds. For one thing, they are a bad stat, and for another, while I appreciate the idea of giving elite set-up men value more similar to elite closers, this league is just too shallow to ask owners to have to mine more groups of specialty players for production. (Note that using blown saves in any way without holds will hurt the value of non-closer relievers.)
ERA and WHIP can stay. I often like to combine Ks and BBs into K/BB, but since we are seeking a 7x7 league, we can measure each of the DIPS categories individually, either counting-wise or rate-wise. I’d suggest K/9, BB/9, and HR/9.
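For reference, the three per-nine rates are simple to compute once innings are converted to outs. Below is a small Python sketch. It assumes innings arrive in the usual box-score notation, where "6.1" means six and one-third innings; the function names and the pitcher's line are invented for illustration.

```python
def innings_to_outs(innings):
    """Convert box-score innings such as "6.1" (6 and 1/3) into outs recorded."""
    whole, _, partial = str(innings).partition(".")
    partial_outs = int(partial) if partial else 0
    if partial_outs not in (0, 1, 2):
        raise ValueError(f"not box-score innings notation: {innings!r}")
    return int(whole) * 3 + partial_outs


def per_nine(count, outs):
    """Rate per nine innings (27 outs). Returns None when no outs were recorded."""
    if outs == 0:
        return None
    return 27 * count / outs


# Hypothetical pitcher: 180 innings, 180 strikeouts, 40 walks, 20 homers allowed.
outs = innings_to_outs("180.0")
k9 = per_nine(180, outs)
bb9 = per_nine(40, outs)
hr9 = per_nine(20, outs)

print(k9, bb9, hr9)  # prints "9.0 2.0 1.0"
```

Working in outs rather than decimal innings avoids the classic mistake of treating "6.1" as 6.1 innings instead of 6 1/3.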
I’d also drop complete games and shutouts altogether, as they are too rare and unpredictable to be discrete categories. Plus, such games tend to be really well pitched, so they are already well-rewarded across the spectrum of pitching categories.
And that gets us to seven—W-L, ERA, WHIP, SV-BSV, K/9, BB/9, HR/9.
The reader’s email concluded by asking:
I have unsuccessfully argued that there are so many categories it does not determine a statistical significant champion because so many stats are accumulated that there is a statistical "noise" created, and thus randomness is the actual "champion."
My question: Am I correct in this statement? Am I crazy? Does this system actually determine the best champion?
I can’t answer this question completely objectively. I haven’t, for example, run the rosters of previous seasons of his league against a set-up with a “better” scoring system to see if they produce the same champion. But I don’t think it is a stretch to say that, at the very least, the current scoring system lacks an objective rationale behind its formation and is far from ideal.
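If someone in the league did want to run that comparison, the mechanics are not complicated. The Python sketch below scores the same set of team totals under two different category lists using ordinary roto ranking (best team in a category gets N points, worst gets 1, ties split). Everything here is an assumption for illustration: the function names, the three teams, and their numbers are made up, and the result says nothing about the reader's actual league.

```python
def roto_points(team_stats, categories):
    """team_stats: {team: {category: value}}.
    categories: {category: True if a higher value is better}.
    Returns total roto points per team; tied teams split the tied places' points."""
    n = len(team_stats)
    totals = {team: 0.0 for team in team_stats}
    for category, higher_is_better in categories.items():
        ordered = sorted(team_stats, key=lambda t: team_stats[t][category],
                         reverse=higher_is_better)
        i = 0
        while i < n:
            # Extend j to cover every team tied with the team in place i.
            j = i
            while (j + 1 < n and team_stats[ordered[j + 1]][category]
                   == team_stats[ordered[i]][category]):
                j += 1
            # Place p (0-based, best first) is worth n - p points.
            shared = sum(n - p for p in range(i, j + 1)) / (j - i + 1)
            for team in ordered[i:j + 1]:
                totals[team] += shared
            i = j + 1
    return totals


def champion(team_stats, categories):
    totals = roto_points(team_stats, categories)
    return max(totals, key=totals.get)


# Made-up season totals for three hypothetical teams.
team_stats = {
    "A": {"H": 1500, "1B": 1100, "HR": 180, "OBP": 0.320},
    "B": {"H": 1450, "1B": 950, "HR": 240, "OBP": 0.345},
    "C": {"H": 1400, "1B": 1000, "HR": 200, "OBP": 0.330},
}
cluttered = {"H": True, "1B": True, "HR": True}
consolidated = {"OBP": True, "HR": True}

print(champion(team_stats, cluttered))     # prints "A"
print(champion(team_stats, consolidated))  # prints "B"
```

In this toy example the singles-heavy team wins when hits and singles are both counted, and loses once they are folded into OBP. Feeding in real rosters from past seasons, with the full 12 x 12 and 7 x 7 category lists, would be the honest way to test whether the champion changes.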
I’ve turned down invitations to leagues simply because I thought the scoring systems were deeply flawed. I can say that, on that basis, I would not accept an invitation to a league with the original 12 x 12 scoring system.
Derek Ambrosino aspires to one day, like Dan Quisenberry, find a delivery in his flaw. You can send him questions, comments, or suggestions at digglahhh AT yahoo DOT com.