The Roto Grotto: average averages and comparing rate stats

Counting statistics are relatively easy to compare to one another. Once you know how many fantasy points they are worth, how many it typically takes to earn a specific number of roto points, and how many will be available across all of baseball in a season, you can compare counting stats to each other with appropriate context.

Rate statistics are more difficult to handle because they are really two stats in one: a standard counting stat and the number of opportunities for that counting stat. For example, batting average is a rate statistic composed of hits, a counting statistic, and at bats, the opportunities for hits.

As with other counting stats, hits can be more or less valuable for your team depending on their context. If you are one hit away from tying another team’s total on the last day of the season, then one hit is tremendously valuable. If you are far from both the nearest team ahead of you and the nearest team behind you, then one hit will be less valuable. However, every hit is a positive event.

In contrast, the opportunity event is always a negative event, a fact that requires a bit of framing to understand. With hits held constant, every additional at bat lowers a batting average. Yes, a .300 average is more valuable over 600 at bats than over 300 at bats, assuming a .300 average will increase your team average. However, that is because of the additional hits, not the additional at bats.
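To make that framing concrete, here is a minimal Python sketch. The team totals (1,300 hits in 5,000 at bats) and the helper names are hypothetical, chosen only for illustration; the .300 lines over 300 and 600 at bats are the ones from the example.

```python
# Hypothetical team totals before adding the player: 1,300 H in 5,000 AB (.260).
TEAM_HITS = 1300
TEAM_AT_BATS = 5000


def team_average(hits, at_bats):
    """Batting average: hits divided by at bats."""
    return hits / at_bats


def with_player(player_hits, player_at_bats):
    """Team average after adding one player's hits and at bats."""
    return team_average(TEAM_HITS + player_hits, TEAM_AT_BATS + player_at_bats)


base = team_average(TEAM_HITS, TEAM_AT_BATS)
avg_300 = with_player(90, 300)            # .300 over 300 at bats
avg_600 = with_player(180, 600)           # .300 over 600 at bats
hitless = with_player(0, 300)             # 300 at bats with no hits
same_hits_more_ab = with_player(90, 600)  # the same 90 hits over 600 at bats

for label, avg in [
    ("team before", base),
    (".300 over 300 AB", avg_300),
    (".300 over 600 AB", avg_600),
    ("0 for 300", hitless),
    ("90 for 600", same_hits_more_ab),
]:
    print(f"{label:<18} {avg:.4f}")
```

Under these assumed totals, the 600-at-bat line lifts the team average more than the 300-at-bat line, while at bats added without hits only pull it down.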

I could calculate both hits and at bats as a percentage of league totals, as I did with the counting stats. The problem is that the positive value of one hit does not simply offset the negative value of one at bat. A batter who produces one hit per three at bats, a .333 average, is among the best in baseball.

I can, however, still calculate the league average, and then use it as a benchmark for comparison. Here is the batting average of all non-pitchers over the last three seasons:

Season  Average  StdDev
2010    .261     .026
2011    .259     .028
2012    .258     .031

League average has declined slightly over that span, from .261 in 2010 to .258 in 2012. I also included the standard deviation of the batting averages of players with at least 300 at bats in those seasons, which has been close to 30 points (.030) of batting average in each season.
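For readers who want to reproduce numbers like these from their own data, here is a rough Python sketch. It rests on assumptions the article does not confirm: that league average is pooled (total hits divided by total at bats), and that the standard deviation is an unweighted population standard deviation over qualified batters. The (hits, at bats) lines are made up for illustration and are not real player data.

```python
from statistics import pstdev

MIN_AT_BATS = 300  # qualifying threshold mentioned in the text

# Hypothetical (hits, at_bats) lines; not real player data.
batters = [(180, 600), (150, 550), (120, 500), (80, 320), (30, 100), (5, 40)]


def league_average(lines):
    """Pooled average: total hits over total at bats (an assumption)."""
    total_hits = sum(hits for hits, _ in lines)
    total_at_bats = sum(at_bats for _, at_bats in lines)
    return total_hits / total_at_bats


def qualified_averages(lines, min_at_bats=MIN_AT_BATS):
    """Individual averages for batters who reach the at-bat threshold."""
    return [hits / at_bats for hits, at_bats in lines if at_bats >= min_at_bats]


def qualified_std_dev(lines, min_at_bats=MIN_AT_BATS):
    """Unweighted population standard deviation of qualified averages (an assumption)."""
    return pstdev(qualified_averages(lines, min_at_bats))


print(f"league average: {league_average(batters):.3f}")
print(f"std dev (300+ AB): {qualified_std_dev(batters):.3f}")
```

Restricting the standard deviation to batters with at least 300 at bats keeps small-sample lines, such as the 5-for-40 batter in the made-up data, from inflating the spread.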

With the league average and standard deviations, I can calculate the Z-score of a specific player’s batting average. A Z-score expresses how far a value sits above or below the mean, measured in standard deviations: subtract the mean from the value and divide by the standard deviation. A Z-score of 1 is one standard deviation above the mean, while a Z-score of -1 is one standard deviation below the mean.

Here are the batters whose 2012 averages were closest to each whole number of standard deviations from the league average:

Player        Average  Z-score
Ryan Braun    .319      2
Ruben Tejada  .289      1
Mark Ellis    .258      0
Mike Napoli   .227     -1
Carlos Pena   .197     -2
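The table can be checked with a few lines of Python. This sketch uses the 2012 league average (.258) and standard deviation (.031) from the article along with the five listed averages; the function and variable names are just illustrative choices.

```python
LEAGUE_AVG_2012 = 0.258  # 2012 league average from the article
STD_DEV_2012 = 0.031     # 2012 standard deviation from the article


def z_score(value, mean, std_dev):
    """Distance of a value from the mean, in standard deviations."""
    return (value - mean) / std_dev


# 2012 batting averages as listed in the table.
averages_2012 = {
    "Ryan Braun": 0.319,
    "Ruben Tejada": 0.289,
    "Mark Ellis": 0.258,
    "Mike Napoli": 0.227,
    "Carlos Pena": 0.197,
}

z_scores = {
    name: z_score(avg, LEAGUE_AVG_2012, STD_DEV_2012)
    for name, avg in averages_2012.items()
}

for name, z in z_scores.items():
    print(f"{name:<14} {z:+.2f}")
```

The unrounded values for Braun and Pena land just inside two standard deviations, which fits the table's description of batters closest to each whole deviation rather than exactly on it.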

A player with a high Z-score will have a correspondingly high average. Z-score is useful because it lets you compare different statistics that are measured on different scales. Jeffrey Gross explains it well in his article from a few years ago on his auction-pricing model. I’ll hit on a lot of those same points in the coming weeks, and I will try to apply some of those principles of draft preparation to in-season strategy.


3 Comments
johnnycuff
10 years ago

good stuff.  looking forward to more.

not sure i understand your reasoning about leaving ABs out of the picture, though.  after some discussion about positive and negative events (can you clarify that a bit for me?) that i don’t quite follow, you come up with a method that says felix hernandez (who went 1 for 3 last year) is a better batting average contributor than ryan braun.

to measure the impact of rate stats, why not include the ABs compared to league average as well as the AVG compared to league average, similar to ERA+ or OPS+?

johnnycuff
10 years ago

cool.  this is something i’ve been thinking about and tried a few things on, so i’ll be interested to follow.

Scott Spratt
10 years ago

Hey, Johnny.  I definitely agree with you.  I probably could have been clearer in my language, but I plan to continue to address this topic and will build in an at-bat adjustment.  I didn’t want to try to cram too much into a single post since this is an extended look at a complicated topic.