Diamonds in the graph

by Jacob Muskopf
February 1, 2011

Earlier this winter, The Hardball Times offered prospective fantasy baseball writers the opportunity to compete in a Hardball Times fantasy league. Entrants wrote fantasy baseball articles, the best of which would be chosen as our winner. While we could only choose one winner to play in the league (congratulations, Dave Chenok), we had so many great articles that we have decided to publish some of the best. This is one of those submissions.

If you’re like me, you probably:

use spreadsheets to rank players for your fantasy draft
include some form of projections in those spreadsheets
have players you’ll target and guys you won’t draft unless they happen to fall far enough
identify those in part by their projected value relative to Average Draft Position (ADP)
expect your leaguemates will be doing similar things
look for strategies to give you an edge

One strategy I’ve been exploring was inspired by observations from THT’s Paul Singman in this article. This strategy concentrates on using one’s own (chuckle if your name is Newman) projections along with current ADP for the upcoming season rather than actual stats and ADP from previous seasons.

The concept is to help visualize where value picks might be available in each category during your draft and where you might want to consider reaching. If you just want to look at the pretty graphs, scroll down until you see them. If you have a hunger for some nuts and bolts, read on, my robot friend.

To build such rankings, it’s best to convert rate stats (AVG, ERA, WHIP) to counting stats (xH [expected hits], xWH, xER) and generate standard scores for every player in each category to determine their projected values. This article was thorough and relatively easy to interpret for this purpose.
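For the curious, here is a minimal sketch of that conversion for batting average. It assumes one common formulation, in which expected hits are measured against what a pool-average hitter would collect in the same at-bats (xH = H - AB × pool AVG); my exact spreadsheet formula isn't spelled out here, and the player names and projections below are made up purely for illustration.

```python
from statistics import mean, pstdev

# Hypothetical projections (made-up numbers, not from the article).
hitters = {
    "Hitter A": {"AB": 600, "H": 186},
    "Hitter B": {"AB": 550, "H": 154},
    "Hitter C": {"AB": 400, "H": 124},
    "Hitter D": {"AB": 500, "H": 130},
}

def expected_hits(pool):
    """Hits above or below what a pool-average hitter would get in the same AB."""
    total_hits = sum(p["H"] for p in pool.values())
    total_ab = sum(p["AB"] for p in pool.values())
    pool_avg = total_hits / total_ab
    return {name: p["H"] - p["AB"] * pool_avg for name, p in pool.items()}

def standard_scores(values):
    """Standard score (z-score) of each value relative to the pool."""
    vals = list(values.values())
    mu = mean(vals)
    sigma = pstdev(vals)
    return {name: (v - mu) / sigma for name, v in values.items()}

xh = expected_hits(hitters)
z_avg = standard_scores(xh)
for name in hitters:
    print(f"{name}: xH {xh[name]:+.1f}, z {z_avg[name]:+.2f}")
```

Hitters A and C both project to hit .310, but A does it over 600 at-bats instead of 400, so A ends up with the higher xH and the higher standard score. That playing-time weighting is the reason for converting rate stats before scoring them.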

It helps to have at least one set of ADP values, and for this example I’ll be using those from Mock Draft Central. Once we have these, we’ll simply chart them against one another and watch the plot thicken. Since it’s a little early to use 2011 projections and ADP, we’ll look at what 2010’s numbers might have told us if I had thought of this earlier.

I have the numbers from last year’s sheets that catered to a specific league. Therefore, this example will plot the values of a pool of 108 drafted players for both hitting and pitching (a 12-team league starting nine hitters and nine pitchers) in the standard 5×5 scoring categories. For reference, here are the average projected stats and standard deviation (SD) for each.

HR: 22, 8.07
RBI: 84, 16.25
R: 84, 10.37
SB: 13, 12.01
AVG: .284, 9.17 (SD is for xH, rather than AVG; equates to roughly .016 AVG)
K: 103, 54.89
WHIP: 1.24, 7.77 (SD is for xWH, rather than WHIP; equates to roughly .04 WHIP)
ERA: 3.57, 4.42 (SD is for xER, rather than ERA; equates to roughly .19 ERA)
W: 7, 4.54
SV: 9, 13.42

Note: Starters and relievers were grouped together; to generate accurate values I believe they must be. Players with ADPs off the chart appear on the 0 ADP line.
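One way to prepare the points for such a chart, sketched under assumptions: each player becomes an ADP paired with a standard score, and anyone whose ADP falls off the chart is pinned to 0 as the note describes. The 12-team league size comes from the league described above; the function name, the player data, the ADP-to-round conversion, and especially the 240-pick chart limit are illustrative guesses rather than my actual spreadsheet logic.

```python
import math

TEAMS = 12               # league size given in the article
LAST_CHARTED_PICK = 240  # assumed chart limit; the article does not state one

def chart_point(adp, z_score):
    """Return (adp_for_chart, draft_round, z_score) for one player.

    Players with no ADP, or an ADP past the chart limit, go on the 0 line.
    """
    if adp is None or adp > LAST_CHARTED_PICK:
        return (0, 0, z_score)
    return (adp, math.ceil(adp / TEAMS), z_score)

# Hypothetical players: (ADP, home run standard score).
players = {
    "Slugger": (8.4, 2.1),
    "Mid-round bat": (101.7, 0.3),
    "Late flier": (260.0, 0.9),
    "Unranked": (None, -0.5),
}

points = {name: chart_point(adp, z) for name, (adp, z) in players.items()}
for name, (adp, rnd, z) in points.items():
    print(f"{name}: ADP {adp}, round {rnd}, z {z:+.1f}")
```

Feeding the resulting pairs to any scatter-plot tool, one chart per category, should give something close in spirit to the graphs discussed here.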

As one might expect, most of the projected elite home run hitters reside in the earliest rounds at bottom right, with a number of moderate contributors clustered shortly thereafter.

What one shouldn’t have expected last year was to find many guys projected to launch more than 22 homers—and almost none projected for more than 26—available after about the ninth round. With only about a dozen guys cresting 30 home runs, power was at a premium.

There seems to be a somewhat noticeable slope, with a fairly even distribution on the left side of the graph. Some speedsters who could hurt you in the HR category were still clear early targets, while a projection of 14 home runs or fewer appears to be the cutoff for many players to have gone undrafted.

Not surprisingly, RBI appear to follow a pattern similar to homers, though with a more defined slope. After about the eighth round, players with 100-RBI potential were likely nowhere to be found, and value picks were probably scarcer. On top of this, a cutoff of around 50 RBI at -2 SD means there was more opportunity to be hurt here late in drafts.

Our R chart (what, not pirate-like enough?) reflects a cutlass-like slope, with very few potentially runs-damaging players being taken early. Despite this, 95-run booty probably could have been had in the fifth round and 90 runs plundered as late as the 12th.

After that, however, one should not have counted on much opportunity for parley. Like RBI, the cutoff looks to be around 50, but at -3 SD this would be an even more severe shot below the waterline to your team (aargh, much better).
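The SD figures quoted for these cutoffs can be checked against the averages and standard deviations listed earlier. A quick sketch using only those published numbers (the function name is just for illustration):

```python
# Pool averages and SDs as listed in the article: (mean, SD).
POOL = {
    "HR": (22, 8.07),
    "RBI": (84, 16.25),
    "R": (84, 10.37),
}

def standard_score(category, projection):
    """How many SDs a projection sits above or below the pool average."""
    avg, sd = POOL[category]
    return (projection - avg) / sd

for category, cutoff in [("HR", 14), ("RBI", 50), ("R", 50)]:
    print(f"{category} {cutoff}: {standard_score(category, cutoff):+.2f} SD")
```

This prints roughly -0.99, -2.09 and -3.28. The same 50-unit floor costs about one more SD in runs than in RBI because the runs projections are bunched more tightly around the average.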

And now, for something completely different…Some significant stolen base contributors could have been ripe for the picking through round 14, with helpful options available throughout the draft. One also could have aimed for the ultra-elite early picks at the risk of putting all one’s eggs in the same bucket…or basket…container of your choice.

Even some modest contributors could have gone undrafted due to a cutoff above 0 SD; there was less risk of being hurt by late picks here. What risk there was probably related to accumulating enough stolen base producers to safely distance your team from your leaguemates’ without sacrificing too much in other categories.

Interesting. The 14 players projected to hit over .300 create a distinct shelf above the first four rounds. Meanwhile, a number of players who could come close to .300 exist through the 20th round. It’s possible this was one of those instances where one could have gotten better value from knockoffs than from spending big on brand names.

While there is a bit of a slope to be seen here, there doesn’t appear to have been a ton to worry about since our cutoff is right around 0 SD. Similar to steals, I’m guessing any risk was about making sure not to rely too heavily on empty batting average.

Because using combined standard scores for starters and relievers via xER weights the value of ERA by innings pitched, this should illustrate a more “true” value. As is often preached among fantasy circles, not many pitchers were expected to go very early, and this graph seems to support this strategy.
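To see how the innings weighting plays out, here is a sketch assuming xER is defined as the earned runs a pool-average pitcher would allow in the same innings minus the pitcher's own projected earned runs. That definition and the two pitchers are assumptions for illustration; only the 3.57 pool ERA comes from the averages listed earlier.

```python
POOL_ERA = 3.57  # average projected ERA from the article's list of averages

def expected_earned_runs(ip, era, pool_era=POOL_ERA):
    """Earned runs saved (positive) or cost (negative) versus a pool-average pitcher."""
    return ip * (pool_era - era) / 9

# Hypothetical pitchers: a closer with a better ERA over far fewer innings.
starter_xer = expected_earned_runs(ip=200, era=3.20)
closer_xer = expected_earned_runs(ip=65, era=2.60)

print(f"Starter xER: {starter_xer:+.1f}")
print(f"Closer xER:  {closer_xer:+.1f}")
```

Under these made-up projections the starter saves more earned runs than the closer despite the higher ERA, simply because he throws about three times as many innings.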

Projected ERA value varied quite a bit. Some positive contributors should have been free late, and the cutoff was above 0 SD. As we know, pitchers also come with some built-in risk mitigation: there is more flexibility, and more need, to play matchups with them.

This graph is almost identical to that of ERA. However, projected WHIP value varied even more wildly (pun intended) and the cutoff was just about 0 SD. There was seemingly a hint of more opportunity to do harm with this stat.

Here we’ll notice a separation between starters and relievers. The cluster at the bottom left is presumably closers, likely to be drafted beginning at the tail end of round six. Most of the remaining dots form a well-defined slope, with a good number suggesting late positive potential. The cutoff here pushes toward the negative again, though, so there probably should have been some urgency to stock up on strikeouts.

Again, we have a visible starting pitcher/relief pitcher separation and noticeable slope. Because of this, one may have been tempted to follow a strategy similar to the one for the strikeout category above. This is where it’s handy to know that wins are one of the least predictable stats and to avoid putting undue emphasis here.

Last, and possibly least depending on one’s preference, are saves. Obviously, only closers and “closers in waiting” are going to have any real positive value here. More than likely we would have fallen either in the camp that drafted them between the sixth and 18th rounds or the camp that waited to take fliers in the last rounds and/or scoured the wire.

I believe these graphs, customized with your projections and league settings along with current ADP, could be a useful visual aid for 2011 draft prep. If we go so far as to pick out individual dots to identify potential value picks and overrated players, they could be even more so, but that is a discussion for another time.

3 Comments

Jacob Rothberg

13 years ago

Crikey! This guy makes me glad I didn’t apply for this. Anything I wrote compared to this would seem ill-researched and jejune.

Dave Chenok

A very interesting and well-researched article! I like the use of statistics. I wonder, though, whether this ends up being a complex way to analyze something that could be done more simply—and how useful the graphs would prove to be as you’re fumbling around on draft day.

The market is relatively efficient, and ADP (like share price of a stock) factors in intangibles that may get masked when disaggregating individual statistics.

Keeping the germ of the good idea, a simpler way to approach this might be to plot a player’s overall expected projected contribution given league scoring rules—ie, in a 5×5 league, index each projected scoring stat relative to a league average and sum the indices to get a projected contribution score—and to plot THAT against ADP. That would show whether, considering a given player’s expected overall contribution, they are going where they ought to. This is kind of like plotting ADP vs. Projected player rating, but by quantifying the player rating piece the assessment is clearer than it would have been with a purely ordinal ranking. And it simplifies things, as there is only one graph to look at (plot of ADP versus expected contribution).

But I really like the premise and the approach.

Jake in Columbus

@Jacob: If it’s any consolation, I had to look up jejune. Admittedly, my analysis of the graphs is less scientific than could be. I put the most effort into generating the spreadsheets, then plotted the data based on them to get the overall impression.

@Dave: Thank you. The main thought was not to use the graphs themselves during the draft, rather to identify trends to keep in mind when ranking players manually (using the standard scores as a starting point). That probably should have been clearer.

I agree, a graph of overall player values would be a good idea as well; especially for those in H2H leagues as opposed to roto. Given an available venue, I’d be happy to generate overall graphs from the data above and/or those for 2011 (in a few weeks once my updated spreadsheets are completed). Individual player assessment becomes another matter altogether though.
