Wednesday, April 28, 2010
Capacity building 101: so, tell me more about this database
Posted by Derek Ambrosino at 5:32am
In last week’s column, I mentioned the idea of major fantasy sports providers instituting a census function and keeping a database of league performances. Most of those who reacted to the idea agreed that this would be a useful tool, so I thought I’d expand on the idea a bit and flesh out how it might work and what I think it should provide. I’ve had this idea for a while (I assume others have too), so I’ve given it some thought.
First of all, I think this should be an optional feature. A league’s commissioner would choose whether he wants to enroll the league in the program. In fact, I’m inclined to think that only private leagues should be able to opt in. My reason for this is that I want to keep the quality of the data as high as possible. The nature of this data dictates that it is most likely to be used by serious fantasy sports participants, so I’d like to have at least some filtration of the data. I want to minimize the amount of data coming from leagues where somebody drafts Matt Kemp in the fourth round and a third of the teams aren’t even rotated on a regular basis. So, my broad sweeping assumption is that private leagues are generally of a higher quality than public leagues.
By opting in, your league’s settings are recorded and the system begins banking data about rosters, drafts and scoring. When using the database, the user would just input the preferred format in a series of drop-down menus: player universe, roto vs. head-to-head, number of teams, roster size/starting positions (there’s probably more variance here in pitcher starting roster set-up than batting, so I’m inclined to just have the system not distinguish between SP and RP and simply ask for the number of active pitching slots), etc.
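To make the drop-down idea concrete, here is a minimal sketch of how a league format could be reduced to a lookup key. Everything in it (the `LeagueFormat` class, the `make_format` helper, the field names) is hypothetical and not any provider’s actual schema; the only design choice taken from the column is that SP, RP and generic P slots are collapsed into one count of active pitching slots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeagueFormat:
    """Hypothetical lookup key for grouping comparable leagues."""
    player_universe: str   # e.g. "mixed", "AL-only", "NL-only"
    scoring: str           # "roto" or "head-to-head"
    teams: int
    active_hitters: int
    active_pitchers: int   # SP, RP and P slots combined


def make_format(player_universe, scoring, teams, active_hitters, sp=0, rp=0, p=0):
    """Build a key, collapsing SP/RP/P slots into one pitcher count."""
    return LeagueFormat(player_universe, scoring, teams, active_hitters, sp + rp + p)


# Two leagues that differ only in how they label their pitching slots
# (illustrative settings, not real league data).
split_slots = make_format("mixed", "roto", 12, 14, sp=5, rp=4)
generic_slots = make_format("mixed", "roto", 12, 14, p=9)

print(split_slots == generic_slots)
```

Because the key is frozen and hashable, both leagues would land in the same bucket if the database grouped its records by format.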
On a side note, I really don’t understand why there’s an option to differentiate SPs and RPs in the first place and I encourage everybody who will listen to me to set up their leagues such that all the pitching slots are simply Ps. I mean, this is a totally artificial distinction; “starting pitcher” and “relief pitcher” are not real positions. Nothing is to stop a team from having nine guys pitch one inning each, so I don’t see why a fantasy league would issue mandates that owners own a minimum number of different types of pitchers. It’s no more sensible than having slots designated for righties and southpaws. OK, guys, rant over.
Anyway, here are a few things I think the system should track and why those things might be useful for the fantasy universe to know.
Scoring. I think this is the most important vein of information to be gained from this hypothetical tool. Here are a few important questions that we could gain insight into:
- How many points/what record does it take to win the average league of your size/structure?
- How is the scoring distributed? Are some categories more clustered relative to others (and the overall supply of those stats)?
- Are there patterns about the relative strengths and weaknesses of good-performing and poor-performing teams?
- If I’m aiming for the 10 out of 12 across the board strategy, what benchmarks should I be shooting for per category?
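The benchmark question in the last bullet could be answered with something like the sketch below. It assumes standard roto scoring, where the best total in a category earns as many points as there are teams, so “10 out of 12” means finishing third in the category. The function name, the data layout and the numbers are all made up for illustration, and ties are ignored.

```python
from statistics import mean

# Categories where a smaller number is the better result.
LOWER_IS_BETTER = {"ERA", "WHIP"}


def points_benchmark(leagues, category, target_points):
    """Average category total that earned `target_points` roto points.

    `leagues` is a list of leagues; each league is a list of per-team
    dicts mapping category name to final total.
    """
    marks = []
    for teams in leagues:
        # Sort best-first: descending, unless lower is better.
        values = sorted(
            (team[category] for team in teams),
            reverse=category not in LOWER_IS_BETTER,
        )
        # With N teams, N points is index 0, N-1 points is index 1, etc.
        marks.append(values[len(values) - target_points])
    return mean(marks)


# Toy data: two four-team leagues with invented totals.
toy_leagues = [
    [{"HR": 250, "ERA": 3.60}, {"HR": 230, "ERA": 3.90},
     {"HR": 210, "ERA": 4.10}, {"HR": 190, "ERA": 4.40}],
    [{"HR": 260, "ERA": 3.50}, {"HR": 240, "ERA": 3.70},
     {"HR": 200, "ERA": 4.00}, {"HR": 180, "ERA": 4.50}],
]

hr_mark = points_benchmark(toy_leagues, "HR", 3)
era_mark = points_benchmark(toy_leagues, "ERA", 3)
print(hr_mark, era_mark)
```

In a real 12-team data set, calling the function with `target_points=10` for each category would give the per-category targets for the 10-out-of-12 strategy.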
Player ownership. Perhaps there isn’t much to be gained from learning things like which players were most often found on championship teams and who was found on losing teams, but paired with some draft information it could be worthwhile to know these things.
Draft info. Some owners’ player acquisition strategies are driven very heavily by positional scarcity. Is, for example, forgoing first basemen earlier in the draft in favor of middle infielders a strategy common to winning teams? Of course, establishing this database would also allow the providers to publish their own ADP data.
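For what it’s worth, ADP (average draft position) is about the simplest thing such a database could produce. A rough sketch, assuming each recorded draft is just an ordered list of player names; the function name and the placeholder players are hypothetical, and a player is averaged only over the drafts in which he was actually picked:

```python
from collections import defaultdict
from statistics import mean


def average_draft_position(drafts):
    """Map each player to the mean overall pick at which he was taken.

    `drafts` is a list of drafts, each an ordered list of player names
    with the first overall pick first.
    """
    picks = defaultdict(list)
    for draft in drafts:
        for pick_number, player in enumerate(draft, start=1):
            picks[player].append(pick_number)
    return {player: mean(numbers) for player, numbers in picks.items()}


# Two tiny invented drafts with placeholder names.
toy_drafts = [
    ["Player A", "Player B", "Player C"],
    ["Player B", "Player A", "Player C"],
]

adp = average_draft_position(toy_drafts)
print(adp)
```

A provider would presumably want to compute this separately for each league format rather than across all drafts at once.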
There are some problems with this proposal, I’m aware. One of the main questions is whether there are too many junk leagues that will muck up the data. This is a question I’m not really sure about. I do feel like the majority of people who I meet randomly and start talking about fantasy baseball with seem to have no idea what the hell they are talking about. (If I had an agent, I assume he’d advise me against sharing this opinion with the public, as a fantasy baseball columnist.) A minimal step to try to improve the quality of the data would be to keep public leagues out of the process, but beyond that, I’m not sure how to “gatekeep.”
Another potential problem might be the myriad subtle variations of league and roster structures that could slice the data really thin. Again, I’m not certain this is a substantial problem, just that it has the potential to be. I think negating the distinction between SPs and RPs is a good first step. Perhaps you could ignore distinctions in bench spots and only focus on active roster spots, if the data started getting cut thin. But these would be kinks to work out once the evaluators get to see what they are working with. At the very least, it would be interesting to see how popular different league formats really are.
Got what you think is a fairly easily implemented idea for a valuable tool to advance the analysis of fantasy baseball, or suggested additions to or criticisms of mine? Let’s hear ’em.
Derek Ambrosino aspires to one day, like Dan Quisenberry, find a delivery in his flaw. You can send him questions, comments, or suggestions at digglahhh AT yahoo DOT com.