Thursday, May 06, 2010
Fantasy semantics
Posted by Jonathan Halket at 6:20am
If you've never taken a course in econometrics, I encourage you to do so if you have the chance. Actually, any course that teaches statistical methodology will do. Even if you never want to "crunch numbers," it will teach you how to think and read "probabilistically." Since nature, life and fantasy baseball are inherently random, understanding the semantics of probability is essential, doubly so if you're purporting to offer advice.
Things basic econometrics has taught me:
1. We can still say something about coin flips even though the outcome can be either heads or tails.
This may seem obvious, but I've seen educated people argue that just because we can't be sure about the outcome (or that "the outcome could be anything"), it is useless to talk about forecasts or numbers or statistics. Obviously not. We can still talk about which outcomes are more likely than others. We can still talk about the probability of outcomes. We can still say that the probability of heads is 50 percent and that the outcome of a fair die roll is more likely to be between two and five (inclusive) than it is to be either one or six.
2. In an ideal world, tell me everything—give me the probability of everything.
Let's say you meet God. God turns out "to play dice with the universe." He doesn't know whether Chipper Jones is going to retire midyear, but he does know the probability that it might happen. He knows his own dice.
Why ask God only how many home runs he "expects" (that is, the average number) Chipper to hit? Why not get more information from him? What's the probability that he hits fewer than five (tantamount to asking the probability that Jones gets injured)? What's the probability that he hits 10, 15, etc.? With this information, you'd have a better idea about how risky Jones is.
3. It can be very tempting to take shortcuts when writing.
Actually, I learned this writing about baseball. I like to keep my columns as simple as possible while still making my point. I try to avoid adverbs when possible (though I am rarely successful). Writing "probabilistically" without adverbs is difficult—words like "usually, probably, likely" are useful. I have the same problem with numerical information. Yes, in a perfect world, I would just give you my probability of everything when I talk about my forecasts for Chipper, but parsimony and limited attention spans demand that I give you only as much as I deem relevant and interesting.
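Lessons one and two can be made concrete with a few lines of arithmetic. What follows is only a sketch: the die is the fair die from lesson one, but the two home run distributions (`steady` and `risky`) are invented for illustration and are not forecasts for any real player. The point is that two players can share the same "expected" total while carrying very different risk.

```python
from fractions import Fraction

# Lesson one: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
p_two_to_five = sum(p for face, p in die.items() if 2 <= face <= 5)
p_one_or_six = die[1] + die[6]

# Lesson two: two hypothetical home run distributions with the same average.
# Keys are season home run totals, values are probabilities (made up).
steady = {18: Fraction(1, 4), 20: Fraction(1, 2), 22: Fraction(1, 4)}
risky = {2: Fraction(1, 4), 20: Fraction(1, 4), 29: Fraction(1, 2)}


def expected(dist):
    """Average home run total implied by the distribution."""
    return sum(total * p for total, p in dist.items())


def prob_fewer_than(dist, threshold):
    """Probability that the total falls strictly below the threshold."""
    return sum(p for total, p in dist.items() if total < threshold)


print("P(die shows 2 to 5):", p_two_to_five)
print("P(die shows 1 or 6):", p_one_or_six)
for name, dist in (("steady", steady), ("risky", risky)):
    print(name, "expects", expected(dist),
          "with P(fewer than 5) =", prob_fewer_than(dist, 5))
```

Both invented players "expect" 20 home runs, yet only one of them has a real chance of hitting fewer than five. That is the extra information you would want from the full set of probabilities.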
On the relationship between "experts" and readers:
The key is trust and establishing consistency. It is possible for one expert (say, Ron Shandler) to use mostly intervals and another (say, Derek Carty) to provide mostly point estimates. Intervals are kind of nice, but they require more disclosure. It is OK for Shandler to prefer to say (paraphrasing) "Miguel Cabrera is likely to have a home run total in the 30s" as long as we know what he means by "likely"—40 percent? 90 percent? Similarly, it is OK for Carty to say "Cabrera is projected to hit 37 home runs." If Carty gives us some interval around it, too ("standard error bands" in statistics speak), then Carty's statement is very similar to Shandler's even though they've used different words. (In fact, I am just sort of rephrasing what Carty wrote about on Tuesday. My problem with Shandler's recent writing is that he forgot a version of Lesson One above.)
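To see how a point estimate with an error band turns into an interval statement, here is a minimal sketch. It assumes the forecast error is roughly normal, and the standard error of 4 home runs is a made-up number; neither assumption comes from Carty or Shandler. Only the 37 is taken from the example sentence above.

```python
from statistics import NormalDist

# Assumed numbers for illustration only: 37 is the example projection,
# and the standard error of 4 is hypothetical.
point_estimate = 37
standard_error = 4

forecast = NormalDist(mu=point_estimate, sigma=standard_error)

# "In the 30s" means 30 through 39 home runs. The half-home-run padding is a
# continuity correction, since home run totals are whole numbers.
p_in_the_30s = forecast.cdf(39.5) - forecast.cdf(29.5)

print(f"P(30 to 39 home runs) = {p_in_the_30s:.2f}")
```

Under these assumptions the probability comes out near 70 percent, so "projected to hit 37, give or take 4" and "likely to be in the 30s" would be the same claim, provided "likely" means something close to 70 percent.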
Readers and writers need to come to a sort of tacit understanding about language. More often than not, writers are not going to give numbers for everything. If a Shandler-esque writer wants to say "Cabrera is likely to hit around 35 home runs" instead of giving lots more numbers, then he should be consistent about what his words mean. Approximately what does "likely" mean?
On arguments within the expert community:
What goes for communication between adviser and advisee goes doubly for these blogged exchanges between experts. It is very hard to champion your cause against another "expert" in a venue designed to still be accessible to the layman reader. Actually, it is very hard to do it in any venue.
In my next article I'll have more on the quants-versus-quaints debate (in case you can't tell which side I'm on), the tenor of which has actually been very good, I think. Many expert exchanges are not nearly as interesting, in part because one expert will say something semi-informative but mostly substantive like "It is good to use statistics to forecast how many home runs Cabrera will hit." And then the opposing expert will say something like "Statistical forecasts are always wrong. I prefer to go with my gut."
My problem with the second statement is that it is absolutely true but totally practically false. Forecasts are always wrong, but they are still incredibly useful. Most experts, even those who haven't taken econometrics, know this to be true. The more literally accurate statement, "I project that Cabrera will hit between 35 and 45 home runs with 95 percent certainty" would be more bulletproof to these kinds of flatulent responses, but all of those numbers are superfluous to the argument. At some point it would be better if some details could be taken for granted.
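For readers curious about what the "more literally accurate" statement carries, the sketch below unpacks the 35-to-45 interval into a point estimate and a standard error. It assumes a symmetric, roughly normal forecast error, which is my assumption for illustration and not something the statement itself guarantees.

```python
from statistics import NormalDist

# The interval and coverage are taken from the example sentence above.
low, high, coverage = 35, 45, 0.95

# Number of standard errors on each side of the mean for a central
# 95 percent interval under a normal curve (about 1.96).
z = NormalDist().inv_cdf(0.5 + coverage / 2)

point_estimate = (low + high) / 2
standard_error = (high - low) / (2 * z)

print(f"point estimate: {point_estimate:.0f} home runs")
print(f"standard error: {standard_error:.2f} home runs")
```

Read this way, the interval is a projection of 40 home runs with a standard error of roughly two and a half, which is exactly the sort of detail an argument between experts would ideally be able to take for granted.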

I was being totally sincere when I titled my article from two weeks ago. Many, many debates really boil down to nuanced differences between the way two “sides” use and interpret the use of a few key terms.
Of course, language, as wonderful and useful as it is, is woefully inadequate to capture the infinite complexity of human thought. And the chasm between the essence of my thought and my ability to communicate it is the source of many a disagreement. When I say “likely” I perceive it to mean something very specific within the context of how I am using it, but there are infinite possible interpretations of that term, and each reader will filter my language through his or her own prism to derive a unique but (usually, though not always) similar comprehension.
It is incumbent upon a writer to be aware of this, to do his/her best to minimize it, to expect it, and to be able to distinguish criticisms that address his/her fundamental premises from those that derive from a communication gap.
Some of my friends will joke with me about my being “an expert” because I have this column. I often reply that I am one of many of us who are highly adept at playing fantasy baseball, but that is only a prerequisite to being qualified to discuss something in a public forum. It is not that I play fantasy baseball better than most that qualifies me; it is that I can write about playing fantasy baseball better than most. (Or at least that’s what the THT editorial staff must have felt.)
This distinction is important in terms of the reader-versus-expert dynamic. Many readers judge the experts solely on the experts’ abilities to play fantasy baseball, asking, why him and not me? Many readers are unaware that being good at the game is only one component of being good at being an expert.
To bring it back to the quants and geniuses discussion, a semantic issue that deserves some serious examination is what defines something as being “a model.” A big part of the debate is that Liss thinks he’s doing something very different from what Bill’s model would try to do, while Bill argues that Chris is fundamentally attempting to do the same thing, but informally and sloppily so. Well, is Chris using a model? I dunno - depends on whether you define something as a model by its intent or its means.
Apologies to Bill Phipps: the name “quants” is going to stick even though you claim [correctly so] that the term doesn’t accurately reflect the actual composition of your group.