Most debates really boil down to semantics

So, the fantasy universe, or at least a small subset thereof, has recently been somewhat abuzz over a debate between Rotowire/RotoSynthesis’ Chris Liss and some of the poker pros/options trader folks in the CardRunners league. Derek Carty has already graciously offered a series of links that allow us rubberneckers to brief ourselves on the debate. Always up for sticking my nose where it doesn’t belong, I’d like to offer my take on things and fan the flames like Suge Knight at the Source Awards.

Essentially the debate comes down to this: Bill Phipps and his ilk think that the fantasy gaming community has not optimized pricing of stat lines. Chris Liss feels that even the best projection models are so inaccurate at predicting such stat lines that the ability to translate those stat lines into dollar values with perfect accuracy is either minimally advantageous, or the source of illusory confidence, which only serves to reinforce the cognitive dissonance that bad projections can somehow be corrected by superlative valuation. Liss feels there are too many inputs and the game is too organic and the fantasy draft too dynamic for a model to be ideal. Instead he prefers to trust his ability to “thin slice” on the fly after copious preliminary cramming sessions. Phipps (and Carty) fail to understand how optimizing an additional tool couldn’t help. Of course, Phipps and crew have not revealed their model, so any assertions about what inputs and variables it does and does not attempt to address are presumptuous.

I actually think that this discussion is not as complicated as it seems to be. But more importantly, I also think it is a bit premature, as we lack the tools to really test either approach under the rigor of either the scientific method or the weight of tens of thousands of repetitions. Finally, I think the question of who is right may actually depend on how one defines success. There’s also some cultural baggage involved about the “expert” archetypes that is probably at play here too.

So, let me get to work at jumping into a gun fight among strangers with a pocket knife. I think I’ll make my points in the form of questions. (It seemed to intimidate Jeopardy contestants to the extent that the format is no longer even followed!)

Does it confer an advantage to have a tighter model for converting stat lines into dollars?

This one seems self evident—of course it does.

Liss argues that the advantage is marginal because the inputs are flawed in the first place and, perhaps even more importantly, the environment in which you use them is dynamic. The available player pool (and ergo the supply of different stats from different positions) changes.

To me, neither of these counterarguments actually refutes the premise that it is ideal to have as accurate a translation tool as possible. Liss argues that he doesn’t need a translational tool because the whole process is fluid, like speaking a language in rapid, unscripted conversation. Well, I don’t fully buy that.

I have a fairly robust and unconsciously competent command of the English language, but that doesn’t mean that I never have trouble expressing myself as accurately and articulately as I wish, especially within the context of rapid, unscripted dialogue. And I wouldn’t dream of sitting down to write a dissertation without the full arsenal of reference books (translation: aids) at my disposal.

A tool that can accurately reflect the value of a composite stat line is valuable, even if that stat line has a component of uncertainty and that value is set in a vacuum that doesn’t fully reflect the evolving dynamic of the draft room. The point to remember here is that this is just a starting point. Liss’ point about the dynamism of the draft process is important, but all that means is that if you want to optimize your odds when playing blackjack, you should know the raw odds of your hand beating the dealer’s (partially hidden) hand and should be counting cards, too.

Sure, fine, so how does having an accurate pricing model in a vacuum preclude Phipps from counting cards as well, and modifying the model to reflect the ever-changing supply? It doesn’t. Owner A just took somewhere between 55–75 stolen bases out of the pool by buying Ellsbury; all remaining lines therefore get adjusted. You could either do this in your head, or through a computer. I fail to understand why the human brain has any intrinsic advantage at modifying the value of the remaining players on the fly.
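The kind of on-the-fly adjustment described here can be sketched in a few lines. This is only an illustration of one common simplification (rescale every remaining player’s baseline value so that the values sum to the money still unspent); it is not Phipps’ model, which has not been revealed. The player names, baseline values and dollar figures are all hypothetical.

```python
def reprice(pool, money_left):
    """Rescale baseline values so they sum to the money still in the room."""
    total_value = sum(pool.values())
    scale = money_left / total_value
    return {name: value * scale for name, value in pool.items()}


def after_purchase(pool, money_left, name, price):
    """Remove a bought player and reprice everyone who is still available."""
    remaining = {n: v for n, v in pool.items() if n != name}
    return reprice(remaining, money_left - price)


# Hypothetical four-player pool whose baseline values sum to the $100 left to spend.
pool = {"Speedster": 40, "Slugger": 30, "Closer": 20, "Utility": 10}

# Someone overpays: $50 for a player the baseline valued at $40.
adjusted = after_purchase(pool, money_left=100, name="Speedster", price=50)
print(adjusted)
```

Because the buyer overpaid in this made-up example, every remaining player gets cheaper; an underpay would push prices up. A fuller version would also track category supply, such as the steals that just left the pool.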

The real question here is whether the owner using the computer model is aware of how reliable (or unreliable) the projections are and has sufficiently corrected for that in his model.

How much advantage does a better conversion/translation tool confer during the draft?

Well, this is the question of whether Phipps is a good card counter or not. Or, to add another analogy here, having the best feel of what raw materials should cost will not necessarily make you the most efficient builder.

So, in addition to the elastic supply and demand in a fantasy draft/auction setting, one must assemble value in the correct way. The efficiency of correct pricing can quickly evaporate if you make errors in estimating how much of each material you need to build your structure. In a fantasy league, margin of victory is meaningless, so the holy grail is to get maximum value distributed with maximum efficiency, which is to win every category by a single unit. However, more likely is that you wind up with surplus bricks but are woefully short in mortar; at that point it doesn’t matter if your whole lot of materials is appraised for more than anybody else’s lot, or if bricks are more valuable per unit than mortar. The surplus bricks are worthless to you and unless you can turn them into something else, you lose.
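To make the bricks-and-mortar point concrete, here is a minimal sketch of rotisserie-style scoring, in which each category awards points by rank only. The three teams and their totals are invented for illustration, and the sketch assumes no ties within a category.

```python
def roto_points(teams):
    """Award 1..N points per category by rank (ties are not handled)."""
    points = {name: 0 for name in teams}
    categories = next(iter(teams.values())).keys()
    for cat in categories:
        # Lowest total first, so the category leader earns the most points.
        ranked = sorted(teams, key=lambda name: teams[name][cat])
        for pts, name in enumerate(ranked, start=1):
            points[name] += pts
    return points


# Hypothetical totals: team X hoards home runs (bricks) and is short elsewhere.
teams = {
    "X": {"HR": 350, "SB": 60, "SV": 40},
    "Y": {"HR": 250, "SB": 120, "SV": 80},
    "Z": {"HR": 240, "SB": 130, "SV": 70},
}

standings = roto_points(teams)
print(standings)
```

In this made-up league, X and Y have the same raw total across the three categories, yet X finishes last: winning home runs by 100 earns exactly as many points as winning them by one.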

The point here, and I don’t think either of the parties would dispute it, is that even with better translational skills, other knowledge gaps can drastically mitigate the (marginal in the first place, according to Liss) benefits derived from optimized pricing. To say this another way, to retain the advantage of the optimized pricing model, either the model must have the capacity to process this dynamism built in, or its user must be able to thin slice these developments as well as Chris.

To be fair to Liss’ argument, it bears repeating that he does not fully deny the advantage of optimized pricing, though he is skeptical. He just thinks that such an advantage is small in relation to the other knowledge areas from which one could derive an advantage. And that’s why the poker pros aren’t yet ready to compete with the fantasy pros; they’re scraping the margins of the math game, while the “genius” drafters are honing their ability to predict sea changes in specific commodities. And, frankly, in an important sense, I agree.

How do you beat an expert gamer in a math game?

Let’s start with one basic premise of game theory and one fact about computer programming, neither of which are fields in which I’d consider myself an expert. Game theory dictates that you never want to alter your play in a manner that will cause a poorly playing opponent to react in a way that will probabilistically improve his play. Computer programmers note that one of the most difficult things to program a computer to do is to generate random numbers; what often appears to be a random string is really just a very small string within a much longer string, for which a pattern exists.

And, while we’re at it, let’s throw in the good ole Voltaire quote: “The perfect is the enemy of the good.”

Now, I will offer my theory about how I would go about playing one of these poker pros, in poker, were I given the opportunity. I’m a decent card player. I don’t play at casinos much, but I have a fairly good grasp of math, probabilities, and can do fairly complicated computational math in my head … even while drinking scotch—alcohol tolerance is an unheralded skill for the informal poker night warrior. Basically, I’m just good enough to know how badly everybody else at the table is playing and complain about it. Yeah, tons of fun, I am!

But I would not attempt to play like this were I to play out a discrete trial with a table of experts. By playing my best, all I would be doing is ensuring that I am playing an inferior version of the game that they are playing. Instead, what I would do is try to make myself seem like a bigger wild card while still retaining a semblance of objectively correct strategy. Sure, I’m mitigating my own capabilities, but I’m mitigating my opponent’s even more because he/she has more to lose by minimizing things like the ability to predict my hand by the betting patterns I make.

To continue the strained analogy and bad metaphor theme, what I’m doing is attempting to turn this basketball game into a 3-point shooting contest, which is the way Cinderellas most commonly knock off high seeds. I may think I’m actually a better post player (the percentage play) than I am a 3-point shooter, but the gap between my post skills and my opponent’s is greater than the gap in our 3-point shooting skills.

In a way, what Chris is saying is don’t leave me open uncontested from behind the arc because I will knock ‘em down all day if you don’t put a hand in my face. I’ll take the vig (lower percentage shot) for the trade off that I get to pick my shot and get open (you don’t go the extra dollar), and if I’m shooting as well as I normally do, you’re going to have to make a whole lot of turn-around seven-footers to outscore me. Further, you’ll actually miss some of your twos as well because I’m defending the paint while you are not even bothering to defend the line.

Liss’ argument is that the fantasy pros’ inside game is good enough that the quants don’t have room for a huge advantage; meanwhile, Liss and crew will easily outshoot the poker pros from 3. And, as tempted as I might be to put my money on a single quant versus any of the single geniuses, when you break them into fields, I think the odds are that one of the geniuses wins more often than not. More on this in a minute.

How do we judge who performs the best in a fantasy league, and what is the goal when developing your team?

Now, we get to my real question.

Two of the analogies mentioned throughout this debate were chess and the stock market. These were chosen as examples to reflect subjects for which processing power was the linchpin in figuring things out and where the inputs were just so numerous and diverse that a model was nearly impossible to build at all. There’s one other difference between chess and the stock market, one which fantasy baseball actually marries though, resulting in a difficult-to-resolve question.

In chess, you are playing a one-on-one game, in which you must simply beat your single opponent; there’s one winner and one loser. To succeed in chess you must win way more often than you lose. In the stock market, you are really competing with the field to call yourself successful. You don’t ever need to have the biggest day, or week, or month of any other trader, you just need to win more than you lose and be consistently profitable over time. In fantasy baseball, you play against the field, but there is only one winner. This dynamic affects the appetite and rational tolerance for risk.

I don’t think a model-based approach will necessarily be risk-averse. In fact, I think a good model will aim to be risk appropriate. But, as long as a quant is competing against a field of “genius” sharps, it seems plausible that nearly every season several of the geniuses will take on what objectively derived and rational models will deem too much risk and one or more will hit on a bunch of those picks and win by outperforming the market. My biggest fear about the quant approach is that it’s a path to being a perennial runner-up.

I have no doubt that the quants will “get their money in good” with high frequency even right off the bat. They could do this just on the strength of math even if they didn’t know much about baseball. But that doesn’t mean they will win the league outright with any consistency. If you are playing against an opponent who sees a second-place finish and a last-place finish as the same exact thing, how do you consider that in a model?
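A toy simulation can illustrate the “perennial runner-up” worry. Everything here is an assumption made for the sake of the sketch: season results are drawn from normal distributions, the quant gets a slightly higher average with low variance, and each of 11 “genius” opponents gets a slightly lower average with high variance. None of these numbers comes from real leagues.

```python
import random


def simulate(seasons=5000, geniuses=11, seed=0):
    """Compare one low-variance quant against a field of high-variance drafters."""
    rng = random.Random(seed)
    quant_titles = genius_titles = 0
    quant_finish_sum = genius_finish_sum = 0
    for _ in range(seasons):
        quant = rng.gauss(102, 5)                  # assumed: better mean, tight spread
        field = [rng.gauss(100, 15) for _ in range(geniuses)]  # assumed: wide spread
        tracked = field[0]                         # follow one particular genius
        # Finishing place = 1 + number of opponents with a better season.
        quant_finish = 1 + sum(score > quant for score in field)
        genius_finish = 1 + (quant > tracked) + sum(s > tracked for s in field[1:])
        quant_titles += quant_finish == 1
        genius_titles += genius_finish == 1
        quant_finish_sum += quant_finish
        genius_finish_sum += genius_finish
    return {
        "quant_title_rate": quant_titles / seasons,
        "genius_title_rate": genius_titles / seasons,
        "quant_avg_finish": quant_finish_sum / seasons,
        "genius_avg_finish": genius_finish_sum / seasons,
    }


results = simulate()
print(results)
```

Under these particular assumptions the quant tends to post the better average finish while any single genius takes home more titles, simply because someone in a high-variance field usually runs hot. Change the assumed means and spreads and the picture can change with them, which is rather the point.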

I question whether the genius drafter and the quant are actually competing strategies or discrete paradigms, one being a road to perennial contention but smaller margins for over- or underperformance, and the other being more volatile in terms of range of outcome but more anecdotally successful. And this is the question that raises the elephant-in-the-room meta-point: how do we judge success in the fantasy baseball arena?

If Liss and Phipps were to play out 20 seasons (pretending that is a statistically significant sample size), and Liss had five championships with an average finish of 3.8 while Phipps had only two championships but an average finish of 3.2, who would be the better player? Phipps has the higher batting average, Liss the better slugging percentage.
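One way to see why the answer depends on the yardstick is to score the same two records under different prize structures. The season-by-season finishes below are hypothetical; they were constructed only to match the stated totals (five titles and a 3.8 average versus two titles and a 3.2 average), and the two payout schedules are likewise invented.

```python
def total_payout(finishes, payout):
    """Sum winnings over seasons; payout maps finishing place to dollars."""
    return sum(payout.get(place, 0) for place in finishes)


# Hypothetical 20-season records that match the stated totals.
liss = [1] * 5 + [2] + [4] * 5 + [5] * 5 + [6] * 4   # average finish 3.8
phipps = [1] * 2 + [2] * 8 + [3] * 2 + [5] * 8       # average finish 3.2

winner_take_all = {1: 100}
top_three = {1: 50, 2: 30, 3: 20}

for label, payout in [("winner-take-all", winner_take_all), ("top three", top_three)]:
    print(label, total_payout(liss, payout), total_payout(phipps, payout))
```

With these made-up records, winner-take-all favors Liss 500 to 200, while the top-three split favors Phipps 380 to 280. The records did not change; only the definition of success did.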

Until we answer that question, I’m not sure we can form viable opinions on the relative merits of the genius and quant strategies. Perhaps some insight lies in what the market wants out of its experts. From whom would you rather take advice, the guy who consistently exploits the market inefficiencies and beats it on the margins, or the guy who swings for the fences and connects more often than the other sluggers but still makes more wrong decisions than the consistent margins guy? Rationally, I think we want the quant. But, culturally, I think we romanticize the genius.

Back to the topic at hand for a second, I think one of the intriguing questions here is whether the quants remain fully agnostic as they nurture their genius tendencies. The poker analogy is kind of like reading players: “You know, I think that guy is bluffing and I can tell not by his betting patterns, but by his body language.”

Certainly, card-playing quants are open to integrating that form of insight, so why wouldn’t they be open to saying, “You know what, Justin Upton has shortened his stride this preseason and it is really helping him handle those pitches on the outer half, and since the statistical projection model doesn’t factor that input, I think his baseline is actually his 70th percentile season based on their projections, so I’m going to bump the price I’m willing to pay for him.” (Totally made up scouting evaluation by the way; I know nothing about Upton’s stride length.)

Anyway, my biggest question in this overall debate is how are we to know who is right? Certainly, who wins this single league—one trial that takes seven months to complete—is not really telling of anything. How many times would quants and geniuses have to play out a single season before the results have meaning? So, until we have simulators that can simulate the minds of five guys like Phipps and five guys like Liss, play out thousands of drafts before a single season and mimic their in-season managerial styles, how do we separate luck from skill? Even further, we’d then have to do the same year after year to determine whether the trends in the first set of trials were due to variance, and if so whether there’s any trend within the variance that either player may be consciously or unconsciously exploiting.

I believe it was Mike Podhorzer, who popped in on Derek Carty’s article, who once wrote a piece back at Fantasy Generals about what constitutes an “expert,” and performance was not one of the metrics he used, quite correctly I think. Sure, an expert will outperform the mathematical probability of winning his league over the long haul, but that bar is low and the trials one completes, even in a lifetime of fantasy gaming, are relatively few. If I play a 12-team league for 24 years and win three times, do I pass that bar? Was that my skill or luck? Instead, Mike focused on factors like intellectual independence, internal consistency of reasoning, etc. as criteria. So, while the genius vs. quant debate is fascinating, let’s remember that experts exist in both camps and that a few seasons worth of anecdotal performance will provide very little insight into the relative merits of the approaches.
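The 12-team, 24-year question can be put in numbers with a short binomial calculation. The sketch assumes that a manager with no skill edge wins any given season with probability 1/12 and that seasons are independent; both are simplifications.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance that pure luck produces three or more titles in 24 seasons.
chance = prob_at_least(3, 24, 1 / 12)
print(round(chance, 3))
```

Under those assumptions, roughly one manager in three would collect three or more titles on luck alone, so that record by itself says very little about skill.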

There is one tool that I wish would be developed that would help advance our ability to test some of our theories, or even just to add perspective. I wish Yahoo, ESPN, CBS Sportsline and the other main fantasy sports providers adopted a census option of sorts that would feed and build a database. When you set up your league, you can choose whether you want the data tabulated as part of the census, and what that feature would do is record your league’s settings and bank its results with leagues with identical settings. So, therefore you would develop a database of mixed 14-team leagues with this exact roster structure. Users could then search that database to find things like what categorical benchmarks you’d have to target to aim for 11s across the board.

At one point in the quant vs. genius debate, the question came up about the stratification tendencies of categories. Do home runs tend to cluster relatively tighter than steals? I don’t know; I can only look back at my past leagues’ standings and guess. But, if I have access to the aggregate data of hundreds of thousands of leagues played with the same settings, it’s much more likely that the trends that emerge are meaningful.
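As a sketch of what such a database would let us check, the coefficient of variation (standard deviation divided by mean) is one simple way to compare how tightly two categories cluster. The standings below are invented for a single eight-team league and prove nothing about real leagues; the point is only the shape of the calculation.

```python
from statistics import mean, pstdev


def relative_spread(totals):
    """Coefficient of variation: spread of team totals relative to their mean."""
    return pstdev(totals) / mean(totals)


# Hypothetical final category totals for one league (not real data).
home_runs = [310, 295, 288, 280, 271, 265, 252, 240]
steals = [190, 160, 150, 131, 118, 102, 90, 75]

hr_spread = relative_spread(home_runs)
sb_spread = relative_spread(steals)
print(round(hr_spread, 3), round(sb_spread, 3))
```

With one league, a difference like this could easily be noise; averaged over many leagues with identical settings, it would start to mean something.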


14 Comments
db
13 years ago

It seems to me that the best way to win in fantasy is to recognize players who can significantly exceed their projections and avoid the ones who will disappoint.  I don’t believe that a slightly better model in player pricing can substitute for the key portion, player analysis.  Overall, based on the wisdom of crowds, the general pricing or live draft results probably do as good a job of valuing as a refined model.  The experts are more likely to figure out whether Nelson Cruz or Josh Hamilton or Shane Victorino is a better risk for the same price and adjust accordingly, even without knowing an exact value of a steal or a home run.

Mike Podhorzer
13 years ago

Fantastic, fantastic article. But I did want to point out that I believe the article you thought I authored was actually by Patrick DiCaprio. We share the same philosophy though, so I would agree that one should focus on the process, as opposed to the results.

I disagree with db above that the “general pricing and live draft results” do a good job of valuing players. Every year there are numerous players that are under or overvalued due to their values being based almost solely on the previous season’s performance. I also believe that when one moves further away from the standard 12-team mixed league with 23-man active rosters, pricing becomes more and more inefficient, at which point a model would generate an increasing advantage.

Though I agreed with the majority of what you wrote Derek, there were a couple of things I wanted to comment on. I will do so when I get the chance…

this guy
13 years ago

I’m not reading this long post. The title is very true.

Your debate boils down to league efficiency. The higher the efficiency of your league, the greater the required degree of precision.

I doubt there is a league on earth where a $1 degree of precision is required. I don’t think you can find 10-14 people who are smart enough to require this, who are willing to spend their time on fantasy baseball.

Matt Levy
13 years ago

Derek, great article.  Your analogies to card-playing, chess and the market are spot-on and completely validate your arguments.

I particularly love the idea of the big sites tracking what similar leagues do and quantifying some of the information out there.  At the very least, it would be interesting to see.

Rudy Gamble
13 years ago

I second the idea of having the large sites provide analytics on their leagues.  We’re sponsoring about 25 leagues this year and hoping that will provide the basis for some directional analysis – particularly comparing the expected finish (based on predicted stats) vs. actual finish and whether there are certain players that are more/less likely to be on a championship team.

Nathan Smith
13 years ago

This is almost certainly the gambler coming out in me, but your last question is one reason why fantasy baseball is best played for money.  Then we have an easy way to compare whether five wins and an average place of 3.8 is better than two wins and an average place of 3.2: who won more money?

Derek Ambrosino
13 years ago

Nathan,

I agree that fantasy baseball is best played for money. In fact, I’ve argued many times that, like poker, the game can’t actually even be played with any integrity unless it is played for money.

But, the thing is, you could set up pay scales that would reward either type of play, so the respective takes shouldn’t answer the question, so much as the underlying philosophy about what type of success to privilege should dictate the structure of the pay scales.

Nathan Smith
13 years ago

Derek,

But now I’m no longer so compelled that this is properly a problem.  We can set up leagues with various rewards; it’s not at all clear that some are straight-forwardly better than others.

See, now the question isn’t ‘How do we judge who did better in a specific situation?’ but ‘How should we set up the rules that determine how well each person did?’  There are bad ways to answer that question; here’s one: distribute the prize money equally among all competitors, regardless of finish.  But nobody does that. 

Among the prize-money distributions that people actually use, I can’t see very good reasons to pick one over the other beyond personal preference for certain considerations.  For example, winner-take-all has a certain purity that appeals to me, but I don’t think that’s any kind of objective reason to always play winner-take-all.  The question is analogous to “should people play chess or poker?”  The answer seems to be, some should play chess, some should play poker, some should play both.  Why can’t we give that answer here?

Aaron
13 years ago

The pay scale methodology would be able to measure individual managers who were highly successful (given a statistically significant number of trials) but not tell which strategy is better since the payout structure will inherently reward one strategy more than it rewards others.  Perhaps that’s another way of saying that there’s no way to tell whether a quant or a genius is “better” because there is no standard for fantasy competition.  There are points league and rotisserie and keeper leagues and redraft leagues.  Just like there is limit Hold ‘em and NHL and PLO and Omaha 8, different players are better and different strategies are closer to optimal for different games. 

I guess I’m really just agreeing with the main premise that until we decide how to measure success in fantasy baseball how can we really argue about the optimal way to achieve it?

Opening up the databases at the main hosting sites would be a huge first step, though.  Anyone start a letter writing campaign?

Aaron
13 years ago

What about this, create 16 standard teams, either computer generated or with specific traits in mind, i.e. team A is weak in speed, team B punts saves, etc.  Then make the teams public and try to get as many leagues as possible to use those standard teams.  Managers would auction for the entire team instead of player by player and whatever is left after the auction becomes their free agent budget.  Given a large enough sample size we’d immediately be able to measure what is valued by managers since they just paid for it and we’d also be able to measure what pays off in the real world.  If the Save Punters end up winning 18% of the leagues but the Weak Speeders only win 3% that’s a pretty strong indication that saves are less important than speed. 

While we can never run repeatable trials for baseball standardizing the starting conditions would certainly be a good step forward.

eric kesselman
13 years ago

I’m a big believer in some sort of scoring system. I don’t think it matters if it’s money or if it’s points, as long as people all compete as if it’s a lot of money :)

I agree the question becomes ‘how should you set up the prize structure?’ but I also don’t see problems with that as long as it’s a reasonable approach. You would also inevitably have leagues with different scoring systems, and if the same people tended to win in all of them you can start to say ‘they’re good’ instead of ‘their strategy just happens to be well suited for this prize structure’

Either way, having something quantifiable is always a good first step when you want to argue over valuations of performance.

Derek Ambrosino
13 years ago

Eric,

I hope I don’t botch this point too badly, though I can see it coming… The question your comment raises, IMO, is when does a quant become a genius?

It seems to me that the whole quant philosophy is geared toward the more equitable pay out structure – you do a good job, you should do okay for yourself regardless of whether some dolt or super risky bettor wins the lottery. Meanwhile, I think the genius philosophy is better suited for a structure closer to winner-take-all… and that’s partly why it’s romanticized.

Once you can value each unit of production precisely, the question then becomes who can predict better and who can get better return on their investment. So, this becomes something of a balance of risk vs. reward. If you optimize your risk, based on the model, but somebody else takes greater risk, do you have enough potential reward quotient to triumph if the riskier bettor makes good calls? …That is, must you adapt your model for the fact that others are going to take “too much risk” and therefore have to take more risk than the model dictates to protect yourself against the fact that if any one of those genius drafter, higher risk types, “gets hot,” you’re going to lose?

…And, if you do, do you cease to be a quant, regardless of how you arrive at your selections? Say you realize that you must take extra risk to protect against the “hot bettor” sharp. Then you’re taking non-optimized risk and even if that roster is determined by a (quant) model, you’re playing the genius game.

So, again, I ask whether a pure quant, or a model based on pure quant principles, can see a second and last place finish as equally (un)successful? You may say, duh, sure – the remedy for that is to build a model that will alter risk tolerance such that it operates to build teams that will win as frequently as possible without regard to margin of defeat when you lose. But, then, my question would be, aren’t you then just approximating (or automating) the genius?

As I allude to in the article, the other option would just be to say that 8 2nd place finishes and two 3rd place finishes in a ten year stretch is more impressive than 2 wins and 1 3rd and 7 no-shows and leave it at that. You may win more often, but I’m still better…which I think is also a completely valid way of looking at things. For example, which team had a better decade of the aughts, the Twins or Marlins? (How does it change the question if I subtly rephrase it as: which team was better throughout the decade?)

Semantics is a mofo. Father of fantasy baseball… Alfred Korzybski!

eric kesselman
13 years ago

I feel this is an objection commonly made, and I don’t think it’s accurate. Every good gamer I know (and I know quite a few) first figures out the rules and scoring, and then comes up with a strategy. Being wedded to any particular approach regardless of the rules is basically idiocy. I play very different starting hands when playing limit hold em than I do when playing no limit hold em.

I don’t really see why a quant can’t be a genius as well. There’s nothing stopping me from saying player X will have a breakthrough this season, and pricing him accordingly. Whatever camp you are in, you’ve still got to make some guesses about players.

The difference between the quants and the intuitive players is really about methodology. We believe a scientific and mathematical best guess is the most useful way to approach problems, and their view (in its strongest form) is that their study and experience will guide them to the right answer. Personally, I think that’s unlikely and whatever edge this group has in reality comes from its ability to identify $27 players out of groups the world thinks are $20 players. Or $15 players.

To answer your question in another way- what you’re really asking is ‘what if you need a lot of volatility to win? Can Quants deal with that?’ If everyone is taking big gambles, and SOME of them are going to pay off- won’t the inevitable winner be someone who took on a lot of risk? Maybe! But what approach do you think is best suited for figuring out how much volatility is optimal, and how much to pay for it? Since this is basically exactly what options trading is about, I suspect the options traders/quants have the right methods here.

Nathan Smith
13 years ago

Eric,

I think one way of understanding the point you’re making to Eric is that, insofar as there is an agnostic/genius distinction, it’s a distinction that’s orthogonal to the quant/intuitive distinction.  One can just try to maximize value relative to the marketplace through intuition or through quantitative analysis, and similarly one can try to pick one’s guys and focus on them by using quantitative analysis of which guys are seriously undervalued by the marketplace, or by intuitive grasp of which guys are likely to outproduce expectations.