Monday, October 13, 2008
The problem with measuring forecasting accuracy
Posted by Victor Wang at 1:02am
David Gassko recently published an article on how various projection systems performed. I am not a big fan of these types of articles. This is not an attack on David or other forecasters out there. They do great work, and there is no way I could create a projection system that matches what any of the forecasters do. Rather, this should be taken as a general warning about the information we can receive from various projections.
The first step in these kinds of articles is to choose a playing time cutoff. For example, David chose a cutoff of 200 plate appearances to compare player projections to their actual results. Note that we have no way of knowing, before the season starts, how many plate appearances a given player will have. A player may get hurt, or he might perform so poorly that he doesn't get the opportunity to reach the plate appearance cutoff.
This means the comparison tells us nothing about the players who fall short of the cutoff, and that is some of the most valuable information fantasy players seek. Most projection systems match up comparably when it comes to forecasting a player's rate stats such as on base percentage or OPS. However, how useful is this if we're not sure what the chances are of a player reaching a certain number of plate appearances?
So what information are we receiving? We learn how accurate a forecasting system is if a player plays for a certain amount of time. But what kind of players will reach that plateau? It will include players who stay healthy for at least part of the season. It will include players whom managers saw as "everyday" players before the season started. It will also include players who may not have been considered "everyday" players but performed well enough to force the manager's hand.
What does this all mean for fantasy players and GMs? If we consider these results as a measure of accuracy, the projections that we use in general will be too optimistic. We will not know what the chances are that a player falls off the map. We will not know the chances that a player performs so badly that he doesn't reach a certain playing time cutoff. We will know in general how accurate a projection is if a player plays long enough. However, we don't have a projection for the chances a player has of actually reaching that requirement!
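To make the selection effect concrete, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: the spread of projected OPS, the size of the projection error, the 15% injury rate, and the formula tying playing time to performance. None of it comes from David's article or from any real projection system. The only thing borrowed from above is the 200 plate appearance cutoff.

```python
import random
import statistics

CUTOFF = 200  # plate-appearance cutoff, as in the comparison discussed above


def simulate_season(n_players=5000, seed=0):
    """Return (errors for all players, errors for players at or above CUTOFF).

    An error is actual OPS minus projected OPS. All distributions are invented.
    """
    rng = random.Random(seed)
    all_errors, kept_errors = [], []
    for _ in range(n_players):
        projected = rng.gauss(0.750, 0.060)   # made-up spread of projected OPS
        error = rng.gauss(0.0, 0.080)         # projection is unbiased by construction
        actual = projected + error
        if rng.random() < 0.15:               # made-up injury rate
            pa = rng.randint(0, CUTOFF - 1)   # injured players fall short of the cutoff
        else:
            # Healthy players gain or lose playing time with their performance.
            pa = 450 + 2500 * (actual - 0.750) + rng.gauss(0, 100)
            pa = int(max(0, min(700, pa)))
        all_errors.append(error)
        if pa >= CUTOFF:
            kept_errors.append(error)
    return all_errors, kept_errors


all_errors, kept_errors = simulate_season()
share_excluded = 1 - len(kept_errors) / len(all_errors)
mean_error_all = statistics.mean(all_errors)
mean_error_kept = statistics.mean(kept_errors)

print(f"share of players below the cutoff: {share_excluded:.1%}")
print(f"mean (actual - projected) OPS, all players: {mean_error_all:+.3f}")
print(f"mean (actual - projected) OPS, at or above the cutoff: {mean_error_kept:+.3f}")
```

In this toy setup the projections are unbiased across all players, yet the players who clear the cutoff beat their projections on average, because poor performance is one of the things that keeps a player under the cutoff. An accuracy check restricted to that group never sees the players who fell short.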
This is what I see as one of the biggest problems in forecasting. Forecasting systems are pretty accurate right now in projecting a certain subset of players. However, it is the other subset of players, those who provide basically no value, that can cripple a fantasy owner. We can't really project what the chances are that a player will fall in a certain subset. So consider this as a word of caution as you begin preparing for your next draft or auction while using those projection systems as a guide.
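As a rough illustration of why that missing probability matters, suppose we somehow did have an estimate of each player's chance of reaching the cutoff. The sketch below weights a projected value by that chance. The function name, the dollar values, and the probabilities are all hypothetical; no projection system discussed here supplies them.

```python
def expected_value(value_if_plays, p_reaches_cutoff, value_if_not=0.0):
    """Weight a projected value by a hypothetical chance of reaching the cutoff."""
    return p_reaches_cutoff * value_if_plays + (1 - p_reaches_cutoff) * value_if_not


# Two invented players: dollar values and probabilities are illustrative only.
risky = expected_value(value_if_plays=25.0, p_reaches_cutoff=0.6)
steady = expected_value(value_if_plays=20.0, p_reaches_cutoff=0.9)

print(f"risky player:  ${risky:.2f}")
print(f"steady player: ${steady:.2f}")
```

With these invented numbers, the player with the lower projection but the better chance of staying on the field comes out ahead. A ranking based only on the conditional projection would put them in the opposite order.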
Victor Wang's work on OPS has been featured in SABR's By the Numbers magazine, and he was the 2007 recipient of SABR's Jack Kavanagh Memorial Youth Baseball Research Award. He can be reached via email here.