Should we be trying to predict FIP instead of ERA?
by Glenn DuPaul
November 02, 2012
In the past months, I've written constantly about ERA estimators. During that time I have moved all the way from being a major advocate for more advanced/complex ERA estimators (for example, xFIP and SIERA) to my current stance: I'm really not sure there is any use for more complex estimators. Honestly, moving that far across the spectrum was not easy, but all the numbers that I found backed the simpler estimators.
I have decided to move away from the world of estimating runs, for at least one article. Based on a suggestion from Colin Wyers, I set out to find the best way to predict the individual components of fielding independent pitching (strikeouts, walks, home runs). The ability to project those elements for a pitcher in the coming season would be valuable to a major league team.
I tried a slew of different multiple regressions using measures based on PITCHf/x data and from Baseball Info Solutions, but no measure that I could find was more successful at predicting those three components than the components themselves. However, this failed research brought me to a different, much simpler idea.
I have spent the past months trying to find the best ways to predict runs. The problem there is that the number of runs a pitcher allows is affected by a lot of things outside of a pitcher's control, which is why we have measures like FIP. I'm currently in the camp of using FIP, on a per-season basis, as my starting point for analyzing a pitcher's true performance rather than runs allowed or ERA.
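For readers who want FIP in concrete terms, here is a minimal sketch of the commonly published formula, in the version that also counts hit batters alongside walks. The function name and the default constant of 3.10 are assumptions for illustration only; the real constant is recalculated for each league and season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding independent pitching from one season's counting stats.

    `constant` puts FIP on the ERA scale. The default here is only a
    placeholder; the true value varies by league and season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line, not a real pitcher.
example = fip(hr=20, bb=50, hbp=5, k=180, ip=200.0)
```

Only home runs, walks, hit batters and strikeouts enter the numerator, which is why the statistic is insulated from what happens on balls in play.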
So, in the midst of trying to predict the components of FIP, I decided instead to attempt to predict FIP. If FIP is better than ERA at describing a pitcher's performance, then why don't we try to predict FIP instead of runs?
My goal had thus changed from finding the best measure for predicting runs to finding the best measure for predicting FIP.
I took a sample of starting pitchers who threw at least 120 innings in Year X and at least 100 innings in Year X+1 for the years 2004-2012, the same sample I used for predicting runs in an earlier article.
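The pairing of consecutive seasons described above could be sketched roughly as follows. This is an assumption about how such a sample might be built, not the actual code behind the study; the dictionary keys (`pitcher`, `year`, `ip`) and the function name are hypothetical, and the sketch does not attempt to separate starters from relievers.

```python
def paired_seasons(seasons, min_ip_x=120.0, min_ip_next=100.0):
    """Return (year_x, year_x_plus_1) row pairs that meet both thresholds.

    `seasons` is a list of dicts with hypothetical keys
    'pitcher', 'year' and 'ip' (innings pitched).
    """
    by_key = {(s["pitcher"], s["year"]): s for s in seasons}
    pairs = []
    for s in seasons:
        nxt = by_key.get((s["pitcher"], s["year"] + 1))
        if nxt is None:
            continue  # no following season on record
        if s["ip"] >= min_ip_x and nxt["ip"] >= min_ip_next:
            pairs.append((s, nxt))
    return pairs


# Made-up rows purely to show the filtering behaviour.
rows = [
    {"pitcher": "A", "year": 2010, "ip": 180.0},
    {"pitcher": "A", "year": 2011, "ip": 150.0},
    {"pitcher": "A", "year": 2012, "ip": 90.0},
    {"pitcher": "B", "year": 2010, "ip": 110.0},
    {"pitcher": "B", "year": 2011, "ip": 200.0},
]
pairs = paired_seasons(rows)
```

In the made-up rows, only pitcher A's 2010 and 2011 seasons qualify: A falls short of 100 innings in 2012, and B falls short of 120 innings in 2010.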
I tested both complex and simple estimators:
- Predictive FIP (pFIP)
The r-squared tells us the percent of variation in FIP in Year X+1 explained by the estimator in Year X.
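As a minimal sketch of that calculation, assuming a simple one-variable linear relationship, r-squared is the squared Pearson correlation between the Year X estimator and Year X+1 FIP. The function name and the toy numbers are illustrative and are not data from this study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence with zero variance has no correlation")
    return (sxy * sxy) / (sxx * syy)


# Toy values only: estimator in Year X vs. FIP in Year X+1.
estimator_year_x = [1.0, 2.0, 3.0, 4.0]
fip_year_x_plus_1 = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(estimator_year_x, fip_year_x_plus_1)
```

A value of 0.64, for example, would mean the estimator accounts for 64 percent of the variation in next-season FIP across the sample.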
The results (n = 703)
The results of this test were fairly interesting. The two main conclusions that I drew from them were:
- FIP is easier to predict than Runs Allowed
- The complex estimators are better than FIP itself at predicting FIP
The first conclusion may seem obvious to those well versed in sabermetrics. FIP has only one factor (home runs) with a substantial amount of variability, while ERA has multiple factors that, on a per-season basis, promote a great deal of variability (especially batting average on balls in play, BABIP). Logically, it would make sense that the statistic that is less affected by random variation would be easier to predict.
While the fact that FIP is easier to predict than ERA may seem logical to some, I think statistical evidence to back that conclusion is interesting and useful.
In the introduction to this article, I briefly discussed my swift ideological move away from more complex estimators. Yet, based on this sample, the two complex estimators that I referred to (xFIP and SIERA) did the best at predicting future FIP.
Is the fact that more complex estimators do a better job of predicting FIP in a subsequent season than FIP itself enough evidence to back their usefulness?
My answer: Possibly.
There are a few issues with the question I raised.
First and most important: Is predicting FIP more important than predicting runs? I may be in the minority here, but I would argue that the answer to that question could be yes.
If I worked in a major league front office, I think I'd be more concerned with how a pitcher would perform independent of fielding. Because I know what to expect from my team's defense, I could combine what I knew about a pitcher's future FIP with what I knew about my defense to get a final (better) prediction of how many runs I could expect that pitcher to give up.
The second issue has to do with my simple statistic (predictive FIP). It is true that the complex estimators were significantly better at predicting FIP than FIP; however, they were not significantly better at predicting FIP than pFIP.
The r-squareds for SIERA and xFIP were higher than pFIP's, but the differences were not statistically significant (at an alpha level of .05). So, I cannot say with much certainty that this study is a feather in the cap of more complex estimators.
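One way a reader could check whether two estimators really differ in predictive power is a paired percentile bootstrap on the difference in r-squared. This is a swapped-in technique offered as a sketch, not the test used for the result above; the function names, the number of resamples and the synthetic data are all assumptions.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance")
    return (sxy * sxy) / (sxx * syy)


def bootstrap_r2_diff(est_a, est_b, target, n_boot=2000, seed=0):
    """Approximate 95% interval for r2(est_a) - r2(est_b).

    Pitcher-seasons are resampled with replacement, keeping each row's
    two estimators and its target together (a paired bootstrap).
    """
    rng = random.Random(seed)
    n = len(target)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a = [est_a[i] for i in idx]
        b = [est_b[i] for i in idx]
        t = [target[i] for i in idx]
        try:
            diffs.append(r_squared(a, t) - r_squared(b, t))
        except ValueError:
            continue  # skip a degenerate resample with no variance
    diffs.sort()
    lo = diffs[int(0.025 * len(diffs))]
    hi = diffs[int(0.975 * len(diffs)) - 1]
    return lo, hi


# Synthetic data only: two equally noisy estimators of the same target.
rng = random.Random(1)
target = [rng.gauss(4.0, 0.6) for _ in range(60)]
est_a = [t + rng.gauss(0, 0.5) for t in target]
est_b = [t + rng.gauss(0, 0.5) for t in target]
lo, hi = bootstrap_r2_diff(est_a, est_b, target)
```

If the resulting interval contains zero, the data do not clearly separate the two estimators at roughly the .05 level. Resampling whole rows matters because both estimators are measured on the same pitchers, so their r-squareds are not independent.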
Finally, I'd like to leave these questions with the community.
1. Should we be trying to predict FIP rather than ERA?
2. Does this study help back the usage of more complex estimators?
Any comments or emails would be much appreciated.
References and Resources
All statistics come courtesy of FanGraphs.