Editor’s Note: This is the third post of “10 Lessons Week!” For more info, click here.
How did ZiPS come about? The genesis of what later became ZiPS stems from conversations I had over AOL Instant Messenger with Mets fan, SABR social gadfly, and pharmaceutical chemist Chris Dial during the late(ish) 1990s. I knew Chris from Usenet, a now mostly dead distributed internet discussion system.
Usenet was my introduction to the wider sabermetrics community, full of lots of other names you would recognize, like Keith Law, Christina Kahrl, Voros McCracken, Sean Forman, and scads of others. Chris and I talked about making a basic projection system whose results the public could freely access and that did 95 percent as well as the projections hidden behind paywalls. The conception is similar to what Tom Tango later independently developed and coined Marcel.
Nothing came of that at the time. I didn’t revisit the idea of doing a projection system until after the turn of the millennium, when I was regularly writing transaction analysis for Baseball Think Factory, a startup of Jim Furtado and Sean Forman that I had been involved in since its conception in 2000. While I majored in math back in college, I was never much motivated by it unless it could be put to use making me money or analyzing sports.
I had financial flexibility at the time due to the former preferred application of math, so I had the time and ability to put together a projection system. There wasn’t any eureka moment that led to the creation of ZiPS–I didn’t fall asleep at a game until a baseball fell on my head from a Barry Bonds tree–it just seemed like a practical thing to have when analyzing transactions.
What started as a basic projection system ended up as something much more complicated. I had the idea to incorporate some of McCracken’s DIPS research into the mix, which is why I named it ZiPS in honor of DIPS. I actually intended to call it ZiPs because CHiPs was my second-favorite show as a child (behind Dukes of Hazzard), but I mistyped it as ZiPS when it finally debuted at Baseball Think Factory.
Jay Jaffe noticed it and gave it a plug at the time as ZiPS–the first mention of ZiPS in the media–so ZiPS it remained. Before that, it was going to be SiPs, but that kind of sounded like some Scandinavian bottled-water company or some kind of juice package for kids that is impossible to open, like those stupid Capri Suns. Note, none of this story is made up; I really am that ridiculous.
The initial build of ZiPS was still fairly simple relative to today’s version. It only used basic stats and had generic aging factors. As time went on, it became more complex. Various studies yielded more information on modeling various stats like BABIP, and the generic aging factors evolved first into 12 aging factors (representing general player archetypes) and finally into generating estimated aging curves on the fly using cluster analysis.
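To make that last step concrete, here’s a minimal sketch of how cluster analysis can group player aging profiles and average each cluster into an estimated aging curve. This is not actual ZiPS code, and every number below is invented for illustration:

```python
# Minimal sketch, not actual ZiPS code: group toy aging profiles with a
# tiny k-means, then average each cluster into an estimated aging curve.

def kmeans(profiles, init_idx, iters=20):
    """k-means with deterministic initial centers (chosen by index)."""
    centers = [list(profiles[i]) for i in init_idx]
    clusters = [[] for _ in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in profiles:
            # assign each profile to the nearest center (squared distance)
            i = min(range(len(centers)),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        for i, members in enumerate(clusters):
            if members:  # recompute center as the cluster's mean profile
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers, clusters

# Each profile: year-over-year wOBA change at ages 24-28 (toy numbers).
early_peakers = [[0.010, 0.002, -0.008, -0.012, -0.015]] * 10
late_bloomers = [[0.015, 0.012, 0.009, 0.004, -0.002]] * 10
profiles = early_peakers + late_bloomers

# Seed with one profile from each end of the list, one archetype apiece.
curves, clusters = kmeans(profiles, init_idx=(0, -1))
```

Each resulting center is an estimated aging curve for one player archetype; a real system would cluster on far richer profiles and many more players.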
ZiPS isn’t the product of genius; it’s the product of a lot of work at the right time. Any of the really smart guys in sabermetrics, people like Sean Forman or Chris Long, could’ve put together a ZiPS-like projection system much more quickly. When someone asks me how I got to write about baseball for a living, such a cool job, I’m almost embarrassed to admit that I connected with the right group of baseball nerds in the right place at the right time.
When I was a kid, my grandfather bought me all the Baseball Abstracts, the Elias Baseball Analysts, and a subscription to Sports Illustrated starting when I was six. (I was always a baseball nut; most of the pictures of me up to age five or six had me wearing an Orioles cap.)
I was smart enough to realize very early on that I was not destined to be a major league player, so I grew up wanting to be Bill James or Peter Gammons. Now, nobody could actually be Bill James or Peter Gammons other than the originals, but I feel very fortunate and blessed. Along the way, I’ve learned a few things. Let’s jump in, shall we?
Lesson #1: Developing a Projection System is a Lot of Work
Edison once commented that genius was one percent inspiration and 99 percent perspiration. Putting together a projection system is about 110 percent perspiration, leading to negative inspiration, and totally breaking how the math works. At a fundamental level, you’re putting together an insane amount of data in order to come up with an objective estimate.
While a lot of people start with Marcel as the base and make changes from there, ZiPS pre-dates Marcel, so a lot of the questions involving what data has predictive value and the weights to assign to various factors were things I had to do original research for. ZiPS is essentially the sum of literally hundreds of mini-studies I’ve done on various issues.
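For a sense of what those weights and factors look like in practice, here’s a hedged Marcel-style sketch of a rate-stat projection. The weights and the regression ballast are illustrative placeholders, not values that ZiPS actually uses:

```python
# Marcel-style sketch, not actual ZiPS: weights and ballast are
# illustrative placeholders.

def project_rate(seasons, league_rate, weights=(5, 4, 3), ballast=1200):
    """Weighted multi-year average of a rate stat, regressed toward the
    league mean. `seasons` is [(successes, opportunities), ...], most
    recent season first."""
    num = sum(w * s for w, (s, n) in zip(weights, seasons))
    den = sum(w * n for w, (s, n) in zip(weights, seasons))
    # Regressing = blending in `ballast` league-average opportunities.
    return (num + ballast * league_rate) / (den + ballast)

# A .300/.283/.267 hitter over 600 AB per year, in a .255 league:
proj = project_rate([(180, 600), (170, 600), (160, 600)], league_rate=0.255)
```

With these toy numbers, the raw weighted average is about .286 and the regression pulls the projection down to about .282, between the player’s recent performance and the league mean. Research like the mini-studies mentioned above is what tells you which weights and how much ballast each stat deserves.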
Lesson #2: People Overrate the Odds of a Player Improving
Even among many who are into the sabermetric side of baseball, there’s a belief in a neat, tidy aging curve for players. It’s nowhere near that simple. While you see this pattern in the aggregate, especially for hitters, nothing comes that easy. Many minor leaguers, even those of prominent talents, simply don’t improve past where they are at 21 or 22, even at the higher minor league levels.
People also have an idea that a superstar at 22 is going to be even better at 27, but again, that’s not true, especially to the extent it may be true for a 22-year-old still putting together his skills. While a random 21-year-old is preferable to a random 25-year-old of similar abilities, the very young high achievers tend to plateau once they hit stardom.
Willie Mays never was significantly better than he was at age 23. Alex Rodriguez didn’t have a traditional 27-ish peak. Neither did Mickey Mantle or Ted Williams, and so on and so on. Mike Trout‘s going to be an unreal player when he hits 27, but he’s unlikely to be in a different tier of craziness than he is/was from 2012-2014.
Lesson #3: Historical Data Sucks
We have all sorts of new, exciting information that has been collected about baseball players over the last 30 years, but unfortunately, we’re unable to go back in time and get this kind of data from past players. We can make estimates of data, such as Sean Smith‘s awesome Total Zone, which is one of the best attempts to kludge some good data from a time when good data wasn’t available, but so much information about players is lost to history.
As someone who spends a lot of time trying to wring some understanding out of data, it kills me how much stuff we don’t know about the past that we will never truly be able to know. And even data that theoretically should be easy to collect over the years is frequently sullied by poor record-keeping. Want to include height and weight in your projections? A lot of those numbers are fiction. Anyone believe John Kruk played at 204 pounds (FanGraphs) or 170 pounds (Baseball-Reference)?
We can make a not-completely-terrible model of how fast players were through some of the primary and secondary statistics available, but being able to model reasonable guesses as to what the data are is not the same thing as having the actual data in your hand.
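As an example of that kind of modeling, a crude speed estimate can be assembled from stolen-base and triples rates. The sketch below is loosely in the spirit of Bill James’s Speed Score, but the components and scaling are simplified inventions for illustration, not the real formula:

```python
def toy_speed_estimate(sb, cs, triples, ab):
    """Crude 0-10 speed estimate from primary/secondary stats. Loosely
    in the spirit of Bill James's Speed Score; the components and
    scaling here are simplified inventions, not the real formula."""
    # Component 1: regressed stolen-base success rate, scaled to ~0-10.
    sb_pct = (sb + 3) / (sb + cs + 7)
    c1 = (sb_pct - 0.4) * 20
    # Component 2: triples per at-bat, scaled to ~0-10.
    c2 = min(10.0, 1000.0 * triples / max(ab, 1))
    return max(0.0, min(10.0, (c1 + c2) / 2))

fast = toy_speed_estimate(sb=40, cs=8, triples=10, ab=600)
slow = toy_speed_estimate(sb=1, cs=2, triples=0, ab=500)
```

This sorts burners from plodders well enough, which is exactly the point: a reasonable guess, not a stopwatch.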
Lesson #4: There’s a Lot We Still Don’t Know About Advanced Statistics
We have a lot of new data from the various f/x incarnations, and while they’re cool, we are still very early in understanding the predictive value of some of these new data. Some things we can get out of the statistics, such as the general value of fastball velocity to a pitcher’s expectations, but there are a lot of lessons in there that we just don’t know yet and won’t know for another 10 or 20 years. Much of the predictive value we attribute to metrics like swinging-strike percentage is still a set of educated guesses at this point.
Lesson #5: We Could Do a Lot More With Better Minor League Data
While the amount of data available on major leaguers has been improving continually, especially over the course of the post-Moneyball era, there’s still a lot of information about minor leaguers that isn’t regularly available to the public. The state of proprietary data on minor leaguers is a little better, but it’s still not at the level of what’s available for major leaguers.
Keeping some of the good data proprietary essentially blocks out some of the next generation of big baseball thinkers from making the next breakthrough. Even something as basic as minor league splits was really difficult to come by on a widespread level until Jeff Sackmann developed minorleaguesplits.com several years ago with scripts that data-mined play-by-play logs.
The state of public defensive data for minor leaguers is even worse, with the defensive stats provided no more advanced than MLB fielding stats in the 1950s. Sean Smith and I have systems that parse minor league play-by-play logs to get some rudimentary defensive data, and I’ve taken the step of text parsing keywords in scouting reports to get a nudge one way or the other, but this isn’t a substitute for better data.
Lesson #6: People Get Really Mad at Algorithms
No matter what you do, no matter how well you explain your projection system, no matter how clearly you lay out the basic design principles, people will get furious with your projections and accuse you of all sorts of biases and agendas in putting your projections together. I received my first death threat in 2005, someone telling me that they hope my house burns down (though I guess that can be more properly qualified as a death wish than a direct threat). It wasn’t my last one.
Explaining basic probability to a group of people is best described as futile. For example, ZiPS projects very few batters to have a mean projection of a .300 batting average in a season. For 2014, only three hitters were projected to have a .300 BA or better as the mean outcome.
ZiPS actually projects that 23 players (on average) will end up at .300 or better; the difference is that we don’t expect everyone to play to his mean projection. We expect 10 percent of players to hit at a BA they only have a 10 percent chance of matching, and so on. Think you’re going to successfully explain this to a critic who may very well have never taken a probability class in high school or college? You’re not.
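To see the arithmetic at work, here’s a toy simulation of an invented league (the talent distribution and at-bat counts are made up, not actual ZiPS output). Only three hitters carry a mean projection of .300 or better, yet noticeably more than three finish there in a typical simulated season:

```python
import math
import random

# Toy league (invented numbers, not actual ZiPS output): 250 regulars,
# only three of whom carry a mean projection of .300 or better.
true_talent = [0.255] * 230 + [0.280] * 17 + [0.300] * 3
AB = 550
projected_300 = sum(p >= 0.300 for p in true_talent)

def season_count(rng):
    """Count players finishing at .300+ in one simulated season,
    using a normal approximation to the binomial for speed."""
    count = 0
    for p in true_talent:
        hits = rng.gauss(AB * p, math.sqrt(AB * p * (1 - p)))
        if hits / AB >= 0.300:
            count += 1
    return count

rng = random.Random(0)
seasons = [season_count(rng) for _ in range(500)]
avg_300 = sum(seasons) / len(seasons)
```

Random variation pushes some .255 and .280 hitters over the line every year, so the average number of .300 finishers comfortably exceeds the number of .300 mean projections, with no bias or agenda required.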
Lesson #7: Having a Programming Background is Very Useful
One of my regrets in developing a projection system is that, while I understand the underlying math and I have excellent skills in Excel and Statistica, my general programming skills are quite rudimentary. Like PECOTA was for the Baseball Prospectus crew, turning ZiPS into a program rather than a bunch of gigantic, interlocking spreadsheets would make things run far more smoothly. Unfortunately, I took only my required computer science courses in college and did just enough to get a passing grade.
If I had to do it over again, a better programming background and much deeper database knowledge would make a lot of ZiPS far more elegant and easier for me to implement new things. I’ve never even used R, which makes me kind of a weird old relic in the sabermetric community, despite the fact that I turn 36 in June.
Lesson #8: Results are “Stickier” In-Season than Season-to-Season
When developing in-season projections, it quickly became evident that the model that worked for season-to-season projections needed quite a bit of adjustment to work in-season. Simply put, in-season stats showed significantly less regression toward the mean than you would expect from the sample size, relative to season-to-season stats.
One notable example was BABIP: in-season, BABIP overperformance tended to stick far more than the heavy season-to-season regression would suggest. That .400 first-half BABIP may be doomed next year, but players retain a surprisingly large amount of that bounce within the same season.
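One way to express that difference is with a smaller regression ballast for in-season estimates than for next-season estimates. The constants below are illustrative, not ZiPS’s actual values:

```python
# Illustrative constants, not ZiPS's actual values: the same observed
# rate gets a smaller regression ballast in-season than year-to-year.

def regress(observed_rate, n, league_rate, ballast):
    """Shrink an observed rate toward the league mean by blending in
    `ballast` league-average trials."""
    return (observed_rate * n + league_rate * ballast) / (n + ballast)

# A .400 BABIP over 250 balls in play, in a .300 BABIP league:
rest_of_season = regress(0.400, n=250, league_rate=0.300, ballast=500)
next_season = regress(0.400, n=250, league_rate=0.300, ballast=2000)
```

With these numbers, the rest-of-season estimate (about .333) keeps roughly a third of the .100 overperformance, while the next-season estimate (about .311) keeps roughly a tenth of it, which is the "stickiness" pattern described above.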
Lesson #9: Sometimes, Simple is Better
Going back to in-season projections for a minute, they provide a solid example of why, sometimes, simpler is better. While we strive to have our models as accurate as possible and our conclusions as precise as we can make them, that stance sometimes can get in the way of conveying information. That’s not a small issue, since something that’s 99 percent accurate and can’t be communicated to other people easily may not be as intelligent a choice as something that’s 98 percent accurate and people can easily understand.
My original model for in-season ZiPS was significantly more complicated than the one updated every morning on FanGraphs, and it’s slightly more accurate. (It’s still the one I use when I need to calculate in-season projections for a player or two rather than a large group.) The problem with the more complicated model is that it wasn’t one that could be updated easily or daily.
A simpler model, the one that’s updated every morning, isn’t quite as accurate as the more complex one, but it has the benefit of constant updates for every major leaguer in baseball. What’s the use of great data if it isn’t accessible? This is a lesson I would also include in 10 Lessons on Being a Nerdy Baseball Writer–it’s why I still use things like OPS when practical–but that’s a different article.
Lesson #10: Don’t Let the Projections Affect Your Rooting Interests
One of the toughest things about projections is walking away from them. When I’ve done all of the projections for the year, I’m done with them: no sweating over who’s exceeding their projections or falling short. That policy took me several years to value completely. If you start following individual player lines obsessively and start rooting for specific players to match their projections rather than win or lose games, you’ll slowly drive yourself insane.
ZiPS gave Josh Hamilton quite a low projection last year, the lowest of any of the projection systems (and he underperformed even the modest ZiPS figure). But there’s something almost soul-sapping about rooting for a player to play poorly so that you look smart rather than because you want your favorite team to defeat that player’s team.
You always validate your results at the end of the year, but until then, just enjoy the games. Baseball is fun, after all.