Web based PITCHf/x tool
by Josh Kalk
November 14, 2007
The PITCHf/x data is a gold mine of information, but sadly only a few people have been doing much research with it. The main reason is that it is very complicated to download the data from MLB's servers and then put it into a usable form. So what I would like to do is create a tool that will allow anyone to access the data. This tool is still far from complete, but it is finally fast enough and powerful enough to release. Consider this version 1.0, with more to come.
Basically, I have stored the 300,000+ pitches tracked by PITCHf/x in a database and added a simple form that lets users query it and then see where the selected pitches crossed home plate. Note that the perspective is from the catcher (or umpire), so a negative horizontal value is closer to a right-handed batter. Currently, you can choose any pitcher who has thrown at least 50 tracked pitches, or any batter who has been at the plate for at least 50. You can also select the type of pitch thrown (e.g. fastball, curve, splitter, etc.), or any combination of the three, as long as either a pitcher or a batter is selected. This means you can look at all of the curves Barry Zito has thrown, or all the changeups Geoff Jenkins has flailed at, or matchups like Brad Penny against Barry Bonds, or whatever you would like.
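For readers curious how a form like this might turn selections into a database query, here is a minimal sketch. It is not the tool's actual code: the `pitches` table, its column names (`pitcher`, `batter`, `pitch_type`, `px`, `pz`), and the helper functions are all hypothetical, and SQLite is used only because it ships with Python. The sketch follows the two rules described above: at least a pitcher or a batter must be selected, and only players with 50 or more tracked pitches are offered.

```python
import sqlite3

MIN_PITCHES = 50  # the 50-pitch cutoff described in the text


def build_query(pitcher=None, batter=None, pitch_type=None):
    """Return (sql, params) for the chosen filters.

    Table and column names are assumptions made for this sketch.
    A pitch type alone is not enough: a pitcher or batter is required.
    """
    if pitcher is None and batter is None:
        raise ValueError("select a pitcher or a batter")
    clauses, params = [], []
    for column, value in (("pitcher", pitcher),
                          ("batter", batter),
                          ("pitch_type", pitch_type)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT px, pz FROM pitches WHERE " + " AND ".join(clauses)
    return sql, params


def eligible_players(conn, column):
    """List players with at least MIN_PITCHES tracked pitches.

    `column` must be "pitcher" or "batter" (hypothetical column names).
    """
    if column not in ("pitcher", "batter"):
        raise ValueError("column must be 'pitcher' or 'batter'")
    rows = conn.execute(
        f"SELECT {column} FROM pitches "
        f"GROUP BY {column} HAVING COUNT(*) >= ? ORDER BY {column}",
        (MIN_PITCHES,),
    )
    return [row[0] for row in rows]
```

With a connection `conn` to such a database, `conn.execute(*build_query(pitcher="Barry Zito", pitch_type="Curve")).fetchall()` would return the plate locations of every curve recorded for that pitcher. Passing values as `?` parameters rather than pasting them into the SQL string keeps form input from being interpreted as SQL.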
A few things to note. First, I have zoomed in very close on the strike zone so you have a better view of where the action is. That said, some pitches are thrown far from the strike zone, so the plot might be missing a few. Second, the implementation of PITCHf/x happened in fits and starts, and in fact Baltimore never played a home game with PITCHf/x turned on. Because of this, your favorite player might have missed the 50-pitch cutoff, or the data might cover only a fraction of his true production this year. This is especially true when looking at matchups like Red Sox versus Yankees. Both of their parks were late to have PITCHf/x installed, so even though the teams played each other frequently, there is little data from those games. If you make a selection that has no pitches that fit your criteria, the tool will prompt you to make another selection. Third, while querying the database now takes less than a second, it still takes a few seconds to draw the image. In my testing the image is usually up in less than five seconds, but that is with just me querying it. If this tool becomes popular, it may take longer to fulfill your request. Lastly, the tool uses the pitch type classifications I have developed, which are generally correct, but there might be some issues, especially with pitchers who just reached the 50-pitch mark. If you think a pitcher throws a certain pitch but the tool tells you there aren't any pitches of that type, you have probably found a mistake in my classification algorithm. You can check what pitches the algorithm thinks the pitcher throws by looking at his player card over at my blog. If you spot a mistake, please comment at the ball hype link at the bottom of the page. It may take a little while for me to fix, but the correction will be in place when the second version of the tool comes out.
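To make the first caveat concrete, the sketch below shows one way to count how many selected pitches fall outside a zoomed plot window, along with the "no pitches" prompt. The window limits here are made-up values in feet from the catcher's view, not the limits the tool actually uses, and the function names are hypothetical.

```python
# Hypothetical plot window in feet (catcher's view); the tool's real limits are not given.
X_MIN, X_MAX = -2.0, 2.0
Z_MIN, Z_MAX = 0.0, 5.0


def split_by_window(pitches):
    """Split (px, pz) pairs into those inside the window and those outside it."""
    inside, outside = [], []
    for px, pz in pitches:
        if X_MIN <= px <= X_MAX and Z_MIN <= pz <= Z_MAX:
            inside.append((px, pz))
        else:
            outside.append((px, pz))
    return inside, outside


def describe(pitches):
    """Summarize a selection, or ask for a new one if it is empty."""
    if not pitches:
        return "No pitches match; please make another selection."
    inside, outside = split_by_window(pitches)
    return f"{len(inside)} plotted, {len(outside)} outside the window"
```

For example, `describe([(0.0, 2.5), (3.1, 2.0), (-0.5, 6.0)])` reports one pitch plotted and two outside the window, which is the kind of gap to keep in mind when a plot looks sparse.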
I also want to take a few minutes to share where the tool is headed and what didn't make it into this version but is planned for the next one. The results page will eventually let you re-query the database without having to press the back button. A table with the results of the pitches will be added to the top of the page. An option will allow the user to plot the type of pitch instead of the result. An option will allow the user to look at the break of the ball instead of the final location. An option will allow the user to combine groups of pitches together if there is a lot of data (even with the zoomed-in look, some plots are still too busy). You will be able to query more variables, such as the count or a certain result like all the home runs hit. Hopefully all of these features will make the next release. If there is something else you would like to see added, again, please comment at the ball hype link.
If you are interested in copying one of the plots you create and posting it on your blog or website, that is fine, but please add a reference link back to this page.
Without further ado, here is the link to the PITCHf/x tool, or you can access the tool below.