As promised, here is the second and final installment of my interview with Greg Rybarczyk, proprietor of Hit Tracker, which is one of the great baseball websites. In case you are an irregular reader of The Hardball Times (do you have a job or something?), here is part 1, which mostly chuntered on about the genesis of Hit Tracker, the physics model and accuracy limitations.
For the next 2,000 words or so Greg shares some of his observations from Hit Tracker and tells us what we can look forward to in the future. After all, this is a gentleman who watched every home run hit in 2006, so presumably he has witnessed one or two things of note.
That’s enough rambling—over to the interview proper.
John Beamer: So, in the course of your work you presumably watched every home run last year. Which were the most memorable and which were most surprising?
Greg Rybarczyk: Most memorable: Vladimir Guerrero’s July 30 homer off Curt Schilling at Fenway Park that passed through the bank of lights in left field. Aside from the numbers (476 feet, sixth longest of the year, 122.3 mph off the bat), the sounds from that homer were fantastic! The hit itself was just the purest “crack,” simply beautiful, followed by 35,000 people simultaneously saying “OOOHHH!” and Jon Miller’s call was great as well: “Oh, man… hello, New Hampshire!”
Most surprising: Here I have a few: first, Reggie Abercrombie’s 481-foot homer on April 19 at Great American Ball Park, his first career homer! What a way to announce your presence with authority. After that I was expecting better from him the rest of the season, but with 78 strikeouts in 255 at-bats, he has some work to do to put the bat on the ball more often.
Next, Andruw Jones’ July 15 homer off Chan Ho Park at PETCO. This ball covered 463 feet, and is by my reckoning the longest homer in PETCO Park history, but it didn’t get a lot of press afterwards, though the announcers that night were suitably impressed.
Finally, Cody Ross’ first of three homers on Sept. 11 at Dolphins Stadium, which reached the upper deck in left field and covered 460 feet. What a blast, from someone who is not at all renowned for long distance power; and why did he get the pitches to hit two more in that game?
JB: Yeah … I also noticed that Reggie Abercrombie smacked the longest homer at The Ted (standard distance). Amazing. You are obviously well placed to assess which players got lucky and which have genuine power (i.e., consistently pound long home runs). For instance, I was quite surprised that the average distance of Andruw Jones’ home runs did not place him in the top five for the Braves last year. With one year’s worth of data it is too early to say if that was unusual.
Anyway, onto the question … are there any players in particular that got exceedingly lucky with the long ball last year?
GR: I’ve created criteria for assigning all home runs (out of park) to one of three categories: “Just Enough,” which means they either cleared the fence by less than 10 feet in height, or they landed within one fence height of distance from the fence; “No Doubt,” which means they cleared the fence by at least 20 feet of height and landed at least 50 feet past the fence, and “Plenty,” which is anything in between.
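For readers who like to see rules written down, here is a minimal sketch of the three categories as Greg describes them. It is not Hit Tracker's actual code: the `HomeRun` fields and the `classify` function are names invented for illustration, "within one fence height" is read as "less than or equal to", and the Just Enough test is checked first in the rare case where a ball could satisfy both definitions.

```python
from dataclasses import dataclass


@dataclass
class HomeRun:
    # All three fields are hypothetical inputs, measured in feet.
    fence_height_ft: float         # height of the fence where the ball left the park
    clearance_ft: float            # how far above the top of the fence the ball passed
    distance_past_fence_ft: float  # how far beyond the fence the ball landed


def classify(hr: HomeRun) -> str:
    """Return 'JE', 'PL' or 'ND' using the thresholds quoted in the interview."""
    # Just Enough: cleared the fence by less than 10 feet of height, OR
    # landed within one fence height of the fence (assumed inclusive).
    if hr.clearance_ft < 10 or hr.distance_past_fence_ft <= hr.fence_height_ft:
        return "JE"
    # No Doubt: at least 20 feet of clearance AND at least 50 feet past the fence.
    if hr.clearance_ft >= 20 and hr.distance_past_fence_ft >= 50:
        return "ND"
    # Plenty: anything in between.
    return "PL"


if __name__ == "__main__":
    # An 8-foot fence, cleared by 25 feet, landing 60 feet beyond it.
    print(classify(HomeRun(8, 25, 60)))
```

Note that No Doubt needs both conditions, so a towering fly that drops just behind the wall, or a low liner that carries a long way, ends up as Just Enough or Plenty rather than No Doubt.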
I just finished classifying all the homers, and the data are absolutely full of great stuff, for instance:
- The league average is 27%/55%/18% for the categories JE/PL/ND.
- AL MVP Justin Morneau had a pretty weak profile, hitting 15/16/3. (Note: if ever the three types don’t add up to a hitter’s total, it is because a handful of homers were not observed.)
- Alfonso Soriano went 18/22/5.
- Bill Hall went 15/17/2.
- Nick Swisher 13/18/3.
On the strong side:
- Albert Pujols had a very strong profile, with 8/27/14.
- Adam Dunn also had a very strong profile, with 2/22/15.
- Travis Hafner went 7/23/12.
- Manny Ramirez 6/18/10.
- Ryan Howard 10/35/13.
- Alex Rodriguez 3/23/9.
(Greg recently penned an article for THT on this very subject, which you can find here)
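To compare an individual JE/PL/ND line with the 27%/55%/18% league average, it helps to turn the raw counts into percentages. The helper below is just a sketch of that arithmetic; `profile_shares` is a made-up name, and the counts are the ones quoted above.

```python
def profile_shares(je: int, pl: int, nd: int) -> tuple:
    """Convert JE/PL/ND home run counts into whole-number percentages."""
    total = je + pl + nd  # only observed homers are counted
    return tuple(round(100 * n / total) for n in (je, pl, nd))


if __name__ == "__main__":
    print(profile_shares(8, 27, 14))  # Albert Pujols  -> (16, 55, 29)
    print(profile_shares(15, 16, 3))  # Justin Morneau -> (44, 47, 9)
```

On that basis Pujols hit No Doubt homers well above the 18% league rate, while nearly half of Morneau's observed homers were Just Enough.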
JB: Great stuff. What about hurlers—how did some of them fare?
GR: My initial impression is that there are not so many extremes as there are with the hitters. In fact, all of my research is starting to make me believe that hitters are the predominant factor in home runs, and pitchers are not nearly as important. Anyway, here are the data:
Pitchers who didn’t get ripped:
- Ambiorix Burgos 5/11/0 (well, he did give up 16 homers in relief, most in MLB)
- Kyle Lohse 8/7/0
- Jamie Moyer 12/20/1
- Woody Williams 8/12/1
- Sean Marshall 10/9/1
JB: So what is your hypothesis as to how well these hurlers will do this year … do you reckon we’ll see a lot of regression to the mean?
GR: Any time you are considering something where the trials are independent, i.e. last year’s results do not have any influence over this year’s results, you can expect regression to the mean to take place. The likelihood that the 10 unluckiest pitchers from 2006 will also turn out to be the 10 unluckiest guys in 2007 is astronomically low, but on average, five of the 10 will again be unlucky to some degree.
However, there are lots of other factors affecting a player’s performance that may turn out to be much more significant than the luck factor: one player may master a new pitch and improve, while another suffers nagging injuries and gets hit harder, while still another loses some zip off his fastball or command of his breaking pitches. I regard the luck factor to be an adjustment (typically a minor one), applied in addition to all the other projection methods, not instead of them.
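The two numbers in Greg's regression argument can be roughed out with a few lines of arithmetic. This is only a sketch under loud assumptions: luck is treated as fully independent from year to year and symmetric around average, and the pool of 150 pitchers is a hypothetical figure chosen purely for illustration.

```python
import math

N_PITCHERS = 150  # hypothetical pool size, not a figure from the interview
K = 10            # the 10 unluckiest pitchers of 2006

# If 2007 luck is independent of 2006 luck, every set of 10 pitchers is
# equally likely to finish as the 10 unluckiest, so the chance that it is
# exactly the same 10 again is one over the number of possible sets.
p_same_ten = 1 / math.comb(N_PITCHERS, K)

# With symmetric luck, each pitcher has a 50% chance of landing on the
# unlucky side of average again.
expected_unlucky_again = K * 0.5

if __name__ == "__main__":
    print(f"chance the same 10 repeat: {p_same_ten:.2e}")
    print(f"expected to be unlucky again: {expected_unlucky_again:.0f} of {K}")
```

The exact probability depends on the pool size, but for any realistic number of pitchers it is vanishingly small, while the expected number of repeat unlucky pitchers stays at five.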
JB: In the THT 2007 Annual I was struck by Craig Biggio’s home run distribution showing that all his long balls at Minute Maid were to the Crawford Boxes. Are there any other player (either hitter or pitcher) oddities that you have picked up in the course of your work?
GR: I did find some “oddities”, or at least some unexpected things in the year-end data. Many of these became part of my article in the 2007 THT Annual, “Which Way Did it Go?” For instance, Glendon Rusch suffered from extremely bad luck in that in 2006 he pitched with the wind blowing out far more frequently than expected, while Alex Rodriguez hit his homers further, on average, than anyone, but missed out on a lot of trips around the bases because he didn’t hit the ball to right field.
A few other items of interest:
- While most people would associate surrendering long home runs with being a bad pitcher, the two MLB pitchers whose home runs went the farthest, on average, were Roy Halladay at 404.9 feet and Curt Schilling at 403.4 feet.
- At the other end of the spectrum, Tom Glavine’s 22 home runs allowed traveled only 378.3 feet on average, second-lowest in the majors. Glavine allowed 17 homers to LF, five homers to RF and zero homers to the middle 41 degrees of the field (out of a total of 90 degrees from foul line to foul line).
- Albert Pujols showed the ability to hit homers to all fields, but given a pitch he could handle, Pujols’ favorite destination was the LF corner, where he hit 29 of his 49 homers within 15 degrees of the left field line (or in other words, he hit just over half his homers to the left-most one-sixth of the field). That would be sections 170-172 for those readers planning a trip to Busch Stadium this summer…
- A couple more extreme pull hitters were Frank Thomas (37 of 39 homers hit to the left of dead center field) and Alfonso Soriano (40 of 46). Makes you wonder why teams don’t shift their defenses more drastically when guys like this come up…
JB: I’ve got a reader question for you from Michael Brucker, who wanted to know if you had any plans to track some historic home runs?
GR: On the Hit Tracker site, I have created a section called “Highlight Homers,” where we take a closer look at homers that are notable for one reason or another. Often, these are historic homers, such as Mantle’s 1963 “facade” homer at Yankee Stadium, Ted Williams’ “Red Seat” homer at Fenway Park, or Reggie Jackson’s 1971 All-Star Game homer at Tiger Stadium.
I also try to include some interesting contemporary homers as well, such as Ryan Howard’s third-deck blast at Citizens Bank Park last year, or David Ortiz’ May 1, 2006 homer against the Yankees that cut through a 16 mph wind blowing straight in. For each of these homers and many more, I provide a narrative description of the homer, any assumptions I had to make for the analysis, and the results both in numbers and diagrams. I am always on the lookout for more highlight homers; any home run where we either have video, or can pinpoint the landing spot, is a candidate for analysis.
Highlight Homers has proven to be one of the most popular features of the site and is a great way for fans to get involved, as several of the homers have been suggested (and in some cases, researched) by site visitors. A great example of this is Brenton Blair, who analyzed Mark McGwire’s 512-foot homer at Jacobs Field in 1997. Brenton has since become one of my two outstanding volunteer spotters; he and Brian O’Malley are doing a great job helping me track batted balls in 2007.
JB: What are your short term goals for Hit Tracker?
GR: The short term goal for 2007, stated simply, is to demonstrate the value of Hit Tracker when it is applied to all batted balls, rather than just for homers.
To that end, I have enlisted the aid of three spotters (Brenton Blair, Brian O’Malley and Mike Newcomer) who are going to contribute their time and effort to help significantly increase the number of observations we make in 2007 above Hit Tracker’s inaugural year of 2006. In addition to all homers, this year we will be observing and analyzing a subset of all fly balls that land “near” the outfield fence, as well as all batted balls of any type for selected games, in support of some specific analysis projects.
The long fly balls are being covered to allow analysis of power hitting, particularly the impact of park configurations and weather on power statistics. The complete analysis of all batted balls should allow myriad different analyses in the areas of hitting, pitching and fielding; it will be done on a limited basis due to the sheer amount of time required to make the observations. Obviously, more volunteers would mean more data for us all to analyze, so hopefully some more capable contributors will step forward this summer and help us out.
One other goal for 2007 is to continue to demonstrate the superiority of Hit Tracker over the various systems for home run distance estimation that are currently in use around MLB, in the hope that I can convince MLB and its teams to implement the Hit Tracker system.
Hit Tracker is more accurate, it provides additional information (e.g. Speed Off Bat), it incorporates the impact of wind, temperature and altitude, and perhaps most importantly, its distance estimates are accompanied by the entire trajectory of the ball in flight, so its output has more credibility than the single, disembodied number that comes out of the existing systems (which have been roundly criticized, even by the ones who use them, such as Leigh Tobin, the Philly Media relations director, who last year referred to her home run chart as “the bane of my existence”).
Hit Tracker can also be implemented in any ballpark from A-ball through MLB to provide data at all levels, something that will never be the case for any high-cost camera-based system that might currently be under consideration for national games of the week and the postseason.
JB: What do you ultimately see Hit Tracker becoming (i.e., what is the end game)?
GR: Ultimately, I see Hit Tracker being used to track every batted ball in MLB games. What the exact end game for Hit Tracker will be I can’t really predict; this will depend on the perceived value of the data by the various types of customers (MLB teams, media outlets, fantasy baseball players, etc). Casual fans (in the ballpark and watching on TV/listening on radio) will want to know exactly how far a homer traveled, how hard a home run was hit and how much help it got from the wind, something they can only know from Hit Tracker, so there is the potential to license the Hit Tracker tool and method for this.
MLB teams and fantasy baseball players will want to know how a player’s hits (or a pitcher’s allowed hits) translate from team to team and park to park, something that currently can only be done (poorly) with the decidedly blunt instrument of park factors. Hard-core analysts will love Hit Tracker’s ability to generate a probability density function for a season’s worth of hits, with predicted outcomes in any ballpark, starting with only a limited sampling of a rookie’s batted balls.
Hit Tracker will also provide the most detailed data yet for analysis of defense by liberating analysts from the limitations of batted ball “categories” such as liners, fliners and fly balls, or “hard,” “soft” or “medium” grounders. So there is plenty of potential here to provide advanced data on a subscription or custom report basis while still making much of the data available for free.
JB: Will it be possible to fully automate Hit Tracker at some point to reduce the need for the phenomenal number of man hours you put in?
GR: As for automating the observations in Hit Tracker, I am not optimistic that it will be possible to do this with an acceptable degree of accuracy any time soon. It would take a lot of cameras and equipment, an extremely capable intelligent video application (in order to be able to capture 100% of all batted balls), and enough money to put systems in 30 MLB parks (to provide full context for the data). In the short term it will remain more economical to observe the existing video streams, or to locate observers in all the parks.
That’s it folks. I just want to say a big thank you to Greg for spending the time talking to me. Be sure to check Hit Tracker regularly throughout the year—it is an invaluable resource for all baseball fans.
References & Resources
A big thanks to Greg for doing this interview and also for providing a legendary source for tracking home run distances. Fantastic stuff.