Introducing the IPORT
by Matthew Carruth
February 19, 2007
Individual Pitch and Outcome Result Table (IPORT)
Almost all research begins with a question in mind. The IPORT is no exception, and for that I have Joel Pineiro and Jeff Sullivan of Lookout Landing to thank. It was Sullivan who pointed out how few swinging strikes Pineiro garnered in a typical 2006 outing. It was apparent that Pineiro was mediocre in this regard, but how far below average? What was average? We had no context to compare his numbers against. That need prompted the research behind these reports.
Rather than just sticking this at the end of the article, I want to take the time now up front to thank Retrosheet for their terrific work in making this data publicly available.
Now, with that out of the way, on to the table itself. For formatting purposes, the table will be presented in three parts, as shown in the following examples:
Part 1: General Info

Last     First  Throws  Year  Cat1  Cat2  Pitches  BF
Cornejo  Nate   R       NA    NA    NA    5132     1420

Part 2: Individual Pitch

Ball    Foul    Swinging  Called  LD%    GB%     FB%    PopUp
40.08%  14.36%  4.48%     17.95%  2.88%  11.30%  5.36%  1.13%

Part 3: Outcome Result

K%     BB%    IBB%   HR%    HBP%   BABIP
7.18%  6.76%  0.92%  2.68%  0.63%  0.3182

Table Key:

Last – Pitcher’s last name
First – Pitcher’s first name
Throws – What hand the pitcher throws with
Year – Which season the data was drawn from
Cat1 – Optional category for further sorting
Cat2 – Optional category for further sorting
Pitches – Number of pitches thrown in dataset
BF – Number of batters faced in dataset
Ball – Percentage of pitches thrown in dataset resulting in a ball
Foul – Percentage of pitches thrown in dataset resulting in a foul ball
Swinging – Percentage of pitches thrown in dataset that were swung at and missed
Called – Percentage of pitches thrown in dataset not swung at and called a strike
LD% – Percentage of pitches thrown in dataset resulting in a ball in play of type line drive
GB% – Percentage of pitches thrown in dataset resulting in a ball in play of type ground ball
FB% – Percentage of pitches thrown in dataset resulting in a ball in play of type fly ball
PopUp – Percentage of pitches thrown in dataset resulting in a ball in play of type pop up
K% – Percentage of plate appearances in data resulting in a strikeout
BB% – Percentage of plate appearances in data resulting in a walk
IBB% – Percentage of plate appearances in data resulting in an intentional walk
HR% – Percentage of plate appearances in data resulting in a home run
HBP% – Percentage of plate appearances in data resulting in a hit-by-pitch
BABIP – Batting average on all balls hit into play
Part 1 gives us the basic information surrounding the dataset that we are looking at. In the case above we are looking at Nate Cornejo's numbers aggregated over his entire career, hence the NA in the year column. NA in both Cat columns states that we are doing no further parsing of the data. In the future, we will look at splits in the data and these columns will become relevant.
Part 2 breaks down every pitch thrown into one of eight categories: ball, foul, swinging strike, called strike, ground ball, fly ball, line drive or pop up. Every pitch can be categorized into one of those eight bins. Not receiving their own bin are events like hit-by-pitch, wild pitch, passed ball, etc. In those cases, they are simply marked as balls, since without the hitter getting in the way, that is certainly how they would have ended up.
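For readers who want to see the binning spelled out, here is a minimal Python sketch of the idea. It is not the code used to build the IPORT. The pitch labels, the function names and the ten-pitch sample are all made up for illustration; the only rules taken from the text are the eight bins and the folding of hit-by-pitch, wild pitch and passed ball into the ball bin.

```python
from collections import Counter

# The eight bins named in the article.
BINS = ("ball", "foul", "swinging", "called",
        "line_drive", "ground_ball", "fly_ball", "pop_up")

# Hypothetical event labels that get counted as balls, per the article.
COUNTED_AS_BALL = {"hit_by_pitch", "wild_pitch", "passed_ball"}


def bin_pitch(label):
    """Map one pitch label to one of the eight bins."""
    if label in COUNTED_AS_BALL:
        return "ball"
    if label in BINS:
        return label
    raise ValueError(f"unrecognized pitch label: {label!r}")


def pitch_rates(labels):
    """Return each bin's share of all pitches, as a percentage."""
    counts = Counter(bin_pitch(label) for label in labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pitches to summarize")
    return {b: 100.0 * counts[b] / total for b in BINS}


# Made-up ten-pitch sample, for illustration only.
sample = ["ball", "called", "foul", "swinging", "ball",
          "hit_by_pitch", "ground_ball", "fly_ball", "ball", "wild_pitch"]
rates = pitch_rates(sample)
```

Because every pitch lands in exactly one bin, the eight percentages returned for any sample add up to 100.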
It is important that we cover all possible outcomes, so that not only can we examine relative rates like, say, the ground ball to fly ball ratio, but we can also get an idea of how often a ground ball occurs on a pitch compared to, say, a foul ball. By adding up the final four bins (line drives, ground balls, fly balls and pop ups), we can arrive at how often a pitch is put into the field of play. This individual pitch breakdown also gives us insight into how each pitcher is generating his strikes, and it gives us a direct percentage on ground balls, not just in comparison to fly balls, but to line drives and pop ups as well.
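As a worked example of those two calculations, the short Python sketch below uses Cornejo's Part 2 row from the example table. The dictionary layout and function names are my own choices for illustration, not part of the IPORT itself.

```python
# Cornejo's Part 2 row from the example table, in percent of pitches.
cornejo_pitch = {
    "Ball": 40.08, "Foul": 14.36, "Swinging": 4.48, "Called": 17.95,
    "LD%": 2.88, "GB%": 11.30, "FB%": 5.36, "PopUp": 1.13,
}

# The four bins that represent a ball put into the field of play.
IN_PLAY_BINS = ("LD%", "GB%", "FB%", "PopUp")


def in_play_rate(row):
    """Percentage of pitches put into play: the sum of the four in-play bins."""
    return sum(row[b] for b in IN_PLAY_BINS)


def gb_fb_ratio(row):
    """Ground balls per fly ball, from the two per-pitch percentages."""
    return row["GB%"] / row["FB%"]


in_play = in_play_rate(cornejo_pitch)
ratio = gb_fb_ratio(cornejo_pitch)
print(f"in play: {in_play:.2f}% of pitches")  # in play: 20.67% of pitches
print(f"GB/FB: {ratio:.2f}")                  # GB/FB: 2.11
```

So roughly one Cornejo pitch in five was put into play, and he generated a little more than two ground balls for every fly ball.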
For instance, Cornejo is not just a random selection to display. He happens to be the pitcher with the lowest percentage of swinging strikes in all of baseball (minimum 1,000 batters faced) from 1988 through 2006 (excluding 1999). Brad Lidge had the most, and it was not particularly close. Lidge was nearly two full points above second-place Rob Dibble, who himself was a half point above the next best, Robb Nen. The best starting pitcher at generating swings and misses was Nolan Ryan followed closely by Pedro Martinez. Hmm ... there might be something to this swinging strike theory.
Want to know which pitcher generated a ground ball more often than anyone else? Chien-Ming Wang (surprise!). Who is the stingiest pitcher at allowing fly balls (remember, ground balls and fly balls are not either-or here)? Chad Bradford, another shocker. Lowest BABIP? Troy Percival. Highest BABIP? Well, that list is stocked with pitchers who never got beyond small samples. Glendon Rusch has the highest among pitchers who faced more than 2,000 batters at .324. Stingiest with the walk? The amazing Dennis Eckersley, at just 2.46%. Steve Searcy has the most batters faced (864) without hitting a single one. The supply of interesting rankings is vast.
Part 3 looks at what happened on a plate appearance level as opposed to the individual pitch level of Part 2. Explicit rates are given for the five possible defense-independent outcomes (strikeout, walk, intentional walk, hit-by-pitch and home run allowed), and the pitcher's BABIP is included to give an idea of what is happening on all other outcomes. By adding up the five defense-independent rates and subtracting the total from 100%, we arrive at a figure detailing just how much each pitcher relies on his defense to generate outs, ranging from Jeff Ballard, who had 84.29% of all plate appearances end with a ball in play, to Lidge, who had just 55.69% of plate appearances end with a ball in play.
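The ball-in-play share is simply what is left over once the five defense-independent rates are accounted for. Here is a small Python sketch using Cornejo's Part 3 row from the example table. It assumes the BB% column excludes intentional walks, so that the five rates do not overlap; the function names are mine, for illustration only.

```python
# Cornejo's Part 3 row from the example table, in percent of plate appearances.
# Assumption: BB% excludes intentional walks, so the five rates do not overlap.
cornejo_outcomes = {
    "K%": 7.18, "BB%": 6.76, "IBB%": 0.92, "HR%": 2.68, "HBP%": 0.63,
}


def defense_independent_rate(row):
    """Percentage of plate appearances ending in one of the five outcomes."""
    return sum(row[k] for k in ("K%", "BB%", "IBB%", "HR%", "HBP%"))


def ball_in_play_rate(row):
    """Percentage of plate appearances ending with a ball in play."""
    return 100.0 - defense_independent_rate(row)


dips = defense_independent_rate(cornejo_outcomes)
bip = ball_in_play_rate(cornejo_outcomes)
print(f"defense-independent: {dips:.2f}%")  # defense-independent: 18.17%
print(f"ball in play: {bip:.2f}%")          # ball in play: 81.83%
```

At 81.83%, Cornejo sits toward the Ballard end of the range quoted above.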
The chief benefit of this part of the IPORT is that it presents a far better picture of what the pitcher is doing than just looking at strikeout, walk, and home run rates. For one thing, said rates are expressed as a percentage of batters faced rather than a number per nine innings thrown, which gives a more intuitive understanding of how often a pitcher strikes out or walks a batter. Secondly, we separate intentional from non-intentional walks and include hit-by-pitch as well, covering all our bases, so to speak, when it comes to defense-independent statistics. Combined with BABIP, you can get a pretty good idea of what the pitcher’s total outcomes were.
References and Resources
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".
Matthew Carruth is an editor for The Hardball Times. He welcomes any and all sorts of communication at his email.