May 26, 2012
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License. Part of the USA Today Sports Media Group
Wednesday, March 28, 2012

Things are trending up

About a year ago, sometime during John Lackey’s precipitous decline into worthlessness, I started paying attention to an odd phenomenon in which he would lose velocity during games:

![]()

Here, we can see a clear downward slope in fastball speed over the course of the game. It’s as if there’s some direct, measurable relationship between how many pitches he’s thrown and what the speed of the next pitch will be. The same is probably true of the breaking pitch, but it’s more difficult to see because the pitch mix (and speeds) are more variable.

Of course, people who look at PITCHf/x data will say that this is nothing particularly new (lots of pitchers do this), and they’d be right. But I wasn’t aware of any study describing this phenomenon in any detail. So I wrote a quick script that went through my database and found pitchers who had thrown at least 30 fastballs (sinkers and cutters included) in a game and had made 10 or more starts in either 2010 or 2011. It then fit a simple line to fastball speed by number of fastballs thrown to output a “fastball slope.”

It turns out John Lackey isn’t even the worst offender. For example, here’s a “-.10” game by Jonathan Sanchez (who is the weakest link):

![]()

The “-.10” here means that for every 10 fastballs, he loses a mph on his fastball over the course of the game. “-.10” isn’t as bad as it gets, though; Jonathan Sanchez is just the worst on average. Here’s a staggering “-.19” game:

![]()

Tommy Hunter starts the day blowin’ em away at nearly 96... and then barely tops 90 by the end!

Of course, the article is titled “Things are Trending Up.” How about games in which the opposite is true? For that, we need to turn to a subset of pitchers who actually gain speed over the course of their outing. Like Justin Verlander, who does this with some regularity.
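For readers who want to try this on their own data, here is a minimal sketch of the fastball slope calculation described earlier in this post. It is not the original script: the function name `fastball_slope`, the `min_fastballs` parameter, and the synthetic speeds are all assumptions made for illustration. It fits an ordinary least-squares line to speed against fastball count, so a negative slope means the pitcher is losing velocity and a positive slope means he is gaining it.

```python
def fastball_slope(speeds_mph, min_fastballs=30):
    """Least-squares slope of fastball speed (mph) against fastball number.

    speeds_mph: fastball speeds in the order they were thrown in one game.
    Returns None when the game has fewer than min_fastballs fastballs
    (the 30-fastball cutoff mirrors the one described in the post).
    """
    n = len(speeds_mph)
    if n < min_fastballs:
        return None
    xs = range(1, n + 1)                      # 1st fastball, 2nd fastball, ...
    mean_x = sum(xs) / n
    mean_y = sum(speeds_mph) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, speeds_mph))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx                          # mph gained or lost per fastball


# Synthetic (made-up) game: starts at 94 mph and loses 0.10 mph per fastball.
synthetic_game = [94.0 - 0.10 * i for i in range(40)]
print(round(fastball_slope(synthetic_game), 2))
```

A per-pitcher score would then be the average of these per-game slopes, which is how the year-to-year comparison in this post is described.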
Here’s a “.14” game from Verlander last year:

![]()

Sure, he’s got one mighty fastball in the early innings, but for the most part he really gets cooking later in the game, topping 100 several times.

You might ask yourself whether this is a trait or just an oddity of a few strange games. So I took those slopes, averaged them for each pitcher, and then split them by year. Here’s what I found:

![]()

Each dot on this graph is a different pitcher over these two years. There’s a clear relationship between 2010 and 2011, suggesting that how a pitcher’s velocity changes over the course of a game is a stable trait, likely a result of mechanics or physical attributes intrinsic to each athlete.

Friday, March 23, 2012

Similarity Scores: a very beta feature

Last night during the heart-stopping Syracuse-Wisconsin game, Harry and I were talking (okay, Harry was mostly watching his team win by the skin of its teeth) about ways to improve the Brooks Baseball player card system. We exchanged some data and are presenting the first of our "data-driven" search tools: pitcher similarity.

This feature is incredibly beta and likely to change over the next few weeks, but right now when you search for a player (let’s pick Josh Beckett), you will get a table listing other players in the "Josh Beckett Family," along with the "distance" to each player. The scores are generated by comparing a vector of pitch speed, frequency, release, spin angle, and spin rate using MATLAB’s knnsearch algorithm to identify neighbors. Currently, we’re presenting the top five neighbors for each pitcher.

These are not perfect right now. We haven’t weighted the scores yet (that’s another conversation over basketball), so while we do a good job representing pitch mix and style, we’re not doing a very good job capturing pitch speed yet. There are also a few pitchers with hardly any comparables. Matt (@HouseOfTheBB) noted that neither Roy Halladay nor Mariano Rivera has a comparable pitcher at all!
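To make the "distance" idea concrete, here is a rough Python sketch of an unweighted nearest-neighbor lookup. The site itself uses MATLAB’s knnsearch; this is only an illustration of the same general idea, and the function name `nearest_neighbors`, the pitcher labels, and every number in the feature vectors are made up. It assumes plain Euclidean distance on the raw feature values, which is one reason an unweighted version can under-represent a feature like pitch speed.

```python
import math


def nearest_neighbors(target, pitchers, k=5):
    """Return the k pitchers closest to `target` as (name, distance) pairs.

    pitchers: dict mapping a pitcher label to a feature vector, e.g.
    (speed, frequency, release, spin angle, spin rate). Unweighted
    Euclidean distance is an assumption made for this sketch.
    """
    target_vec = pitchers[target]
    distances = [
        (math.dist(target_vec, vec), name)
        for name, vec in pitchers.items()
        if name != target                     # a pitcher is not his own neighbor
    ]
    distances.sort()                          # smallest distance first
    return [(name, dist) for dist, name in distances[:k]]


# Hypothetical, made-up feature vectors (not real PITCHf/x values).
example_pitchers = {
    "Pitcher A": (93.0, 0.60, 6.1, 200.0, 2200.0),
    "Pitcher B": (92.5, 0.58, 6.0, 205.0, 2180.0),
    "Pitcher C": (88.0, 0.40, 5.5, 150.0, 1900.0),
    "Pitcher D": (95.0, 0.70, 6.3, 210.0, 2400.0),
}
for name, dist in nearest_neighbors("Pitcher A", example_pitchers, k=2):
    print(name, round(dist, 1))
```

A pitcher with an unusual profile simply ends up far from everyone, which would show up as large distances for all of his listed neighbors.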
Punch in a few pitchers, and let us know how our system is doing. Let us know over Twitter if we’ve really missed on someone. I'm @brooksbaseball, and Harry is @harrypav.

Friday, March 16, 2012

Danny Duffy PITCHf/x

Here's something new. Jeffrey Gross from THT Fantasy and I are exchanging "watch this guy" ideas: fantasy picks from one side and PITCHf/x based picks on the other. First up is Jeff's first breakout candidate for a cheap but valuable pitcher, Kansas City's Danny Duffy.

The PITCHf/x data we're discussing can be seen on Duffy's player card at Brooks Baseball. The data is from MLB Advanced Media ('BAM) but the pitch classifications are our own.

Duffy pitched in the majors from May to September in 2011, ending his season after one start in that final month. It was his first year in the show, although there are some PITCHf/x data points from the 2009 Futures Games, the Arizona Fall League in 2010 and some Cactus League action from 2011.

Duffy's primary pitch is his four-seam fastball, followed by a curveball, change-up and the occasional two-seam fastball and slider.

Horizontal spin movement and vertical spin movement+gravity, shown from catcher's perspective

The black blobs at the top are four-seam fastballs, which have the straightest flight; their backspin keeps gravity at bay fairly well. He's got a good hopping fastball that generates pop-ups and fly balls. To the right of the fastballs, running down and away from right-handed batters and rarely thrown to left-handed batters, are his two-seam fastballs, or sinkers if you prefer. The blue dot change-ups have a similar tail to the sinkers, but their reduced speed gives them separation vertically. The yellow curveballs are a group of over-the-toppers with a red cherry of the occasional slider on top.

Fastball (FA): 94 mph, 62% usage
Curveball (CU): 76 mph, 19% usage
Change-up (CH): 84 mph, 14% usage
Sinker (SI): 93 mph, 3% usage
Slider (SL): 81 mph, 2% usage

Duffy's fastball is above average in whiff rate, pop-up rate and flyball rate compared to other fastballs. His change-up is mostly average, and his curveball doesn't miss many bats but yields plenty of ground balls. At least in 2011, that is.

He's yet to develop a swing-and-miss secondary pitch, and that's going to hold him back. You can't pitch on fastballs alone in a big league rotation. Or not for long. If he's going to rely on his impressive power, he would benefit in the long run by further developing his two-seamer, which was ineffective in yielding worm killers in its limited use. With his high arm slot, a true sinker isn't likely to emerge. He may get more ground balls out of a cut fastball if he were to develop one.

Some trends to note before we hand the baton to Jeff for his Fantasy perspectives ... caveats are the small sample size and the PITCHf/x data from Kansas City tending to yield higher pitch speeds than most other parks.
Jeff's thought was 2012 could be a breakout year for Duffy. I'd watch him with an eye on 2013.

Thursday, March 08, 2012

Darvish will make pitch classification fun

Your definition of fun may vary. But Yu Darvish and his eight-pitch mix are going to make life interesting for catchers, hitters and even PITCHf/x analysts.

Here's a picture from his Cactus League debut. The pitch in red was a strike three splitter to end an inning. The axes show movement during the flight to home plate from the catcher's perspective.

There's an 86 mph cutter that's got less drop and more hook than the other, faster, cutters. What's up with that? Looks like two change-ups on the left side of the change-up and splitter group. And you can clearly see the slow curveball.

This was a situation where game video and post-game interviews helped out. Dan Brooks looked at my original rough classifications and suggested some improvements. You can see those on Darvish's player card. The charts include his World Baseball Classic appearances, but you can filter the tables by year. We plan on providing yearly movement graphs in our next update to the site.

Z-Scores and Pitch IQ Scores

I thought it important to describe a new feature we've added to the PitchFX Player Cards over the last month or so. I’ve previously tweeted (@Brooksbaseball) about these features but haven’t described them in detail.

When the cards first debuted, we were asked by a number of people to provide average data for comparison, especially for the "Sabermetric Outcomes" table. For example, if Clay Buchholz got 45.96 percent whiffs/swing on his change-up, people wanted to know how good that was relative to other pitchers, and so they wanted some average number of swings and misses. They had a feeling it was good, but they wanted to know just how good.
What people don’t realize is that they really don’t want the average. While it is useful in some contexts to know simply an average, it isn’t nearly as useful as knowing something about the distribution of scores. For example, if I told you that on some made-up metric Buchholz was a 7, and that the average was a 5, you’d know that Buchholz was above average but you wouldn’t know by how much. Maybe on this metric most good players score between 5 and 6, and so 7 is really outstanding. Maybe on this metric most good players score between 5 and 25, and so 7 is really not very exceptional.

So you can see, it would be nicer if I told you instead how far Buchholz was from the mean score, expressed in terms of the distribution of scores. For that, we can use a Z-score. The Z-Score is a simple concept in statistics: it tells you how many standard deviations a score is from the mean score. So, if you now scroll down to the “Sabermetric Outcomes” table on Buchholz’s player card, you can change “Percentages” to Z-Scores. This will contextualize the percentages that you see on the table (all of them, not just whiffs) so that you can better understand just how good the pitches are that you’re looking at.

![]()

It’s also important to think about which distribution is appropriate for comparison in this case. For example, we could compare Buchholz’s change-up to all other pitches, or, perhaps more appropriately in this context, compare the change-up to all other change-ups. We’ve chosen “all other change-ups” as our distribution. When you change the months or years on the player cards, it will change the numbers used for Buchholz’s data but won’t change the numbers used for calculating the Z-scores, because we didn’t want to get too fine with the comparisons that we made.
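Here is a small Python sketch of the Z-score idea from the made-up "7 versus an average of 5" example. The function name `z_score` and both toy distributions are assumptions for illustration, and the sketch uses the population standard deviation, which may or may not match what the player cards use.

```python
import statistics


def z_score(value, distribution):
    """Number of standard deviations `value` sits from the distribution's mean."""
    mean = statistics.mean(distribution)
    sd = statistics.pstdev(distribution)      # population SD (an assumption)
    if sd == 0:
        raise ValueError("distribution has no spread; Z-score is undefined")
    return (value - mean) / sd


# Two made-up distributions, both with a mean of 5.
tight_scores = [4.0, 4.5, 5.0, 5.5, 6.0]      # scores bunched close to 5
wide_scores = [-15.0, -5.0, 5.0, 15.0, 25.0]  # scores spread far from 5

print(round(z_score(7, tight_scores), 2))     # large Z: 7 is well above the pack
print(round(z_score(7, wide_scores), 2))      # small Z: 7 is barely above average
```

The same score of 7 produces very different Z-scores, which is exactly the information a bare average leaves out.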
There’s also a problem in this dataset that arises when pitchers throw a very small number of pitches, because this makes their whiff numbers artificially high or low (luck plays a larger role). So, we’ve left pitches (e.g., Player X’s change-up) out of the distribution that didn’t get thrown at least 100 times. You can still look at those pitches as a function of Z-score, but they won’t be particularly meaningful. Even with these omissions, we’ve still got a very large sample to work with for each pitch in this case (except Knuckleball, which is a special case on its own).

You can also change the scores into “Pitch IQ Scores.” You can think of the “Pitch IQ Scores” exactly like you would think of Z scores, except that some people don’t like Z scores because they require explaining basic statistics to people, and IQ scores are a sort of intuitive thing that we use in everyday life. The formula here is simply 100+15*Z (just like it is for IQ). We also often use the 100+/- system in baseball, for describing things like ERA+ or OPS+, so describing a pitch as having a 124 Whiff/Swing (pretty damn good) or a 64 Whiff/Swing (worse than useless) seems like something that could catch on, and might be easier for your readers to grok than numbers like “1.6” and “-2.4”.

![]()

Have fun, tab through, figure out which representation of the data you like best. I hope you enjoy the new features and we look forward to hearing additional feedback as the season begins.
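As a quick check on the arithmetic, here is a short Python sketch of the Pitch IQ conversion and the 100-pitch minimum described in this post. The function names `pitch_iq` and `qualifying_pitches`, the pitch labels, and the sample numbers are hypothetical; only the 100+15*Z formula and the 100-pitch cutoff come from the text.

```python
def pitch_iq(z):
    """Convert a Z-score to the IQ-style scale: 100 + 15 * Z."""
    return 100 + 15 * z


def qualifying_pitches(pitches, min_thrown=100):
    """Keep only pitches thrown at least `min_thrown` times.

    pitches: dict mapping a label to (whiff_rate, times_thrown).
    Only the survivors would feed the comparison distribution.
    """
    return {
        label: rate
        for label, (rate, thrown) in pitches.items()
        if thrown >= min_thrown
    }


print(round(pitch_iq(1.6)))    # 124
print(round(pitch_iq(-2.4)))   # 64

# Made-up sample: the second pitch falls below the 100-pitch minimum.
sample = {
    "Player X change-up": (0.30, 250),
    "Player Y change-up": (0.55, 12),
}
print(qualifying_pitches(sample))
```

The two printed scores line up with the “1.6” and “-2.4” Z-scores mentioned in the post, so either representation carries the same information.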