All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.
Wednesday, July 29, 2009

The control hitters have over everything

Posted by Derek Carty at 1:08am

A couple weeks ago, I wrote an article titled "The control hitters have over LD%," examining why it's a bad idea to use single-year line drive rates in any discussion of a hitter's underlying skills. Afterward, I received an e-mail from a reader who wanted me to go a step further:

Hi Derek,

With that, here we go...

The results

As I said last time, this is far, far from a comprehensive study. For comparative purposes, though, it can be quite useful. Anyway, I looked at all hitters from 2004 through 2008 who amassed at least 350 at-bats in adjacent seasons (and played on the same team both years, to eliminate some park-to-park biases). What you're seeing is the R-squared results for each stat, which essentially tells us how much of the variation in Year 2 can be explained by the Year 1 figure.

+---------------------------+------+
| STAT                      | R2   |
+---------------------------+------+
| Batting Average           | 0.18 |
| On-Base Percentage        | 0.36 |
| Slugging Percentage       | 0.37 |
| OPS                       | 0.35 |
| ISO Power                 | 0.52 |
| ISO Discipline            | 0.60 |
| Batting Average with RISP | 0.06 |
+---------------------------+------+
| Contact (K) Rate          | 0.76 |
| Walk Rate                 | 0.61 |
| HBP Rate                  | 0.37 |
| Pitches per PA            | 0.61 |
+---------------------------+------+
| BABIP                     | 0.15 |
| 1B per BIP                | 0.21 |
| 2B per BIP                | 0.16 |
| 3B per BIP                | 0.26 |
| AB/HR                     | 0.42 |
| HR/FB                     | 0.59 |
| GIDP Rate                 | 0.13 |
+---------------------------+------+
| LD%                       | 0.09 |
| GB%                       | 0.60 |
| OF FB%                    | 0.52 |
| IF FB%                    | 0.43 |
+---------------------------+------+
| SBO%                      | 0.33 |
| SBA%                      | 0.80 |
| SB%                       | 0.10 |
+---------------------------+------+

Quick takeaways

As we always stress here at THT Fantasy, stats like batting average and BABIP are poor indicators of a player's actual skill. It's much better to focus on component skills like contact rate, which is one of the most stable stats around.
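For readers who want to see how a year-to-year R-squared like the ones in the table could be computed, here is a minimal sketch in Python. It assumes the figure is the square of the ordinary Pearson correlation between each hitter's Year 1 and Year 2 value, which is the usual reading of "year-to-year R-squared" but is not spelled out in the article. The function name `r_squared` and the five contact rates are made up for illustration and are not the data behind the table.

```python
def r_squared(year1, year2):
    """Square of the Pearson correlation between paired Year 1 and Year 2 values."""
    n = len(year1)
    if n != len(year2) or n < 2:
        raise ValueError("need two equal-length samples with at least two players")
    mean1 = sum(year1) / n
    mean2 = sum(year2) / n
    # Covariance and variances (the 1/n factors cancel in the ratio below).
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(year1, year2))
    var1 = sum((a - mean1) ** 2 for a in year1)
    var2 = sum((b - mean2) ** 2 for b in year2)
    if var1 == 0 or var2 == 0:
        raise ValueError("a stat with no variation has no defined correlation")
    return cov * cov / (var1 * var2)


# Hypothetical contact rates for five made-up hitters (not real data).
contact_year1 = [0.78, 0.85, 0.72, 0.90, 0.81]
contact_year2 = [0.79, 0.84, 0.74, 0.88, 0.80]
print(round(r_squared(contact_year1, contact_year2), 2))
```

A value near 1 means hitters keep roughly the same ranking and spacing from one year to the next. A value near 0 means Year 1 says little about Year 2.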
Home runs are relatively stable, which might surprise some but really shouldn't. After all, Juan Pierre isn't going to start posting 30-home run seasons, nor is Ryan Howard going to hit only five home runs. As we saw last time, line drive rate is very unstable, while the other batted ball stats are much more stable. And for those who like to blame hitters for being "unclutch" with runners in scoring position (I hear far too much of this from fellow Mets fans), check out no. 7 on the list.

Quick glossary

EDIT: I'm adding this late per request. Sorry for some things being a little unclear to begin with.

ISO Power: SLG-AVG

ISO Discipline: OBP-AVG

Contact (K) Rate: Contact rate on a per AB basis (not a per pitch basis). Calculated as (AB-K)/AB.

HR/FB: Home runs per outfield fly ball

GIDP Rate: GIDP/BIP

LD%: Line drives as a percentage of all non-bunt balls in play

GB%: Groundballs as a percentage of all non-bunt balls in play

OF FB%: Outfield flies as a percentage of all non-bunt balls in play

IF FB%: Infield flies as a percentage of all non-bunt balls in play

SBO%: Stolen base opportunity rate. The percentage of times a hitter reaches first and thus is in position to attempt a steal. Calculated as (1B+BB+HBP-IBB)/TPA.

SBA%: Stolen base attempt rate. The percentage of times a hitter attempts a steal given that he is on first base. Calculated as (SB+CS)/(1B+BB+HBP-IBB).

SB%: Stolen base success rate. The percentage of times a hitter is successful on a steal attempt. Calculated as SB/(SB+CS).

Concluding thoughts

That's all for today. Any questions, feel free to comment or e-mail me!

Derek Carty is a 22-year-old fantasy baseball analyst residing in New Jersey. In addition to writing for THTF, his work has appeared at Rotoworld (NBC), Sports Illustrated, FOX Sports, and Heater Magazine. In his two years competing in expert leagues, he has won 2 titles with 4 top-three finishes, including a LABR NL title in 2009, making him the youngest person to ever win a major expert league title. Derek is a proud graduate of the MLB Scouting Bureau's Scout Development Program and is a firm believer in the importance of combining stats and scouting. He welcomes questions via e-mail.
The Real Neal said...
“What you’re seeing is the R-squared results for each stat, which essentially tells us how much of the variation in Year 2 can be explained by the Year 1 figure.” Huh? I am sure you’ve done some nice math here, but that sentence makes no sense. Let me give you a concrete example to illustrate. Year BA What you’re seeing is the R-squared results for each stat, which essentially tells us how much of the .024 can be explained by the .278. Posted 07/29 at 09:45 AM
Dave Studeman said...
I’m not sure what your example means, but the R squared measures how much of the variation among all players in Year 2 can be attributed to the variation among those same players in Year 1. Posted 07/29 at 09:54 AM
Seth said...
Brilliant idea for a piece. When doing research for my fantasy team next season, I will be sure to look up guys with high contact rates who have underachieved this season…could be another article even. Posted 07/29 at 10:57 AM
ThankYouMichaelLewis said...
I’m new to THT and it is fantastic, so bear with me if I can’t make as sophisticated inferences. If LD% is so unstable, yet it has one of the strongest correlations with batting average/offensive success (retrofitted), then is it the secret weapon in fantasy baseball drafting/projections? In other words, if we see a player far off the mean LD% of 19%, could that be used as a primary indication as to how the player will perform the following season? It’s almost as if it’s an anti-correlation in that it can be used to project performance in Year 2 if Year 1 is an outlier. Thanks in advance for any clarifications. Note: I’m not even a fantasy baseball player, but I figured it was an easy example of putting future projections in use. Posted 07/29 at 11:20 AM
Detroit Michael said...
“Pitchers per PA” is close to 1.0 for everyone in the league. I would guess that Batting Average with RISP appears to be more unstable from year to year than just Batting Average simply because the sample size, the number of PA we are using, for each season is smaller. Posted 07/29 at 11:27 AM
Derek Carty said...
Sorry for the confusion, Dave. I added a quick glossary. As to all the other studies, I’m sure there have been loads of them, so I knew I’d miss a whole bunch if I tried (if you have some links handy, though, I’d be happy to add them). This isn’t anything new, just a quick reference for the readers who were looking for one. The Real Neal, Thanks, Seth Posted 07/29 at 12:24 PM
David Rasmussen said...
On statistics that are more luck based than skill based (low R-sq), like BABIP or LD%, the way to use them predictively is as follows. Someone has a high BABIP? His batting average for the rest of the year will likely decrease. Likewise, if someone has a high LD%, assume their rate stats will go down. If you are interested in an individual player, compare BABIP and LD% to previous years to learn whether what they are doing may be sustainable. Example: Posted 07/29 at 12:24 PM
Derek Carty said...
ThankYouMichaelLewis, As to this specific question, David Rasmussen pretty much nailed it. LD% is a big driver of BABIP, but because it is so unstable, a LD% too far from league average is likely just good/bad luck itself. While it tells us *something* about the hitter, if we were to try to predict his LD%, we’d need to include a heavy proportion of league average, so a guy like Bartlett’s projected LD% going forward might only be 20-21% or so. We do need to note, though, that for pitchers, BABIP will generally regress to .300. For hitters, everyone regresses to their own unique number (not necessarily .300!), so things become a little trickier to analyze. This is a very important point to remember that many analysts still don’t understand. Posted 07/29 at 12:31 PM
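The "heavy proportion of league average" idea in the comment above can be sketched as a simple weighted blend. This is a minimal illustration, not the method Derek uses: the function name `regress_to_mean`, the `ballast` parameter (how many league-average balls in play to mix in), and every number in the example are assumptions chosen for illustration. Only the 19% league-average LD% comes from the thread.

```python
def regress_to_mean(observed, sample_size, league_avg, ballast):
    """Blend an observed rate with league average.

    ballast is a hypothetical tuning constant: the number of league-average
    trials mixed in with the player's own trials. The noisier the stat,
    the larger the ballast should be.
    """
    if sample_size < 0 or ballast < 0 or sample_size + ballast == 0:
        raise ValueError("sample_size and ballast must be non-negative and not both zero")
    return (observed * sample_size + league_avg * ballast) / (sample_size + ballast)


# Made-up hitter: 26% line drives on 300 balls in play, league average 19%,
# with an assumed ballast of 900 balls in play.
projected_ld = regress_to_mean(0.26, 300, 0.19, 900)
print(round(projected_ld, 4))
```

With a ballast three times the size of the sample, the projection lands much closer to league average than to the observed rate, which is the behavior described for an unstable stat like LD%.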
Derek Carty said...
Detroit Michael, You’re absolutely right on BA with RISP as well. If we’re looking at players with 350 ABs for the year, they might only have 150 ABs or so with RISP, so the number is much more unstable. If we were to look at all batters with exactly 350 regular ABs and all batters with exactly 350 ABs with RISP (given a large, fictional, perfectly-constructed-for-our-needs-data set), the correlations would probably be almost identical. Posted 07/29 at 12:34 PM
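The sample-size point above can be checked with a small simulation. This is a rough sketch under stated assumptions: every simulated hitter has a fixed "true" batting average drawn from a made-up talent distribution (mean .270, standard deviation .020), and each at-bat is an independent trial. The function names, the talent spread, the 2,000-player pool, and the seed are all hypothetical, and the numbers it prints are not meant to match the table.

```python
import random


def r_squared(xs, ys):
    """Square of the Pearson correlation between two paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


def simulated_r2(at_bats, players=2000, seed=0):
    """Year-to-year R-squared of batting average for simulated hitters."""
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(players):
        # Hypothetical talent spread: mean .270, standard deviation .020.
        talent = min(max(rng.gauss(0.270, 0.020), 0.0), 1.0)
        # Each at-bat is an independent hit/no-hit trial at that talent level.
        year1.append(sum(rng.random() < talent for _ in range(at_bats)) / at_bats)
        year2.append(sum(rng.random() < talent for _ in range(at_bats)) / at_bats)
    return r_squared(year1, year2)


print("350 AB per season:", round(simulated_r2(350), 2))
print("150 AB per season:", round(simulated_r2(150), 2))
```

The underlying talent is identical in both runs, so any drop in R-squared at 150 at-bats comes purely from the smaller sample, which is the argument made about batting average with RISP.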
Jonathan said...
Derek - Posted 07/29 at 12:43 PM
Dave Studeman said...
I guess I’d make a few points here. One is that there are many ways to calculate something like this, as Jonathan pointed out. In the 2007 THT Annual (which you can read for free at Wowio), David Gassko used a binomial correlation in addition to the year-to-year correlation and found a higher figure (.32 vs .13 for line drives, for instance, which is what he and JC got from year-to-year correlation). Over a career, or a “significant” amount of time, you will find differences between batters. Freddy Sanchez is a line drive hitter. Jason Giambi isn’t. That’s obvious, but it’s worth repeating I think. Lastly, remember this analysis (and virtually all analyses like it) has been conducted for established major league players by necessity. They’re the ones we have the data for. If you were to expand the sample to include minor leaguers, or players with cups of coffee, you’d find that line drive hitting (and virtually all the other measures) is more predictable than these results indicate. Posted 07/29 at 12:55 PM
Derek Carty said...
Yeah, Jonathan, as I said, there are much better ways to do this sort of thing. This is far, far from perfect or comprehensive or flawless. All this is is a simple reference guide for those who haven’t seen anything like this yet. There are definitely flaws, but I wasn’t looking to be super precise. For comparative purposes, all I’m trying to do here is say “BA is unstable, contact rate is stable. BABIP is unstable, HRs are somewhat stable. LD% is unstable, GB% is stable. etc, etc.” The results this produces are roughly in line with what we get from a more complex study, which suits what I was going for. Posted 07/29 at 12:56 PM
ThankYouMichaelLewis said...
Thank you Dave and Derek. When defining an offensive player’s lucky season, am I correct in assuming that LD% is the single biggest determinant, since it is what causes an abnormally inflated/deflated BABIP? Also, what about defensive luck for positional players? I ask this because I still have a hard time with UZR due to its annual fluctuation. Could a pitcher’s unusually high LD% or BABIP cause a fielder to have a significantly lower UZR? I think I’m mostly hung up on UZR because a guy like Teixeira grades negatively, yet I see him make game-saving plays every single night (but that’s for another article). Posted 07/29 at 12:57 PM
Colin Wyers said...
It should be remembered that all of these correlations are artificially high due to the 350 AB cutoffs used - that substantially reduces the variance and therefore increases the correlation. This is why a weighted correlation is preferable. Posted 07/29 at 01:27 PM
Jonathan said...
Gotcha on keeping things simple. I’d probably just report the coefficient on the lagged variable. Under the same assumptions you’re using, it would be just as informative. Under less restrictive assumptions, it would be more informative. Of course, your articles are in any case also extremely informative. Posted 07/29 at 01:47 PM
Are these stats defined anywhere? For instance, is OF FB% a percentage of all balls hit that are outfield flies, or a percentage of fly balls that are outfield flies? And what is SBO% and the other SB stats?
Last point: it would be nice to see references and comparisons with the many other studies of this that have been done in the past.