All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.
Wednesday, October 10, 2007
Examining the components of batting average and BABIP
Posted by Derek Carty at 3:00pm

I know we've talked about batting average before, but I thought that since we're getting into the off-season now, it would be a good idea to go over some of the concepts behind predicting it and even introduce a few new things concerning BABIP. There are three primary components to batting average: contact rate, home run rate, and BABIP.
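As a rough illustration of how contact rate, home run rate, and BABIP combine into a batting average, here is a minimal Python sketch. It assumes every at-bat ends in a strikeout, a home run, or a ball in play (sacrifice flies are ignored and bunts are lumped in with balls in play), which is a simplification. The function names `batting_average` and `babip_needed` are hypothetical, chosen just for illustration.

```python
def batting_average(contact_rate, babip, hr_per_ab=0.0):
    """Estimate batting average from its three main components.

    contact_rate: share of at-bats that do not end in a strikeout
    babip:        hit rate on balls in play (home runs excluded)
    hr_per_ab:    home runs per at-bat (the inverse of AB/HR)

    Assumption: at-bats = strikeouts + home runs + balls in play.
    """
    balls_in_play_per_ab = contact_rate - hr_per_ab
    return hr_per_ab + babip * balls_in_play_per_ab


def babip_needed(target_avg, contact_rate, hr_per_ab=0.0):
    """Solve the same identity for the BABIP required to hit target_avg."""
    return (target_avg - hr_per_ab) / (contact_rate - hr_per_ab)
```

Under these assumptions, a hitter with a 95% contact rate, a .316 BABIP, and no home runs comes out at roughly .300 via `batting_average(0.95, 0.316)`, and `babip_needed(0.300, 0.85)` gives the BABIP an 85% contact hitter would need to match him.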
You could also consider smaller factors like batting average on bunts, but these are the three biggies.

The benchmark for what most people consider a good contact hitter is a .300 batting average (although the number needed to add value to a fantasy team is lower). If you have a .300 hitter on your fantasy team, he is going to provide you some excellent value in that category. The problem is that hitting .300, consistently, is no easy task. In order to truly be a .300 hitter, you either need a solid set of skills in each of the three above categories or amazing skills in at least one of them.

For example, a guy with a 95% contact rate and a .316 BABIP would post a .300 batting average without hitting a single home run. But if you drop that contact rate to even 85% (which is still above average), the batting average drops to .269. The BABIP would have to increase to .353 (which is very difficult to do, but again would demonstrate the need to be very strong in at least one category and above average in the second) in order to get the batting average back to .300.

Contact rate and home runs

Contact rate and home runs are the most stable of the three batting average components. When we do a year-to-year correlation test for contact rate, we get a very strong .8305 correlation coefficient. This basically means that you can predict next year's contact rate very well simply by using this year's contact rate. When we do this for home runs (AB/HR to normalize), our correlation coefficient is lower, but still decent, at .6245. Keep in mind that I'm still working on a system for projecting home runs (with infinite thanks to Greg Rybarczyk of HitTracker for his help) that I'm hoping I'll be able to introduce within a week.

Please note the following criteria used for the two correlations above: 2004-2007 numbers were used. Players who changed teams mid-year in either Year 1 or Year 2 were excluded.
Also, players needed to have at least 250 plate appearances in both Year 1 and Year 2.

BABIP tests

That leaves us with BABIP, a critical number for every player in baseball but one that is highly variable and very difficult to predict. Still, because of how important it is, we need to try to do just that. We'll use some simple correlations to find which stats are best able to predict BABIP.

First, a quick note. For each of these correlations (with the exception of #1), I'm not using straight BABIP. I'm excluding bunts and only including the four outcomes (outfield fly, infield fly, grounder, liner) that can occur when a player is swinging to get a hit, to show a clearer picture of a hitter's ability. We'll call this BABIP2 for the sake of easy reference. When we eventually compile our projected batting average, we'll also include bunts separately from BABIP.

1) BABIP correlation from year-to-year

Let's see exactly how well BABIP can predict itself.

Correlation Coefficient: .3066
Criteria: 2004-2007 numbers were used. Players who changed teams mid-year in either Year 1 or Year 2 were excluded. Also, players needed to have at least 250 plate appearances in both Year 1 and Year 2.

Not terrible (considering some of the results we get later), but not very good either considering what we got for contact rate and home runs. This confirms what I said earlier about BABIP being very variable. There is a positive correlation between the two, but it isn't especially strong. Let's see if we can find something better.

2) Walk rate correlation with BABIP2

The logic behind this is that walk rate shows patience and selectivity. Those who wait for good pitches, theoretically, will be more likely to convert the ones they do swing at into hits. Of course, this doesn't take actual hitting ability into consideration.

Correlation Coefficient: -.2926
Criteria: 2004-2007 numbers were used. Players needed to have at least 250 plate appearances to be eligible.

Wow.
Not at all what I was expecting. There's actually a not-all-that-weak negative correlation, meaning the more walks, the lower the BABIP. Very surprising. I'd have to think it is because, as I said before, actual hitting ability isn't considered.

3) (Called Strikes + Balls)/(Total Pitches) with BABIP2

Maybe walk rate isn't the best measure of selectivity, so we'll dig a little deeper into the number and use the actual pitch data (a big thank you to Retrosheet for this data). Let's see if the results are any different than they were for walk rate.

Correlation Coefficient: .0266
Criteria: 2004-2006 numbers were used (Retrosheet doesn't have 2007 numbers up yet). Players needed to have at least 250 plate appearances to be eligible.

Well, at least we're in positive territory. The correlation is, for all intents and purposes, non-existent, though.

4) Walks/Strikeouts (BB/K) correlation with BABIP2

I've often heard that walks divided by strikeouts is a good measure of a batter's discipline, or his eye, or his command of the strike zone. Let's see if this has any relationship with BABIP.

Correlation Coefficient: -.0196
Criteria: 2004-2007 numbers were used. Players needed to have at least 250 plate appearances to be eligible.

Nope. It just doesn't seem like these types of numbers tell us much about BABIP. I definitely think they are useful for different purposes, but for today, they haven't been much help. Let's check out our batted ball data and see if we can do better.

5) Line drive rate correlation with BABIP2

I would expect this one to be much better than walk rate proved to be. Line drives fall for hits, on average, around 71% of the time. Logically, those who hit a lot of them should have higher BABIPs. Let's see if that's the case.

Correlation Coefficient: .4169
Criteria: 2004-2007 numbers were used. Players needed to have at least 250 plate appearances to be eligible.
Not fantastic, but considering that we're working with BABIP, I definitely think that it is significant. Line drives seem like a good measure to use for projecting BABIP. I just found an interesting post from 2005 by Dave Studeman, in which he produces a general formula for predicting BABIP: LD% + .120.

6) Outfield fly ball BABIP correlation with BABIP2

As David Gassko surmised in his article from a couple of weeks ago, since fly balls have one very stable event (home runs) and lots of easily fielded balls (lazy flies), the guys who have high hit rates on fly balls are probably hitting the ball harder than other players. Let's test this theory on BABIP.

Correlation Coefficient: .3061
Criteria: 2004-2007 numbers were used. Players needed to have at least 250 plate appearances to be eligible.

Not quite as good as line drive percentage, but it's decent. Let's see how these two can predict themselves.

7) Outfield fly ball BABIP correlation from year-to-year

How consistent is outfield fly ball BABIP?

Correlation Coefficient: .1635
Criteria: 2004-2007 numbers were used. Players who changed teams mid-year in either Year 1 or Year 2 were excluded. Also, players needed to have at least 250 plate appearances in both Year 1 and Year 2.

As you see, while fly ball BABIP is a decent predictor of actual BABIP, it isn't very consistent from year to year.

8) Line drive rate correlation from year-to-year

How consistent is a player's line drive rate?

Correlation Coefficient: .2653
Criteria: 2004-2007 numbers were used. Players who changed teams mid-year in either Year 1 or Year 2 were excluded. Also, players needed to have at least 250 plate appearances in both Year 1 and Year 2.

Not very consistent from year-to-year, but it correlates better than outfield fly BABIP does and is better at predicting BABIP too. Still, it seems like it would be difficult to predict BABIP before the season begins using either of these two.
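The year-to-year tests all follow the same recipe: pair each player's stat in one season with the same stat the next season, drop anyone who changed teams mid-year or had fewer than 250 plate appearances in either season, and take the correlation of the pairs. Here is a minimal Python sketch of that recipe, not the exact code behind the numbers in this article. The record layout (`player`, `year`, `pa`, `changed_team`, `value`), the function names, and the `toy_seasons` rows are all hypothetical and made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(seasons, min_pa=250):
    """Pair each player's Year 1 value with his Year 2 value.

    A pair is dropped if the player changed teams mid-year in either
    season or had fewer than min_pa plate appearances in either season.
    """
    by_key = {(s["player"], s["year"]): s for s in seasons}
    pairs = []
    for (player, year), year1 in by_key.items():
        year2 = by_key.get((player, year + 1))
        if year2 is None:
            continue  # no following season on record
        if year1["changed_team"] or year2["changed_team"]:
            continue
        if year1["pa"] < min_pa or year2["pa"] < min_pa:
            continue
        pairs.append((year1["value"], year2["value"]))
    return pairs


def year_to_year_correlation(seasons, min_pa=250):
    """Correlation between Year 1 and Year 2 values across all valid pairs."""
    pairs = year_to_year_pairs(seasons, min_pa)
    return pearson([a for a, _ in pairs], [b for _, b in pairs])


# Made-up rows, only to show the expected shape of the input.
toy_seasons = [
    {"player": "A", "year": 2004, "pa": 600, "changed_team": False, "value": 0.300},
    {"player": "A", "year": 2005, "pa": 610, "changed_team": False, "value": 0.310},
    {"player": "A", "year": 2006, "pa": 200, "changed_team": False, "value": 0.250},
    {"player": "B", "year": 2004, "pa": 500, "changed_team": False, "value": 0.280},
    {"player": "B", "year": 2005, "pa": 520, "changed_team": True, "value": 0.330},
    {"player": "C", "year": 2005, "pa": 450, "changed_team": False, "value": 0.320},
    {"player": "C", "year": 2006, "pa": 480, "changed_team": False, "value": 0.330},
    {"player": "D", "year": 2006, "pa": 550, "changed_team": False, "value": 0.290},
    {"player": "D", "year": 2007, "pa": 530, "changed_team": False, "value": 0.300},
]
```

Calling `year_to_year_correlation(toy_seasons)` runs the whole recipe on the toy rows. With real data, `value` would hold whichever stat is being tested (BABIP, line drive rate, outfield fly ball BABIP), one record per player-season.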
9) 3 year, unweighted BABIP2 correlation with Year 4 BABIP2

Moving on from batted ball numbers, let's see if several years of a player's BABIP can predict the following year's BABIP with any certainty.

Correlation Coefficient: .5843
Criteria: 2004, 2005, and 2006 numbers were combined (but unweighted) to get a player's 3-year BABIP2. This was then compared with that player's 2007 BABIP2. Players needed to have at least 650 plate appearances between 2004 and 2006 and at least 250 plate appearances in 2007 to be eligible.

Our best result yet. It seems that, given enough at-bats, a player's true ability to convert balls in play into hits will begin to reveal itself. Keep in mind that I used an unweighted three-year figure and that there were far fewer records than in any of our other correlations (just 232 records). I don't have data from other years to work with, but right now this seems like our best bet for predicting BABIP.

10) 2 year, unweighted BABIP2 correlation with Year 3 BABIP2

Since some players haven't yet played 3 years in baseball, I wanted to know if a two-year BABIP would be better than line drive rate and outfield fly BABIP.

Correlation Coefficient: .5851
Criteria: 2004 and 2005, and 2005 and 2006 numbers were combined (but unweighted) to get a player's 2-year BABIP2. This was then compared with that player's 2006 and 2007 BABIP2, respectively. Players needed to have at least 450 plate appearances between the first two years and at least 250 plate appearances in the third year to be eligible.

Turns out, the correlation coefficient is actually a tiny bit better than the three-year figure. Keep in mind, though, that the three-year sample size was somewhat small (if you're curious, there were 487 records in the two-year set). It looks like it should be okay to evaluate guys who have played for two years using this.

11) 3 year, weighted BABIP2 correlation with Year 4 BABIP2

Let's see if the results get any better if we weight the numbers.
Correlation Coefficient: .5812
Criteria: 2004, 2005, and 2006 numbers were combined (and weighted) to get a player's 3-year BABIP2. This was then compared with that player's 2007 BABIP2. Players needed to have at least 650 plate appearances between 2004 and 2006 and at least 250 plate appearances in 2007 to be eligible.

Nearly identical results to the unweighted correlation. Very interesting stuff.

Closing thoughts

Reviewing, the three most important components of a player's batting average are contact rate, home run rate, and BABIP. Contact rate is the most stable, home run rate is second, and BABIP is quite unstable. I think we made some strides, though, in our attempt to find numbers that can predict it with some measure of accuracy. Our best results came from weighted and unweighted multi-year BABIPs, batted ball data gave moderate results, and the stats reflecting patience and selectivity showed almost no effect on BABIP whatsoever.

For now, it looks like the best route for predicting BABIP before the season begins will be multi-year BABIP, and during the season perhaps a combination of multi-year BABIP and line drive percentage. I'm sure we'll be talking more about and digging deeper into this type of stuff in the future, but I think this is a good start. Also, as I mentioned before, stay on the lookout for a new system for home runs (using HitTracker) in the near future.

EDIT: The following corrects a mistake I made in this article. (D.C., 11/22/07)

Errata

In this article, I had incorrectly calculated BABIP2. This had little effect on most of the correlation coefficients, but a few had significant changes. All of the new correlation coefficients are listed below.
2) Walk rate correlation with BABIP2: 0.05
3) (Called Strikes + Balls)/(Total Pitches) with BABIP2: 0.03
4) Walks/Strikeouts (BB/K) correlation with BABIP2: -0.02
5) Line drive rate correlation with BABIP2: 0.45
6) Outfield fly ball BABIP correlation with BABIP2: 0.52
9) 3 year, unweighted BABIP2 correlation with Year 4 BABIP2: 0.39
10) 2 year, unweighted BABIP2 correlation with Year 3 BABIP2: 0.37
11) 3 year, weighted BABIP2 correlation with Year 4 BABIP2: 0.38

Outfield fly ball BABIP gets a big boost, enough to become the top predictor of BABIP2 that we looked at. Unfortunately, as explained in this article, it isn't a very stable event. Line drive rate also got a tick higher, and, as we discussed in the same article, it is somewhat predictable using a three-year figure. Tests 9, 10, and 11 are, obviously, significantly lower than where we had them before. They are still decent, but not great. More work certainly needs to be done in this field.

Derek Carty is a 22-year-old fantasy baseball analyst residing in New Jersey. In addition to writing for THTF, his work has appeared at Rotoworld (NBC), Sports Illustrated, FOX Sports, and Heater Magazine. In his two years competing in expert leagues, he has won two titles with four top-three finishes, including a LABR NL title in 2009, making him the youngest person to ever win a major expert league title. Derek is a proud graduate of the MLB Scouting Bureau's Scout Development Program and is a firm believer in the importance of combining stats and scouting. He welcomes questions via e-mail.