November 23, 2009

Creative Commons License
All content on this site (including text, graphs, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.

Stats Articles


Following are the one hundred most recent articles for the category Stats.

11/23/2009: How Sabermetrics saved my dissertation

by Pizza Cutter

11/19/2009: Offense/Defense number (Part 2)

by Brandon Isleib

11/05/2009: Offense/Defense number (Part 1)

by Brandon Isleib

11/04/2009: Get rid of the DLF

by Joe Distelheim

10/28/2009: Strikeout rates through the years

by Geoff Young

10/13/2009: Who’s going to win the MVP?

by David Gassko

10/06/2009: Man vs. computer

by David Gassko

09/29/2009: THT Dartboard: Week Twenty-five

by Matthew Carruth

09/24/2009: Portrait of a reliever: Firpo Marberry, 1925

by Brandon Isleib

09/21/2009: THT Dartboard: Week Twenty-four

by Matthew Carruth

09/14/2009: WAR vs. Win Shares

by Dave Studeman

09/10/2009: What are little runs made of?

by Colin Wyers

09/01/2009: Park effects and batted ball types

by Harry Pavlidis

08/27/2009: How accurately can we estimate a hitter’s runs? (Part 2)

by Colin Wyers

08/24/2009: THT Dartboard: Week Twenty

by Matthew Carruth

08/20/2009: How accurately can we estimate a hitter’s runs? (Part 1)

by Colin Wyers

08/13/2009: What is Joe Mauer’s true talent level?

by Colin Wyers

08/12/2009: My WAR Graph

by Dave Studeman

08/06/2009: The nominees for minor league player of the year

by Matt Hagen

08/06/2009: Why does Pujols regress to the mean?

by Colin Wyers

08/03/2009: Is Ichiro heading for the Hall of Fame?  Which one?

by Sean Smith

08/03/2009: Applying the Guttridge-Wang trade model to this year’s deadline trades (Part 1)

by Adam Guttridge

07/30/2009: Former top pitching prospects revisited

by Matt Hagen

07/30/2009: A treatise on true talent

by David Gassko

07/30/2009: What’s past is prologue

by Colin Wyers

07/23/2009: Dissecting the DSL

by Jeff Sackmann

07/23/2009: A second look at situational pitching

by Colin Wyers

07/16/2009: Moving past DIPS

by Colin Wyers

07/16/2009: Predicting double play rate

by Dan Turkenkopf

07/09/2009: Fielding stats for college shortstops

by Jeff Sackmann

07/09/2009: Evaluating defense using HITf/x

by Colin Wyers

07/07/2009: Getting closer to park adjustments for PITCHf/x

by Harry Pavlidis

07/02/2009: Stacking the middle

by Dan Turkenkopf

06/30/2009: Using HITf/x to measure skill

by Peter Jensen

06/26/2009: The WPA Inquirer

by Dave Studeman

06/25/2009: The wrong side of 120

by Jeff Sackmann

06/25/2009: Poisoning the well

by Colin Wyers

06/18/2009: How well can we predict ERA?

by Colin Wyers

06/18/2009: Adjusting steals for win value

by Dan Turkenkopf

06/17/2009: The great strikeout debate (Part II)

by Paul Singman

06/12/2009: Extreme environments

by Jeff Sackmann

06/12/2009: See the ball, hit the ball

by Craig Brown

06/10/2009: THT’s Top 100 Prospects

by Matt Hagen

06/10/2009: Mike Pelfrey’s Sinker

by Jonathan Hale

06/04/2009: The one about sample size

by Colin Wyers

06/04/2009: How lucky can one guy get?

by Jonathan Halket

05/28/2009: Putting the scissor to defense (Part 1)

by Colin Wyers

05/21/2009: Who watches the watchers?

by Colin Wyers

05/21/2009: Small College World Series odds

by Jeff Sackmann

05/20/2009: Pedro Feliz and the golden zapatos

by Geoff Young

05/14/2009: Apprehensive yet Comprehensive: Personal Strategies and Secrets for Dominating Your Keeper League

by Matt Hagen

05/14/2009: Thou shalt sow, but thou shalt not reap

by Colin Wyers

05/12/2009: The great strikeout debate

by Paul Singman

05/07/2009: Top 100 Fantasy Baseball Prospects - 5/7/09

by Matt Hagen

05/07/2009: Is Tom Mendonca the second coming of Brooks Robinson?

by Jeff Sackmann

05/07/2009: Can a team get picked off first?

by Colin Wyers

05/01/2009: Top 100 Fantasy Baseball Prospects - 5/1/09

by Matt Hagen

04/24/2009: Checking the leaderboards

by Craig Brown

04/16/2009: The great run estimator shootout (part 2)

by Colin Wyers

04/13/2009: Measuring greatness (part 2)

by Mike Carminati

04/09/2009: The great run estimator shootout (part 1)

by Colin Wyers

04/02/2009: The death of superman

by Colin Wyers

03/30/2009: 29 players I think the THT projections got wrong

by David Gassko

03/23/2009: The statistical coaching aid

by Adam Guttridge

03/20/2009: The worst thing a batter can do

by John Walsh

03/19/2009: Statistical Shenanigans (part 2)

by John Beamer

03/16/2009: Measuring the NAIA

by Jeff Sackmann

03/11/2009: Rejected nuggets

by Dave Studeman

03/09/2009: Dorkapalooza 2009: The sports analytics conference at MIT

by Sal Baxamusa

03/09/2009: (Somewhat) sabermetric similarity scores

by Chris Jaffe

03/05/2009: What would we do, baby, without us?

by Brandon Isleib

03/04/2009: Confessions of a DIPS apostate

by Mike Fast

03/03/2009: Even smaller colleges

by Jeff Sackmann

02/25/2009: Beyond OPS: filling in the gaps

by John Walsh

02/20/2009: The color of clutch

by Tom M. Tango

02/19/2009: What’s a batted ball to do?

by Colin Wyers

02/17/2009: How good is NCAA Division 2?

by Jeff Sackmann

02/13/2009: Exploring contact quality

by Dan Turkenkopf

02/12/2009: The many faces of average

by Sky Kalkman

02/10/2009: Pitch sequencing

by Josh Kalk

02/10/2009: Predicting replacement level

by Paul Singman

02/05/2009: How to measure a player’s value (Part 3)

by Colin Wyers

02/05/2009: Breaking down Division 1 baseball

by Jeff Sackmann

02/03/2009: Pitch sequence: High fastball then curveball

by Josh Kalk

01/29/2009: How to measure a player’s value (Part 2)

by Colin Wyers

01/27/2009: First pitch fastballs, and who likes ‘em

by Josh Kalk

01/22/2009: How to measure a player’s value (Part 1)

by Colin Wyers

01/20/2009: That was a strike?

by Josh Kalk

01/16/2009: What’s new at Fangraphs?

by Eric Seidman

01/15/2009: Postseason probability added

by Dave Studeman

01/15/2009: An unremarkable century

by Brandon Isleib

01/13/2009: BABIP’s relationship to hitters

by Paul Singman

01/12/2009: The Wonder of Rickey

by Chris Jaffe

01/08/2009: Eat The Rich (Part 1)

by Colin Wyers

01/06/2009: Who was better? Brian Downing vs. Jim Rice

by Sean Smith

12/29/2008: The drama index

by Dave Studeman

12/18/2008: Yaz v. Manny (Part 2—defense counts)

by John Walsh

12/11/2008: Category influence

by Michael Lerra

12/11/2008: A 10th man?

by Colin Wyers

12/11/2008: Season leverage index

by Dave Studeman




November 21, 2009

HR/FB Park Factors

Just a quick hit to share park factors for HR/FB rate.

I used BIP data and the methodology from Baseball Reference to determine simple HR/FB park factors for 2009 and 4-year weighted factors (weights are 5, 3, 2, 1).

Update: My spreadsheet was thrown off by the Rays' name change. I've corrected the numbers below.

Without further ado, here's the list:

Team          Park                          2009    4 Year
Angels        Angel Stadium                  110        96
Astros        Minute Maid Park               104       108
Athletics     McAfee Coliseum                 95        92
Blue Jays     Rogers Centre                  105       108
Braves        Turner Field                    90        95
Brewers       Miller Park                    108       106
Cardinals     Busch Stadium                   86        84
Cubs          Wrigley Field                   97       103
Diamondbacks  Chase Field                     99       106
Dodgers       Dodger Stadium                  89        95
Giants        Pacific Bell Park              104        95
Indians       Jacobs Field                    75        88
Mariners      Safeco Park                     95        96
Marlins       Dolphins Stadium               109        99
Mets          Citi Field                      98        98
Nationals     Nationals Stadium               91        92
Orioles       Oriole Park at Camden Yards    109       115
Padres        PETCO Park                      73        75
Phillies      Citizens Bank Park             109        94
Pirates       PNC Park                       105        94
Rays          Tropicana Field                110       111
Rangers       The Ballpark at Arlington       98        97
Red Sox       Fenway Park                     98        90
Reds          Great American Ballpark        121       114
Rockies       Coors Field                    103       112
Royals        Kauffman Stadium                73        78
Tigers        Comerica Park                   94       101
Twins         Metrodome                      109        96
White Sox     US Cellular Field              115       118
Yankees       New Yankee Stadium             130       130


The Mets' and Yankees' park factors cover one season only.

The Nationals' park factor covers two seasons, weighted at 5 and 3.
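
For anyone who wants to reproduce the multi-year column, here is a minimal sketch of the weighting step only, assuming the yearly factors are already computed and listed most recent season first. The function name `weighted_park_factor` and the sample yearly values are hypothetical; the 5, 3, 2, 1 weights and the shorter histories for the newer parks come from the notes above.

```python
def weighted_park_factor(yearly_factors, weights=(5, 3, 2, 1)):
    """Weighted average of yearly park factors, most recent season first.

    Only as many weights as there are seasons are used, so a park with
    two seasons is weighted 5 and 3, and a one-season park keeps its
    single-season factor.
    """
    if not yearly_factors:
        raise ValueError("need at least one season of data")
    used = weights[:len(yearly_factors)]
    total = sum(w * f for w, f in zip(used, yearly_factors))
    return total / sum(used)


# A one-season park keeps its single-season factor.
print(weighted_park_factor([130]))

# Hypothetical two-season park: (5 * 100 + 3 * 108) / 8
print(weighted_park_factor([100, 108]))
```

Truncating the weights rather than padding missing seasons with 100 is what keeps a brand-new park's multi-year number equal to its 2009 number.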
Posted by: Dan Turkenkopf


October 13, 2009

PITCHf/x data from Arizona Fall League

Per MLBAM's Cory Schwartz, the Arizona Fall League has PITCHf/x camera systems operational at the Surprise and Peoria parks. PITCHf/x data will be available for all your favorite AFL participants, including Phillippe Aumont, Ian Kennedy, and, perhaps, super-uber-star prospect Stephen Strasburg and everyone's favorite Phillies minor league blogger Michael Schwimer, if the stars align so that either pitches at Surprise or Peoria.

Raw XML-format PITCHf/x data can be found here.

Posted by: Mike Fast


September 18, 2009

Is Sportvision ruining baseball?

This morning the Baseball Think Factory newsblog published a piece by Diane Grassi in which she details her beefs with the use of PITCHf/x data to grade umpires and worries about the impact that forthcoming ball tracking technologies from Sportvision will have on the effectiveness of scouting.

To the extent that the adoption of new technologies always results in the degradation of skills with older technologies, she probably has a point. The advent of the typewriter and the word processor have combined to deal a heavy blow to the art of penmanship amongst the masses. Better automated ball tracking will probably render some currently essential skills in the baseball industry obsolete or quaint over time. It does not follow, however, that the human element will depart the game of baseball along with it. Did the value of good writing go out the window with the quill?

In addition to my disagreement with her conclusions, I would also like to set the record straight on some errors of fact about the PITCHf/x, HITf/x, and FIELDf/x systems in her article, particularly some of the erroneous claims she makes in support of her argument that the use of PITCHf/x for umpire grading is fatally flawed. Her information on the Sportvision systems seems to come from an interview with Ryan Zander, director of business development at Sportvision, and perhaps an unnamed source at Major League Baseball.

She alleges that the umpires have been graded against an inconsistent system, newly introduced and not applied evenly across all stadiums:

"During the 2008 MLB season, the PITCHf/x camera system was installed in every major league park – with certain exceptions made for the last year of Yankee and Shea stadiums in New York, as both the Yankees and Mets relocated to new stadiums in the 2009 season. The object of the PITCHf/x system was to gather data from the stadiums in order to composite requisite information for the camera system technology to go live in 2009.

"Data was collected during the 2008 season by the PITCHf/x system that included tracking nearly all pitches thrown for the entire season for supposedly all 30 teams, totaling approximately 700,000. And that data is now being used as the base measure to evaluate MLB umpire accuracy for 2009. – Unfortunately, the umpiring data for the new Yankee Stadium and the Mets’ Citi Field was not included; unaddressed publicly by MLB."

In fact, the system was installed and brought live in all parks but two, Baltimore and Washington, during the 2007 season. This included installations in old Yankee Stadium and Shea Stadium in 2007. Baltimore and Washington were added to begin the 2008 season. PITCHf/x data for 2009 does include new Yankee Stadium and Citi Field. I can't see any reason why MLB would choose not to include that data in the umpire grading data; if Ms. Grassi has a source that says they don't, I'd love to know.

She then turns her argument against umpire PITCHf/x grading to allege that Sportvision and MLB don't understand the rule book strike zone definition:

"PITCHf/x takes 25 pictures of the ball in flight between the pitching mound and home plate. Sportsvision® software then uses a ‘best fit’ algorithm in order to calculate compensation for different variables of the ball’s flight path, including the position of the ball when it crosses the plate.

"But here is where the disparity arises, as a strike is not called at the front of the plate but where it crosses the plate as it makes its way into the catcher’s glove. The camera, however, starts reporting data 5 feet in front of home plate; reminiscent of the ill-timed traffic light camera that incorrectly tickets a driver for going through a red light while traveling through the tail end of a yellow caution light in an intersection."

Here again she is simply incorrect. It is true that MLBAM reports the pitch location at the front of the plate for its entertainment-focused Gameday application. However, the data used for grading umpires contains knowledge of the whole trajectory of the pitch, and Sportvision's umpire grading does take into account the 3-dimensional nature of the zone over home plate. In fact, the umpire grading system offers the umpires a measure of leniency, giving them a two-inch margin around the 3-D zone and considering factors such as the position of the catcher's glove in counting calls in the umpire's favor.

She also has some misunderstanding about the naming, nature, and capabilities of the two newest systems from Sportvision: HITf/x, which is the calculation of initial batted ball speed and direction from existing PITCHf/x camera footage, and the as-yet-unnamed but popularly-called FIELDf/x, which will use new cameras mounted to capture a view of the whole field in order to track ball and player movements throughout the whole game:

"For after PITCHf/x, the upcoming HITf/x will be used for scouting in the not too distant future by MLB teams and it also will be a supposed tool that will measure every aspect of every player’s mechanics. Such technology will put sabermetrics to shame and will again rely upon technology which again, the naked eye cannot see on its own. ‘Every moving event within an actual game will be tracked,’ according to Sportsvision’s General Manager of Baseball Products, Ryan Zander. It will track the pitcher, the ball and the fielder with individual stats."

HITf/x is already in existence, and the so-called FIELDf/x is coming, but neither measures a player's mechanics. FIELDf/x measures a player's location on the field over time.

It appears Ms. Grassi's not quite clear on what type of scouting these systems could be used for. For scouting of players already in the major leagues, yes, whether for advance scouting of upcoming opponents or possible trade targets or coaching and improvement of a team's own players, this system does have scouting applications. However, it has no use in the sense she uses scouting in her article, that is, finding future players like Derek Jeter on the high school ball fields around the country. Sportvision is not encroaching on the domain of the amateur talent scout.

She also seems concerned that this system uses technology that can see things the naked eye can't see on its own, as if its secret maneuverings can be used like a hacked Diebold e-voting machine to steal an election, arbitrarily anointing good players or umpires without regard to the vast and valuable store of baseball knowledge handed down over the decades. However, these systems in fact mostly track things that the naked eye can see, like where a pitch was located, or how hard a ball was hit, or how far a fielder had to run to catch a sinking line drive.

It's just that our naked eyes and unassisted brains are not very good at measuring and cataloging these things they see. Automated tracking systems from Sportvision allow us to remember much more accurately, find otherwise hidden patterns, and quickly query large data sets for the answers to multitudes of questions. All of this enriches the experience of baseball for many, and, I would hope, enriches the play of the game on the field as well.

Such technology does not put sabermetrics to shame; it gives sabermetricians new and powerful tools and integrates them with the flow of the game on the field in ways that were heretofore impossible and unimaginable. No longer will the accusations against sabermetricians of being a blogger in the basement or having a nose stuck in a spreadsheet hold much water. The sabermetrician in tune with these new data sources and committed to understanding the game of baseball with them will be more "on the field" than the writer in the press box. He will have the ability to gain an experience of the game as meaningful and helpful to the player as the scout sitting behind home plate. In fact, the enlightened sabermetrician will learn to converse with that scout as an equal, and the enlightened scout will enlist these new sources of knowledge to leverage his knowledge and experience of the game in new ways.

Greater collaboration between new and old, "beer and tacos" to quote Dayn Perry, will become the name of the game for successful franchises.
Posted by: Mike Fast


August 24, 2009

2009 Fans’ Scouting Report - call for ballots

As he does every year, Tom Tango is compiling the Fans' Scouting Report. He is seeking help from baseball fans to rate the defensive abilities of the players they have watched this season.

"Baseball's fans are very perceptive. Take a large group of them, and they can pick out the final standings with the best of them. They can forecast the performance of players as well as those guys with rather sophisticated forecasting engines. Bill James, in one of his later Abstracts, had the fans vote in for the ranking of the best to worst players by position. And they did a darn good job.

"There is an enormous amount of untapped knowledge here. There are 70 million fans at MLB parks every year, and a whole lot more watching the games on television. When I was a teenager, I had no problem picking out Tim Wallach as a great fielding 3B, a few years before MLB coaches did so. And, judging by the quantity of non-stop standing ovations Wallach received, I wasn't the only one in Montreal whose eyes did not deceive him. Rondel White, Marquis Grissom, Larry Walker, Andre Dawson, Hubie Brooks, Ellis Valentine. We don't need stats to tell us which of these does not belong.

"What I would like to do now is tap that pool of talent. I want you to tell me what your eyes see. I want you to tell me how good or bad a fielder is. Go down, and start selecting the team(s) that you watch all the time. For any player that you've seen play in at least 10 games in 2009, I want you to judge his performance in 7 specific fielding categories."

If you've watched a lot of baseball in 2009, or at least enough to meet the guidelines, please participate in compiling this valuable resource.
Posted by: Mike Fast


August 07, 2009

If you’re happy and you know it, get on base

Ah, the Saber-sphere is all abuzz with talk of regression to the mean.  Regression to the mean is a fairly simple concept.  If, over the past four years, you have a player who has had HR/PA rates of 2.8%, 1.9%, 2.3%, and 2.4%, then suddenly, his rate goes to 7.3%, what should you expect in the next year?  (The correct answer is 2.6%, at least that's what Brady Anderson did in 1997.) 

Why not expect 7% again?  Baseball fans (and a few front office folk) are remarkably good at coming up with justifications for why one should expect 7%.  They might say, "That year, Brady developed a new swing/changed his routine/changed his diet/began dating Madonna.  That must be the reason for his sudden power outburst!"  (The more cynical among you might suggest more nefarious reasons*.)  How about another explanation?  Brady Anderson got insanely lucky in 1996.  It's not often that fate smiles that kindly on one man for such a short period of time, but... how to explain this without referring to Kevin Federline... let's just say it doesn't happen very often.

After a few years' worth of data points from 1992-1995, we have a decent idea that in reality Brady Anderson is the kind of guy who hits a home run once every 40 times to the plate (2.5%).  In other words, we can be pretty sure that's Brady's true talent level.  When he outshot that true talent level in 1996, it made sense that he was due to come back down to earth the next year (which he did).  Or in fancy statistical terms, he regressed to his own mean.  His performance regressed (got worse) because, deep down, he was playing over his head the year before, and the next year, he went back to doing what he usually does.

Exactly how to incorporate regression to the mean is the great knuckleball of Sabermetrics.  There are as many theories on how to do so as there are Sabermetricians who have looked at the question.  This is because what folks are really talking about is not "how do I regress to the mean mathematically?"  That's actually really easy.  The real question is "How do we estimate a player's true talent level?"  In other words, what do I regress back to?  What is this player really capable of?
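
To show how easy the mathematical part can be, here is a minimal sketch of one common shrinkage form, assuming we already have a guess at the mean to regress toward. The function name `regress_to_mean`, the 600 plate appearances, and the prior weight of 600 are all hypothetical choices for illustration; picking the mean and the weight is exactly the hard "true talent" question, and this sketch does not answer it.

```python
def regress_to_mean(observed_rate, n_trials, prior_mean, prior_weight):
    """Blend an observed rate with a prior mean.

    prior_weight acts like a number of phantom trials at prior_mean:
    the more real trials we have, the less the estimate is pulled back.
    """
    if n_trials < 0 or prior_weight < 0 or n_trials + prior_weight == 0:
        raise ValueError("need non-negative weights that are not both zero")
    blended = observed_rate * n_trials + prior_mean * prior_weight
    return blended / (n_trials + prior_weight)


# Hypothetical inputs: a 7.3% HR/PA season over 600 PA, regressed toward
# a 2.5% established level that is given the weight of 600 PA.
estimate = regress_to_mean(0.073, 600, 0.025, 600)
print(round(estimate, 3))
```

With those made-up inputs the estimate lands about halfway between the two rates, at roughly 4.9%. Give the established level more weight and the estimate slides toward 2.5%; give it none and you are back to expecting 7.3%.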

Colin Wyers wrote a bit on true score theory in a recent THT article.  In the piece, he said that a player's performance is a function of his true talent level, random error (aka luck), and bias in measurement.  He made me happy by including measurement bias in his conceptualization (although he then politely dismissed it).  I still think there's one extra missing piece that he hadn't considered.  Colin began to hint at that missing piece when he talked about Ichiro, who gets a hit in roughly 30% of his at-bats.

"Moreover, based on all those factors--and of course many others--a player's true talent level changes from moment-to-moment. Ichiro may have a 30 percent chance of getting a hit in one at-bat, but if his jock strap starts to itch, perhaps that goes down to 29 percent the next. On the other hand, if someone in the dugout makes a funny joke (auth note: in Japanese? - P.C.) that puts Ichiro in a good mood, his true talent could go up to 31 percent so long as that good mood lasts."

The actual equation should look like:  Observed performance = true talent + measurement bias + contextual factors + luck/random error.
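
One way to see the pieces of that equation move is a toy simulation. This is only a sketch under my own assumptions: context is modeled as a per-at-bat nudge to the hit probability (the one-point swings borrow from the Ichiro quote), luck is the random draw on each at-bat, and measurement bias is a constant added to the recorded average. The names `simulate_average`, `context_shifts`, and `measurement_bias` are hypothetical.

```python
import random


def simulate_average(true_talent, context_shifts, measurement_bias=0.0, seed=0):
    """Simulate one observed batting average.

    true_talent      -- baseline chance of a hit in any at-bat
    context_shifts   -- one adjustment per at-bat (mood, itch, etc.)
    measurement_bias -- constant error in how the result gets recorded
    seed             -- fixes the random draws so runs are repeatable
    """
    if not context_shifts:
        raise ValueError("need at least one at-bat")
    rng = random.Random(seed)
    hits = 0
    for shift in context_shifts:
        # Context moves the chance of a hit; clamp it to a valid probability.
        chance = min(1.0, max(0.0, true_talent + shift))
        # Luck: the same chance can still come up hit or out.
        if rng.random() < chance:
            hits += 1
    return hits / len(context_shifts) + measurement_bias


# 600 at-bats for a .300 hitter whose mood swings him a point either way.
shifts = [0.01, -0.01, 0.0] * 200
for seed in range(3):
    print(round(simulate_average(0.30, shifts, seed=seed), 3))
```

Changing only the seed changes the printed averages even though talent and context are held fixed, which is the luck term at work.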

If there is a great sin of Sabermetrics, it's that we (and I happily include myself in that pronoun) have treated players as though they were Strat-o-matic cards.  That is to say that they don't respond in the least to what's going on around them, which doesn't make common sense (although common sense is not a proof of anything...).  We act as if it's just a matter of finding the right algorithm based on last year's stats plus this year's stats times prime rate minus the square of blah blah blah... After that, we know what a player has the probability to do.  And he'll do it no matter what situation he is in.

Or will he?  Colin correctly points out that we won't be able to know everything.  (I frankly don't want to know if Ichiro's jock strap starts to itch.)  But there are some things that we can know, and know them rather easily, that might make a big difference.  Let's take a truism in life.  It's a lot easier to do your job when you are in a good mood than when you're in a bad mood, and overall, you're probably better at the job in a good mood.  Does it apply in baseball?  Let's take the simplest rough proxy for a good mood that there is: is my team winning?


Posted by: Pizza Cutter

