Could you be a baseball scout? Can you judge baseball talent by
watching somebody play? Why not? Many serious fans watch 100 or 150
baseball games a year; surely they must learn something during all
those hours. As Yogi said, “You can observe a lot just by watching.”
Well, as many of you probably know, occasional THT
contributor Tom Tango believes that fans do know how to judge talent,
and he also believes in the wisdom of
crowds, meaning that if you ask enough people a question the
average answer will be pretty good. The result is Tom’s
Scouting Report, By the Fans, For the Fans.
The Fans’ Scouting Report
Basically, Tom asked fans, anybody who wanted to contribute, to rate
the defensive abilities of major league players. He wanted people who
had seen the player in question in at least 10 games during the 2007
season and asked for a judgment in seven different defensive
categories, which I list here:
- Before/As ball is pitched:
  - Instincts
- While ball is in air/on ground:
  - Acceleration/First few steps
  - Velocity/Sprint speed
  - Hands
- After ball is caught (throwing):
  - Release
  - Throwing strength
  - Throwing accuracy
The fan/scout was asked to give a score of one to five for each category.
Tom then tabulated the results and massaged the data a bit to
convert the one-to-five scale to a zero-to-100 scale. He also kept
track of how many people contributed to each scouting report (the more
the better) and reports an “Agreement Level” among the contributors.
Oh, one more important thing: Tom specifically requested that people
not take any stats into consideration when evaluating
fielders. Let’s take a look at an example:
Guerrero, Vlad
Instincts  FirstStep  Speed  Hands  Release  Strength  Accuracy
       48         43     50     26       67        93        58
Ballots  Agreement  Overall
     29       0.65       52
Twenty-nine fans gave their opinion on Vlad’s defense and the
agreement level of 0.65 is roughly typical. He
scores above average in the three throwing categories (especially
“Strength”), but poorly in “Hands” and below average in “First Step.”
Tom calculates an overall, position-neutral score from the individual
categories and Vlad scores 52, just about average.
Cool, no? This is undoubtedly a great resource, but I have one
question: Can it possibly work? I don’t doubt that fans can recognize
good (or poor) play when they see it, but we fans are also clearly
influenced by a host of factors that go beyond observation:
reputation; past, perhaps outdated, performance; our own prejudices,
including bias towards our favorite team and players; knowledge of
defensive stats; and probably others that I haven’t thought of.
Another issue is how well fans/scouts can actually observe
players. I’m guessing that many of these fans are watching games on
television—how can they judge “first step” or “instincts” when
watching on TV? You have to be watching the player before the ball is
hit to judge those things. And even if you are at the park: are you
going to be staring at J.D. Drew as the pitcher delivers the ball?
You’d have to do that for every pitch in a game in order to see his
reaction to the handful of balls hit his way in a game. Who watches a
game like that?
So, “first step” is not so easy, but how about throwing? The throw is
always captured on TV or by the fan at the ballpark. Furthermore, it’s
fairly easy to judge arm strength and while “release” and “accuracy”
are a bit trickier, if you pay attention, you can judge these
aspects of throwing as well. So why don’t we have a look at how the
fans/scouts rated throwing and see if their observations agree with
the numbers?
Enter the stats
What numbers, you ask? Well, I happen to have some
results on outfield throwing handy, so we can compare
those numbers to the fans’ report. The first thing to do is to
combine the three throwing categories of the fans’ report (remember,
release, strength and accuracy) into a single arm rating. I’m going to
do the easiest thing and simply average the three values. Here are the
top 10 outfield arms (any outfield position) according to the fans’
report:
10 Best OF Arms, Fans' Scouting Report 2007
Name                Arm Score
Young, Delmon       92.7
Suzuki, Ichiro      91.3
Victorino, Shane    91.3
Francoeur, Jeff     88.3
Cuddyer, Michael    87.0
Rios, Alex          84.0
Hawpe, Brad         82.7
Hamilton, Josh      82.3
Markakis, Nick      81.0
Pie, Felix          79.0
and here are the trailers (I am merciful, so I only list six):
Pierre, Juan         7.0
Podsednik, Scott     9.0
Bay, Jason          17.7
Damon, Johnny       17.7
Owens, Jerry        20.3
Gibbons, Jay        20.7
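The averaging step described above is simple enough to sketch in a few lines of Python. This is only an illustration: the function name `arm_score` is my own, and the inputs are Vlad Guerrero's three throwing scores from the example earlier in the article.

```python
from statistics import mean

def arm_score(release, strength, accuracy):
    """Combine the three throwing categories into one rating by plain averaging."""
    return mean([release, strength, accuracy])

# Vlad Guerrero's throwing scores from the sample scouting report above.
vlad_arm = arm_score(release=67, strength=93, accuracy=58)
print(round(vlad_arm, 1))  # 72.7
```

An unweighted mean treats the three categories as equally important, which is the simplest choice rather than a demonstrated one.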
If you look back at my article on 2007 outfield arms, you will see
some agreement. My top five right field arms were: Cuddyer, Francoeur,
Young, Victorino and Rios. Wow! These are exactly the first five right
fielders in the fans’ report (although not in the same order). I also had Pierre and Owens as atrocious. Bay was around average, but had been poor in previous seasons. Damon and Podsednik did not qualify in 2007, but both have been terrible overall in recent seasons. Gibbons has been about average over the past few seasons.
The fans’ report is looking pretty good so far. How about my top five center field arms? Here
they are, along with their Arm Score from the Fans’ Report:
Top Five Center Fielders, according to statistical analysis
Name                Arm Score
Upton, BJ           57.3 (scouted as 2B)
Taveras, Willy      49.7
Edmonds, Jim        77.7
Suzuki, Ichiro      91.3
Freel, Ryan         39.3
Uh, oh. Not so good, is it? Taveras and Upton are rated as
average-ish, while Freel is definitely seen as below average. Ichiro,
considered to have the second-best outfield arm in baseball by the
fans, only manages fourth place in my center field ranking.
Hmmm, looks like we’re going to have to dig deeper to see what’s going
on. Instead of looking at individual results, let’s widen our
perspective a bit.
The plot on the right shows how well the results from my outfield arm
analysis match up with the Fans’ Report. Each point represents a
single outfielder season (between 2005 and 2007), with his Fans’ Arm
Score plotted on the horizontal axis and his runs saved per 200 opps
(my analysis) on the vertical axis. I require at least 50
opportunities in a given season and (of course) only outfielders that
have a scouting report are plotted.
I don’t know what you think, but in my opinion that’s an ugly plot.
Oh, we do see some agreement—if you squint your eyes
and tilt your head slightly, you can see an upward slope to the mass
of points as you move to your right. But I was expecting a tighter
bunching of the points around a straight line. To quantify the
agreement, it’s customary to quote a correlation coefficient: it turns
out to be 0.39. That is not a strong correlation.
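For readers who want the definition, the correlation coefficient quoted here is presumably the standard Pearson coefficient (an assumption on my part; the symbols below are introduced only for this illustration). With x_i as a player's Fans' Arm Score and y_i as his runs saved per 200 opps, it is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the averages over the n outfielder seasons in the plot. A value of 1 means the points fall exactly on an upward-sloping line, and 0 means no linear relationship at all.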
Proceed with caution
But you have to be careful with correlation coefficients, as Tom
Tango himself likes to
point out. What I think is happening here is that there is too
much noise in the statistical analysis. Brad Hawpe, who ranked high in
the Fans’ Report, scored 2.7, 11.2 and -4.4 runs per 200 opps from
2005 to 2007. Ichiro had years of 10.6, -1.2, 8.9 and 5.2
(2004-2007). Now, some of the variation may well be due to other
factors, but a large part of it is likely statistical noise.
We can try to reduce the effects of noise by increasing the minimum
number of opportunities—the more opportunities you consider,
the closer you can get to a player’s true talent level. The graph on
the right shows how the correlation coefficient between runs200 and
Arm Score varies as we increase the minimum number of
opportunities. To beef up the sample size for this plot, I combined all results
for the three-year period 2005-2007, and furthermore, I combined
results from the three different outfield positions if a player
played multiple positions in that time frame.
As expected, as we move to more and more opps, we see a stronger
correlation (the rising blue line) because there is less noise in the
data. The red line shows the number of players that meet the minimum opps
requirement. The point here is that there is a strong
correlation between the statistical analysis and the scouting
report. For example, when at least 500 opps are required, the
correlation coefficient rises to 0.78.
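The threshold sweep behind that graph can be sketched as follows. This is a rough illustration, not the code used for the article: the function names are mine, and the six player records are made-up placeholders (not real data), chosen so that the high-opportunity players line up perfectly and the low-opportunity ones are pure noise.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

def correlation_by_min_opps(players, thresholds):
    """For each minimum-opps cutoff, return (players kept, correlation).

    `players` holds (opps, arm_score, runs200) tuples.
    """
    results = {}
    for min_opps in thresholds:
        kept = [(arm, runs) for opps, arm, runs in players if opps >= min_opps]
        arms = [arm for arm, _ in kept]
        runs = [run for _, run in kept]
        results[min_opps] = (len(kept), pearson(arms, runs))
    return results

# HYPOTHETICAL placeholder records: (opps, arm_score, runs200). Not real players.
PLACEHOLDER_PLAYERS = [
    (600, 80, 4.0), (550, 50, 0.0), (520, 20, -4.0),  # many opps, clean signal
    (100, 80, -3.0), (90, 20, 5.0), (80, 50, 1.0),    # few opps, noisy
]

sweep = correlation_by_min_opps(PLACEHOLDER_PLAYERS, [50, 500])
for cutoff, (count, r) in sweep.items():
    print(cutoff, count, r)
```

The count plays the role of the red line and the correlation plays the role of the blue line: raising the cutoff throws away players but leaves a cleaner signal.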
So, why don’t we go back to our original plot of runs200 vs. Arm
Score, but this time we use the combined data and ask for at least 300
opportunities for any given player. This gives me 70 outfielders to
look at. Here’s the graph:
Compared to the original version above, here we can see a strong
correlation between runs200 and Arm Score. I’ve superimposed the
“trend line” that best describes the data. Note how the trend line
comes very close to intersecting the point (50,0). In other words, an
average Arm Score (50) is predicting an average runs200 (0). This is
impressive agreement — there is nothing that I’ve done here that
forces the line through that point. This is the wisdom of crowds at work.
I’ve annotated a few players who have performed much better (in green)
or worse (in red) than what the scouts would have predicted. I just
picked these by eye, but for those of you who like numbers, here is a
list of the players who exceeded the scouts’ expectation by the
most:
Most Underrated by Fans/Scouts
                   ArmScore   Pred.   Actual   Difference
Soriano_Alfonso      73.0      2.6     11.1       8.6
Cuddyer_Michael      87.0      4.2     11.2       7.0
Taveras_Willy        49.7     -0.3      5.7       5.9
Francoeur_Jeff       88.3      4.4      9.8       5.4
Jones_Jacque         21.0     -3.7      1.6       5.3
Ramirez_Manny        54.3      0.3      5.6       5.2
Floyd_Cliff          30.3     -2.6      1.9       4.5
Lofton_Kenny         29.7     -2.7      1.2       3.9
Ibanez_Raul          32.3     -2.3      1.1       3.5
Rowand_Aaron         44.7     -0.9      2.5       3.4
---------------------------------------
Pred: predicted runs200 from trend line
Actual: actual runs200
Difference: Actual - Pred
and here are the underachievers:
Most Overrated by Fans/Scouts
                   ArmScore   Pred.   Actual   Difference
Green_Shawn          32.7     -2.3     -8.8      -6.5
Drew_J.D.            62.3      1.3     -4.5      -5.8
Anderson_Garret      58.3      0.8     -3.7      -4.5
Giles_Brian          46.0     -0.7     -4.6      -3.9
Griffey_Jr._Ken      64.0      1.5     -2.2      -3.7
Clark_Brady          38.7     -1.6     -5.0      -3.5
Gathright_Joey       28.0     -2.9     -6.1      -3.2
Jenkins_Geoff        76.0      2.9     -0.1      -3.0
Kearns_Austin        73.7      2.6     -0.2      -2.9
Dye_Jermaine         57.7      0.7     -1.9      -2.6
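The Pred. and Difference columns can be reproduced, at least approximately, from the tables themselves. The sketch below fits a straight line through the (ArmScore, Pred.) pairs of the underrated list to recover the trend line, then recomputes Actual minus Pred. It is an illustration with my own function names, and because the published figures are rounded to one decimal, the results can differ from the tables by about 0.1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# (name, arm_score, pred, actual), copied from the "Most Underrated" table.
UNDERRATED = [
    ("Soriano_Alfonso", 73.0, 2.6, 11.1),
    ("Cuddyer_Michael", 87.0, 4.2, 11.2),
    ("Taveras_Willy", 49.7, -0.3, 5.7),
    ("Francoeur_Jeff", 88.3, 4.4, 9.8),
    ("Jones_Jacque", 21.0, -3.7, 1.6),
    ("Ramirez_Manny", 54.3, 0.3, 5.6),
    ("Floyd_Cliff", 30.3, -2.6, 1.9),
    ("Lofton_Kenny", 29.7, -2.7, 1.2),
    ("Ibanez_Raul", 32.3, -2.3, 1.1),
    ("Rowand_Aaron", 44.7, -0.9, 2.5),
]

# Recover the trend line from the published predictions.
slope, intercept = fit_line(
    [arm for _, arm, _, _ in UNDERRATED],
    [pred for _, _, pred, _ in UNDERRATED],
)
pred_at_50 = slope * 50 + intercept  # what an average Arm Score predicts

# Difference = Actual - Pred, recomputed from the rounded table values.
differences = {name: actual - pred for name, _, pred, actual in UNDERRATED}
most_underrated = max(differences, key=differences.get)

print(round(slope, 2), round(pred_at_50, 1), most_underrated)
```

The recovered slope comes out near 0.12 runs200 per Arm Score point, and the prediction at an Arm Score of 50 lands a couple of tenths of a run below zero, consistent with the trend line passing close to (50,0).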
What makes for a good thrower?
Since the scouts have done a good job of evaluating arms, might we look at
the individual components of arm rating, i.e., strength,
accuracy and release, to see if we can learn something about the
relative importance of these aspects?
Well, the short answer is “no.” The reason is that when you look at
these three categories, you find that there is a strong correlation
between any pair of them, as you can see in the graphic below (which
includes all outfielders in the Scouting Report).
I believe the correlations can have two sources: 1) real correlations
among defensive abilities—after all, good
defensive players often excel in more than one category—and 2) fan
bias.
I believe fan bias is playing a role here, because I doubt that the
real correlations are as strong as we are seeing. Furthermore, in the
case of strength vs. accuracy, I might have expected to see little, or
even negative, correlation, not the strong positive correlation we see
in the plot.
What I guess might be happening is that fans are able to judge a
player’s overall throwing skill, but they tend to give a good thrower
high scores in all three categories and conversely for poor
throwers. They are not able to judge independently the three different
throwing categories. That’s my hypothesis anyway.
In any case, the high correlations among the arm
categories make it impossible to determine the
relative importance of strength, accuracy and release in evaluating
an outfielder’s arm.
The Fans’ Scouting Report, while not perfect, is a great resource, and
I believe it will be a useful piece of the puzzle in understanding
defensive ability. When you see Tom’s invitation to participate in
the 2008 Scouting Report, do not hesitate to do so.