A Quick Comparison of UZR and Plus/Minus

FanGraphs has now added the Plus/Minus system to its already impressive array of baseball metrics. Now that the data is freely accessible, we can export it rather easily and dissect it as we please. So, let’s cut it up a bit and see how it compares to UZR*.

First, Range:

Range	r	r^2	MAE	Max	#10+
1B	0.778	0.606	1.4	14.6	9
2B	0.809	0.655	1.74	16.4	17
3B	0.871	0.759	1.6	16.7	10
SS	0.770	0.593	2.14	19.5	42
LF	0.797	0.635	1.41	17.9	10
CF	0.750	0.562	2.03	18.5	27
RF	0.769	0.592	1.55	16.9	20

All in all, the systems usually rate players within about 2 runs of one another (MAE is the mean absolute error, in runs), which is nice to see. “Max” refers to the maximum absolute difference between UZR and +/-; in other words, it is the largest gap in any one player’s range rating at that position. To be honest, I’m shocked at how large it is. It’s not as if we’re looking at one or two outliers, either: “#10+” indicates the number of players that showed a difference of 10 runs or more between the systems. At each position, some players are being rated around 1.5 to 2 wins differently. The lack of agreement at shortstop (relative to other positions) is even more surprising to me. The “max” players, with their Plus/Minus range and UZR range, respectively:

1B: Mark Teixeira (2003), +21, +6.4
2B: Ian Kinsler (2007), +5, -11.4
3B: Scott Rolen (2003), -5, +11.7
SS: Rafael Furcal (2005), +20, +0.5
LF: Manny Ramirez (2003), -14, +3.9
CF: Andruw Jones (2005), +7, +22.3
RF: Ken Griffey Jr. (2007), -5, -21.9
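
For anyone who wants to reproduce the table columns from an export, here is a minimal sketch of how r, r^2, MAE, Max and #10+ can be computed from paired ratings. The function name `compare_systems` and the input layout (a list of Plus/Minus and UZR run pairs) are assumptions made for illustration, not the actual export format. The seven “max” seasons listed above are pooled across positions purely as toy input.

```python
from math import sqrt

def compare_systems(pairs, big_gap=10.0):
    """Summarize agreement between two run-value systems.

    pairs: list of (plus_minus_runs, uzr_runs), one per player-season.
    big_gap: threshold in runs for the "#10+" style count (assumed 10).
    """
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Pearson correlation from the centered sums of products and squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)

    # Absolute gap in runs for each player-season.
    gaps = [abs(x - y) for x, y in pairs]
    return {
        "r": r,
        "r2": r * r,
        "mae": sum(gaps) / n,   # mean absolute error
        "max": max(gaps),       # largest single disagreement
        "n_big": sum(1 for g in gaps if g >= big_gap),
    }

# Toy input only: the "max" seasons above as (Plus/Minus, UZR) pairs.
sample = [(21, 6.4), (5, -11.4), (-5, 11.7), (20, 0.5),
          (-14, 3.9), (7, 22.3), (-5, -21.9)]
stats = compare_systems(sample)
```

In practice the function would be run once per position on every player-season at that position, not on a hand-picked list like this one.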

EDIT 4/13: Here are the figures for “qualified” players (minimum ~900 innings):

Range	r	r^2	MAE
1B	0.796	0.633	3.51
2B	0.802	0.643	4.73
3B	0.877	0.768	4.32
SS	0.754	0.568	5.82
LF	0.831	0.690	4.53
CF	0.735	0.541	5.48
RF	0.820	0.672	4.58

The average error looks better now. Shortstop and center field have the highest level of disagreement, while third base looks to have the best agreement.
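
Since both metrics are run totals rather than rates (see Colin Wyers’ comment below), the size of the error depends on how much playing time is in the sample. The sketch below shows the two adjustments discussed here: keeping only “qualified” seasons, and prorating each gap to a full season. The field names, the three rows of data, and the 150-game, 9-inning season length are all made up for illustration.

```python
def qualified(rows, min_innings=900):
    """Keep only player-seasons with at least min_innings in the field."""
    return [row for row in rows if row["innings"] >= min_innings]

def prorate(runs, innings, target_innings=150 * 9):
    """Scale a run total to a fixed amount of playing time (assumed 150 games)."""
    return runs * target_innings / innings

def mean_abs_gap(rows, prorated=False):
    """Mean absolute difference between the two systems, in runs."""
    gaps = []
    for row in rows:
        gap = abs(row["pm"] - row["uzr"])
        if prorated:
            gap = prorate(gap, row["innings"])
        gaps.append(gap)
    return sum(gaps) / len(gaps)

# Hypothetical player-seasons, not real data.
rows = [
    {"innings": 270.0, "pm": 1.0, "uzr": 0.0},    # part-timer, small raw gap
    {"innings": 1350.0, "pm": 8.0, "uzr": 3.0},   # full season
    {"innings": 900.0, "pm": -4.0, "uzr": -2.0},  # just qualifies
]

mae_all = mean_abs_gap(rows)                       # every player-season
mae_qualified = mean_abs_gap(qualified(rows))      # 900+ innings only
mae_prorated = mean_abs_gap(rows, prorated=True)   # scaled to 150 games
```

With toy numbers like these, the part-timer pulls the unfiltered MAE down, which is the same direction as the gap between the two range tables above.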

And, of course, the arm/double play ratings:

EDIT: David Appelman has pointed out that the arm ratings are not on the same scale, which throws the numbers off a bit.

DP/ARM	r	r^2	MAE	Max
2B	0.584	0.342	0.63	6.6
SS	0.774	0.600	0.41	3.5
LF	0.700	0.491	0.76	8.4
CF	0.742	0.551	0.86	10.5
RF	0.761	0.579	0.86	9.3

Oddly enough, for as much disagreement as there is at shortstop in terms of range, the exact opposite is true of double play ratings. Second base looks a bit fishy to me, but multiple tests have given the same result. Overall, we’re looking at an average error of under 1 run at every position, but some of the differences are striking. Aaron Rowand’s 2007 Plus/Minus arm rating was +11 compared to a UZR rating of +0.5, a gap of a full win of value. Richard Hidalgo received close to a win of value in 2004, and Carl Crawford was 8.4 runs better according to Plus/Minus in 2005. Robinson Cano’s 2007 Plus/Minus rated him as a +9 at turning the DP, while UZR suggested a more modest +2.4, and Michael Young’s 2006 Plus/Minus rated him as a +1, while UZR rated him as a -2.5.

While the two systems track each other quite well, it’s interesting to see some of the large discrepancies between them. I don’t know the exact reason for them; I’ll leave that to the Mitchel Lichtmans and the John Dewans of the world to discuss.** I get the feeling this subject is going to be heavily debated for quite a while, and it’ll be interesting to see the work that is to come.

*All information here from 2003-2009.

**You can read about the differences between the systems here.

Edit: Changed figures at 1:15 AM PST; I had overlooked ErrR in UZR’s range component. Mea maxima culpa.


8 Comments
Colin Wyers
13 years ago

Any cutoffs as far as playing time? And were you comparing DRS to UZR? Because that would include the Arm/DP ratings in the first comparison.

JT Jordan
13 years ago

No cutoff for playing time. As for the first comparison, I used the “range” portion of the metrics, so Arm and DP were excluded.

Colin Wyers
13 years ago

Probably should include Error runs in there, if you haven’t already.

You’re probably moderately overreporting the correlation and dramatically understating the average error, then.

Remember, UZR and Plus/Minus can be boiled down to:

Rate * Opportunity

The opportunity component is the same (or pretty close) between systems. So that’s “inflating” the correlation a little.

And, since these aren’t rates, the MAE is reported based upon the average player in your sample, who has very little playing time (about 30 games worth, give or take). Prorate everything out to 150 games, and you’re probably looking at an MAE of about 6-7.

JT Jordan
13 years ago

That’s a very good point.

When I get the chance, I’ll re-do it with the “qualified” players and incorporate Error Runs into UZR’s Range portion.

JT Jordan
13 years ago

The “range” portion has been fixed. I need to get some shuteye, and will take care of the rest (hopefully!) soon.

dkappelman
13 years ago

You should know that the ARM ratings for Plus/Minus and UZR are not on the same scale. ARM for Plus/Minus is not zeroed out, and there is an excess of some +250 runs each year.

JT Jordan
13 years ago

Thanks for pointing that out, David. I’m assuming those arm ratings are absolute runs saved. Any chance they’ll be normalized in the future?

JT Jordan
13 years ago

Updated with qualified players.