Introducing PrOPS

Have you ever followed a team and noticed players who seem to be continually lucky or unlucky? Maybe there’s a player who continually dinks the ball between the infield and the outfield. And what about the guy who’s spraying liners all over the field but can’t crack the Mendoza line, because there always seems to be a defender in the right spot? There’s a good chance that most of the noisy outs and swinging bunts come out in the wash by season’s end (though they may not entirely cancel out for every player), but the numbers we look at now to evaluate hitters, such as OPS, are tainted by luck. And while there is no such thing as a perfect metric to evaluate how well a player is playing absent luck, I think there may be a way to get a better grasp on how well players are performing than just relying on the standard raw statistics.

Early in the season, things haven’t had a chance to even out yet. We see historically poor players putting up good numbers (Brian Roberts), and historically good players putting up bad numbers (Bernie Williams). How much of a player’s April OPS is a product of chance, and how much reflects the quality of play? Unfortunately, most of the data we use to evaluate players is based on scorebook outcomes (single, double, walk, etc.), and therefore the numbers themselves reflect both random chance and the quality of play. A double gained on an outfielder’s untied shoes counts the same in the stats as a liner to the gap. And we would expect players who hit the latter type of double to be better than those reaping the benefits of funny bounces and bad personnel decisions. The recent influx of new data provided by Baseball Info Solutions through The Hardball Times provides a possible way to separate good/bad play from lucky/unlucky outcomes.

I set out to estimate the impact of certain areas of player performance on season OPS using the 2004 season. In particular, I was curious about the types of batted balls (line drives, flyballs, etc.) players were hitting. Is there a correlation between these variables and hitting success? If so, maybe we can learn something about how “good” players are actually playing by looking at this data. To begin, I looked at several different combinations of variables to find the model that best predicted player OPS in 2004 using linear regression. This model uses estimated weights of hitting performances that are not necessarily officially scored outcomes to generate a predicted OPS, or PrOPS. With this model I can evaluate players by the process through which they reached these outcomes and thereby, I hope, separate useful information from the noise of raw statistics.

The model that best predicted a player’s OPS in 2004 included the following variables:

  • Line drives per batted ball
  • Groundball-to-flyball ratio
  • Walk rate
  • Hit-by-pitch rate
  • Strikeout rate
  • Home run rate
  • Home park of the player
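
For readers who want to see the mechanics, here is a minimal sketch of a regression of this shape. It is not the actual PrOPS model: the data are randomly generated, the "true" weights are invented only to produce a synthetic OPS, and every variable name is an assumption. It simply shows how the variables listed above, with the home park entered as dummy columns, can be fed to ordinary least squares to produce a predicted OPS and an R-squared.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players, n_parks = 200, 5

# Synthetic peripherals (illustrative only, not real 2004 data).
ld_rate = rng.normal(0.19, 0.03, n_players)    # line drives per batted ball
gb_fb = rng.normal(1.2, 0.4, n_players)        # groundball-to-flyball ratio
bb_rate = rng.normal(0.09, 0.03, n_players)    # walks per plate appearance
hbp_rate = rng.normal(0.01, 0.005, n_players)  # hit-by-pitch per plate appearance
k_rate = rng.normal(0.17, 0.05, n_players)     # strikeouts per plate appearance
hr_rate = rng.normal(0.03, 0.015, n_players)   # home runs per plate appearance
park = rng.integers(0, n_parks, n_players)     # home park index (hypothetical)

# One dummy column per park except park 0, which serves as the baseline.
park_dummies = (park[:, None] == np.arange(1, n_parks)).astype(float)

# Design matrix: intercept, six peripherals, four park dummies (11 columns).
X = np.column_stack([np.ones(n_players), ld_rate, gb_fb, bb_rate,
                     hbp_rate, k_rate, hr_rate, park_dummies])

# Invented weights, used only to generate a synthetic OPS with noise.
true_w = np.array([0.45, 1.0, -0.03, 1.2, 1.0, -0.5, 5.0,
                   0.01, -0.01, 0.02, 0.0])
ops = X @ true_w + rng.normal(0, 0.03, n_players)

# Ordinary least squares: weights that minimize squared error in OPS.
weights, *_ = np.linalg.lstsq(X, ops, rcond=None)
props = X @ weights  # predicted OPS for each synthetic player

# R-squared: share of the variance in OPS explained by the model.
ss_res = np.sum((ops - props) ** 2)
ss_tot = np.sum((ops - ops.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"R-squared: {r_squared:.2f}")
```

Dropping one park as the baseline keeps the dummy columns from duplicating the intercept; each remaining park weight is then read as a shift relative to that baseline park.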

While many of these variables are official scorebook outcomes, we know that players do have skills in these areas, and that these skills translate directly and indirectly into a player’s OPS. I am most concerned with the random bounces of batted balls in play, which is why I included line drives and the groundball-to-flyball ratio in the model. It turns out that these variables are important in predicting a hitter’s OPS. The R² of the overall regression model was .81, which indicates that about 80% of the differences in OPS from player to player were explained by differences in the included variables. And although we think of luck as canceling out over the course of a season, it does not do so for everyone. Here are lists of the top-25 under/over-performers of 2004, measured as a percent of the player’s actual OPS (minimum 400 plate appearances).

OPS: Actual OPS for 2004
PrOPS: Predicted OPS
PrOPS+: The raw difference between OPS and PrOPS (OPS minus PrOPS). A positive PrOPS+ indicates observed performance better than predicted, while a negative PrOPS+ indicates observed performance worse than predicted.
PrOPS%: The difference between OPS and PrOPS expressed as a percent of OPS.
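
As a quick worked example of these two definitions, the sketch below computes both measures from the OPS and PrOPS values printed in the 2004 tables for three players and sorts them from most under-performing to most over-performing. The function names are made up for illustration. Because the printed values are rounded to three decimals, the results can differ slightly in the last digit from the table columns, which were presumably computed from unrounded numbers.

```python
def props_plus(ops, props):
    """Raw gap: actual OPS minus predicted OPS."""
    return ops - props


def props_pct(ops, props):
    """The same gap expressed as a percentage of actual OPS."""
    return 100 * (ops - props) / ops


# (name, OPS, PrOPS) as printed, rounded to three decimals, in the 2004 tables.
players = [
    ("Desi Relaford", 0.601, 0.708),
    ("J.T. Snow", 0.958, 0.846),
    ("Barry Bonds", 1.422, 1.537),
]

# Most negative PrOPS% first (under-performers), most positive last.
ranked = sorted(players, key=lambda p: props_pct(p[1], p[2]))
for name, ops, props in ranked:
    print(f"{name:15s} {props_plus(ops, props):+.3f} {props_pct(ops, props):+.2f}%")
```

Ranking by PrOPS% rather than PrOPS+ is what puts Relaford ahead of Bonds: Bonds has the larger raw gap, but it is a smaller share of his much higher OPS.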

2004 Top-25 Under-Performers

Rank    First   Last            OPS     PrOPS   PrOPS+  PrOPS%

1       Desi    Relaford        0.601   0.708   -0.106  -17.72%
2       Scott   Spiezio         0.634   0.740   -0.105  -16.63%
3       Rafael  Palmeiro        0.796   0.898   -0.102  -12.86%
4       Jason   Phillips        0.624   0.702   -0.079  -12.66%
5       Brad    Ausmus          0.631   0.704   -0.073  -11.52%
6       David   Eckstein        0.671   0.737   -0.066   -9.80%
7       Chipper Jones           0.847   0.930   -0.083   -9.79%
8       Tony    Batista         0.726   0.793   -0.066   -9.11%
9       Joe     Crede           0.717   0.781   -0.064   -8.91%
10      Barry   Bonds           1.422   1.537   -0.115   -8.11%
11      Rob     Mackowiak       0.739   0.799   -0.060   -8.05%
12      Craig   Counsell        0.648   0.700   -0.052   -8.04%
13      Jose    Castillo        0.665   0.718   -0.053   -7.95%
14      Aaron   Miles           0.697   0.751   -0.054   -7.81%
15      Placido Polanco         0.786   0.847   -0.061   -7.72%
16      Steve   Finley          0.828   0.891   -0.063   -7.63%
17      Ramon   Hernandez       0.818   0.879   -0.062   -7.56%
18      A.J.    Pierzynski      0.729   0.783   -0.055   -7.50%
19      Toby    Hall            0.666   0.716   -0.050   -7.45%
20      Alex    Gonzalez        0.689   0.739   -0.050   -7.24%
21      Dmitri  Young           0.816   0.875   -0.059   -7.21%
22      Sammy   Sosa            0.849   0.909   -0.060   -7.03%
23      Bill    Mueller         0.811   0.868   -0.056   -6.93%
24      Orlando Cabrera         0.631   0.672   -0.042   -6.60%
25      Matt    Lawton          0.787   0.839   -0.052   -6.59%

2004 Top-25 Over-Performers

Rank    First   Last            OPS     PrOPS   PrOPS+  PrOPS%

1       J.T.    Snow            0.958   0.846   0.112   11.72%
2       Ichiro  Suzuki          0.869   0.774   0.095   10.93%
3       Melvin  Mora            0.981   0.895   0.086    8.74%
4       Jack    Wilson          0.794   0.728   0.066    8.34%
5       Erubiel Durazo          0.919   0.842   0.076    8.30%
6       Aaron   Rowand          0.905   0.830   0.075    8.26%
7       Lyle    Overbay         0.862   0.792   0.070    8.17%
8       Todd    Helton          1.088   1.003   0.085    7.84%
9       David   Newhan          0.814   0.753   0.062    7.57%
10      Carlos  Guillen         0.921   0.853   0.069    7.46%
11      Travis  Hafner          0.993   0.919   0.074    7.42%
12      Mark    Loretta         0.886   0.822   0.064    7.21%
13      Lance   Berkman         1.016   0.944   0.072    7.07%
14      Chone   Figgins         0.770   0.717   0.052    6.79%
15      Juan    Rivera          0.828   0.772   0.056    6.76%
16      Alexis  Rios            0.720   0.674   0.047    6.49%
17      Carl    Crawford        0.781   0.732   0.050    6.34%
18      Ivan    Rodriguez       0.893   0.837   0.056    6.23%
19      Jimmy   Rollins         0.803   0.753   0.050    6.19%
20      Ray     Durham          0.848   0.798   0.050    5.89%
21      Jason   Bay             0.907   0.855   0.052    5.78%
22      Joe     Randa           0.751   0.708   0.043    5.76%
23      Juan    Uribe           0.833   0.786   0.046    5.58%
24      Bobby   Abreu           0.971   0.918   0.053    5.50%
25      Albert  Pujols          1.072   1.013   0.059    5.49%

Desi Relaford wins the award for worst luck of 2004, while J.T. Snow had the best luck. Now, when I say “luck” I want to be clear as to what I mean. PrOPS tells us the OPS that MLB players with the same values of the variables in the regression model posted, on average. You can think of PrOPS as similar to DIPS for pitchers. It is entirely possible that some of these players got lucky with hitting line drives, striking out, etc.; however, given their actual numbers for these events, we would have expected them to perform much differently.

Now that I have established a baseline impact for the batting statistics on player OPS, I can apply the model to 2005. With only about a fifth of games finished for the season, it’s much less likely that good and bad bounces have had time to even out. The model can help us pull out how well players are actually playing by removing some luck. I’m not saying this is perfect, and players may be getting lucky with their hit types, but it’s all we’ve got to work with at the moment.
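
To put a rough number on how much less evened-out things are at this point, here is a small sketch under a deliberately simple assumption: every at-bat is an independent trial with the same chance of a hit. Real at-bats are not that tidy, and the .300 hitter, the 120 at-bats (about a fifth of a season), and the 600 at-bats (about a full season) are all illustrative figures rather than anything taken from the data.

```python
import math


def rate_standard_error(true_rate, trials):
    """Standard error of an observed rate under a simple binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / trials)


early = rate_standard_error(0.300, 120)  # roughly a fifth of a season
full = rate_standard_error(0.300, 600)   # roughly a full season
print(f"early: {early:.3f}, full: {full:.3f}, ratio: {early / full:.2f}")
```

Under this assumption the chance-driven wobble in an observed rate is about 2.2 times larger (the square root of 5) after a fifth of the season than after a full one, which is why early-season raw numbers deserve extra suspicion.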

Now, let’s use the model to tell us what OPS a player ought to have in the current season based on their hitting peripherals. I have calculated PrOPS stats for every MLB player with at least one plate appearance. You can view the stats by team for the AL and NL. Here are the lists of the top under/over performers for 2005.

2005 Top-25 Under-Performers

Rank    First   Last            OPS     PrOPS   PrOPS+  PrOPS%

1       Tike    Redman          0.454   0.806   -0.353  -77.75%
2       Aaron   Boone           0.457   0.751   -0.294  -64.44%
3       Jose    Molina          0.470   0.770   -0.300  -63.82%
4       Luis    Rivas           0.477   0.769   -0.293  -61.35%
5       Miguel  Olivo           0.362   0.568   -0.206  -56.77%
6       Nomar   Garciaparra     0.405   0.619   -0.215  -53.13%
7       Keith   Ginter          0.554   0.842   -0.288  -51.96%
8       Wilson  Valdez          0.466   0.685   -0.219  -46.94%
9       Jay     Payton          0.579   0.851   -0.272  -46.91%
10      Jack    Wilson          0.466   0.675   -0.209  -44.80%
11      James   Hardy           0.457   0.639   -0.182  -39.74%
12      J.D.    Closser         0.470   0.649   -0.179  -38.15%
13      Ty      Wigginton       0.523   0.703   -0.180  -34.48%
14      John    Buck            0.474   0.637   -0.163  -34.39%
15      Quinton McCracken       0.515   0.682   -0.166  -32.31%
16      Jason   Kendall         0.573   0.754   -0.180  -31.48%
17      Jose    Hernandez       0.567   0.741   -0.174  -30.66%
18      Richard Hidalgo         0.562   0.730   -0.169  -30.00%
19      Yadier  Molina          0.475   0.616   -0.140  -29.53%
20      Placido Polanco         0.592   0.766   -0.174  -29.43%
21      Jason   LaRue           0.536   0.685   -0.149  -27.79%
22      Casey   Blake           0.669   0.851   -0.183  -27.30%
23      Marcus  Thames          0.654   0.833   -0.178  -27.27%
24      Eric    Byrnes          0.633   0.803   -0.170  -26.82%
25      Brad    Ausmus          0.567   0.710   -0.143  -25.26%

I would look for these guys to rebound. It’s not that we really needed a regression model to tell us this, but we can see that players who put up similar peripherals in 2004 performed much better than these guys have shown in their stats so far this season. So hang in there, Tike, because better days are coming if you keep hitting like you have been.

What about players who may be headed for a fall?

2005 Top-25 Over-Performers

Rank    First   Last            OPS     PrOPS   PrOPS+  PrOPS%

1       Jason   Ellison         1.005   0.754   0.251   24.99%
2       Carlos  Guillen         1.015   0.772   0.243   23.97%
3       Bill    Hall            0.796   0.622   0.174   21.84%
4       Brad    Wilkerson       0.851   0.678   0.173   20.32%
5       Alex    Sanchez         0.829   0.669   0.160   19.32%
6       Ryan    Freel           0.917   0.745   0.172   18.78%
7       Ricky   Ledee           0.907   0.740   0.167   18.46%
8       Craig   Biggio          0.873   0.716   0.156   17.89%
9       Vinny   Castilla        0.901   0.745   0.156   17.27%
10      Shea    Hillenbrand     0.894   0.747   0.147   16.48%
11      Derrek  Lee             1.224   1.022   0.202   16.47%
12      Justin  Morneau         1.289   1.077   0.212   16.44%
13      Freddy  Sanchez         0.787   0.659   0.127   16.17%
14      Carlos  Beltran         0.881   0.752   0.129   14.69%
15      Rob     Mackowiak       0.770   0.659   0.110   14.31%
16      Nook    Logan           0.816   0.700   0.115   14.12%
17      Kenny   Lofton          0.938   0.806   0.132   14.12%
18      Cliff   Floyd           1.047   0.900   0.146   13.99%
19      Mike    Sweeney         1.016   0.878   0.138   13.56%
20      Brandon Inge            0.877   0.760   0.118   13.39%
21      Nick    Johnson         0.924   0.801   0.124   13.38%
22      Frank   Catalanotto     0.768   0.666   0.101   13.19%
23      Clint   Barmes          1.082   0.940   0.142   13.16%
24      Jacque  Jones           0.990   0.860   0.130   13.09%
25      Chipper Jones           1.111   0.967   0.145   13.01%

Jason Ellison, with his .500 BABIP, certainly won’t continue at his same pace. And when Alex Sanchez returns to being Alex Sanchez by the All-Star break, let’s not say it was the steroids wearing off. These are the guys you may want to try to unload in your fantasy league.

What about two guys who are off to hot and cold starts, Brian Roberts and Bernie Williams? Are these just statistical anomalies? Well, they might be, but they would have to be anomalies in the batting peripherals that go into the model. Roberts is putting up numbers a little better than predicted (OPS = 1.111, PrOPS = 1.058), but he’s still playing well. Bernie is playing poorly (PrOPS = 0.703), but not as badly as his stats show (OPS = 0.607).

Feel free to take a look around on the stats pages and tell me what you see. This is all new stuff, and I will make changes to the model based on any discoveries people make. Personally, I’m just happy to see Johnny Estrada is due for a little rebound (PrOPS = 0.746 versus OPS = 0.632).

