# Ten things I didn’t know last week

###### The best sluggers of all time

A couple of years ago, I reviewed a book called Baseball’s All-Time Best Sluggers. Written by a professor of biostatistics named Michael Schell, it’s filled with a lot of very cool mathematical concepts and adjustments. This book isn’t for the math-faint-of-heart, but you can learn a ton from reading it—see my linked article for some examples.

I was reminded of Schell’s book when I read an online debate last winter about whether Jim Rice or Albert Belle was a better hitter. The comparison is an interesting one: two great hitters with less-than-stellar gloves in left field. But comparing hitters across eras (and baseball was very different in the 1970s than it was in the 1990s) is a tricky proposition, and OPS+ (available from Baseball Reference) is a crude tool for the task.

So I thought to myself, “Hey, they should read Schell’s book!” (I always yell at myself.) Although I don’t think that you can compare players across eras with 100 percent certainty, you should at least do the best job you can. And Schell’s approach is the best I’ve read. Unfortunately, his book wasn’t widely read, and his all-time ranking of baseball’s hitters wasn’t available anywhere else…

…until now. Michael Schell and his publisher (Princeton University Press) have given The Hardball Times permission to post his rankings on our site, and I decided to apply my developing PHP/MySQL skills to the task. So now you have it:

The database includes the 1,140 batters who had at least 4,000 plate appearances through 2003 (yes, the data is current only through 2003). Babe Ruth is first and Tommy Thevenow is last. I’ll go over a few more in this column. The rest are there for you to discover.

###### How Rice and Belle compare

So, how do Rice and Belle compare? Here is a reprint of the results you’ll find in our database; the stats represent a “seasonal line” for each batter, based on playing in a neutral park between 1977 and 1992 (the most stable era in baseball history, according to Schell). The stats also reflect each player’s longevity.

```
Player       POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Albert Belle OF     74   27   94   12 .286 .356 .522 23.5  126
Jim    Rice  OF     73   22   83    9 .288 .345 .475 20.9  216
```

“CBR” stands for “Career Batter Rating,” and it represents the number of runs above average that a player would have generated—similar to Linear Weights and Batting Runs. The player’s rank is based on CBR, adjusted for position.

Belle has the better ranking, and by a decent margin. The two have similar batting averages, but Belle is 11 points better in OBP and almost 50 points better in SLG. On Schell’s list, Belle is sandwiched between Wally Berger and Ken Singleton. Rice is lower, between Sammy Sosa and second baseman/outfielder Danny Murphy.

###### Other pairings

Our John Brattain suggested comparing Rice to former Astro Jimmy Wynn (who was cursed with having to bat in the Astrodome). Here you go:

```
Player        POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Jimmy  Wynn   OF     80   25   70   30 .253 .372 .460 22.0  136
Jim    Rice   OF     73   22   83    9 .288 .345 .475 20.9  216
```

Wynn was almost as fine a batter as Albert Belle. Wynn’s projected batting average is relatively low, but his OBP is significantly better than either Rice’s or Belle’s.

And Joe Dimino suggested…

```
Player          POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Albert Belle    OF     74   27   94   12 .286 .356 .522 23.5  126
Ralph  Kiner    OF     78   32   81    4 .269 .376 .512 24.5  134
```

Joe thought that Belle and Ralph Kiner were similar types of batters, and he was right. As you can see, these two sluggers are only a run apart in CBR. Belle had the better BA, but Kiner walked more often and hit more home runs, making them fairly even in CBR.

Chris Jaffe also chimed in…

Roger Connor was a big first baseman for his time (his time being the 1880s and 1890s). In fact, Connor was one of the original “New York Giants,” and he held the record for most career home runs until Babe Ruth came along. Mark McGwire was also a big guy for his time. Chris wanted to know how the two compare in Schell’s system.

```
Player         POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Roger Connor   1B     77   21   76    5 .298 .394 .514 37.4   23
Mark  McGwire  1B     82   41   98    2 .263 .387 .556 37.2   40
```

Connor wouldn’t be a big-time home run hitter, but he’d have 21 home runs with a .298/.394/.514 line. That’s nothing to sneeze at, and he’s rated the 23rd-best batter of all time. McGwire’s home run total would be second only to the Babe’s.

###### Driving in runs doesn’t make you a great hitter

So, one of the fun things you can do with the database is rank the players by different stats. We all know that Runs Batted In, though a decent stat, can be misleading. I took a look at who would have had the highest RBI totals, and found a couple of striking anomalies:

```
Player        POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Dave Kingman  OF     67   33   91   12 .237 .306 .481 10.6  605
Dick Stuart   1B     62   27   90    0 .258 .314 .470  6.3  674
```

Kingman’s RBI total is the 26th-best among Schell’s batters, and Stuart’s is the 30th-best, but these two (you could call them “Strange Glove” and “Strange Man”) are nowhere near the best hitters of all time. In fact, they don’t even crack the top half.

###### Who is Jack Smith?

So then I ranked players by Runs Scored and came across someone I had never noticed before.

```
Player      POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Jack Smith  OF     84   10   43   39 .265 .332 .382  0.9 1033
```

Jack Smith, an outfielder for the Cardinals and Braves from 1915 to 1929, has a very strange profile. His projected Runs Scored total of 84 ranks 42nd among all batters, but as a hitter he’s only 1,033rd overall. Very strange.

If you compare a batter’s Runs Created total to his Runs Scored, usually you find a pretty good match. For instance, Rickey Henderson created 2,164 runs and scored 2,295. Leadoff men have an obvious advantage—Vince Coleman created 688 runs and scored 849, a pretty big difference. Jack Smith, though, created 595 runs but scored 783, a difference of 188 runs. That may be the biggest difference in baseball history.
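For anyone who wants to check the gaps, here is a minimal Python sketch. It uses only the three career totals quoted above; the names `careers` and `gaps` are just illustrative labels, and the percentage column is simply the gap divided by Runs Created.

```python
# Career (Runs Created, Runs Scored) totals as quoted in the text above.
careers = {
    "Rickey Henderson": (2164, 2295),
    "Vince Coleman": (688, 849),
    "Jack Smith": (595, 783),
}

# Gap = runs scored minus runs created, for each player.
gaps = {name: scored - created for name, (created, scored) in careers.items()}

for name, (created, scored) in careers.items():
    gap = gaps[name]
    # Express the gap as a share of Runs Created so careers of different
    # lengths can be compared on the same footing.
    print(f"{name:17s} gap {gap:4d}  ({gap / created:.0%} of runs created)")
```

Measured as a share of Runs Created, Smith’s gap is larger than Coleman’s as well as larger in raw runs.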

The comparison of Smith and Coleman is apt. In Schell’s system, Smith steals 39 bases (and Coleman steals 73), but both have very low OBP and SLG. In fact, Coleman is worse on both counts and ranks even lower than Smith (1,119th overall). And, of course, both were Cardinals for most of their careers.

In Smith’s case, I’m thinking that batting leadoff in front of Rogers Hornsby helped a lot.

###### Empty OBPs

OBP is supposedly the best batting stat, right? A high OBP means fewer outs created, right? Well, there is such a thing as an “empty” OBP, at least on a relative basis.

```
Player        POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Roy  Thomas   OF     80    3   27   13 .279 .418 .338 12.5  413
```

Roy Thomas has the ninth-best OBP in Schell’s system, but he’s only the 413th-best batter overall. A Philly outfielder from 1899 to 1911, the guy did nothing but hit singles and walk. If he played a full season, he walked over a hundred times, guaranteed. In fact, he led the league in walks seven times! He was the original “Walking Man.”

In fact, let’s compare Thomas with the guy whose nickname was “the Walking Man”:

```
Player        POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Eddie Yost    3B     71   10   38   13 .250 .371 .365   12  292
Roy   Thomas  OF     80    3   27   13 .279 .418 .338 12.5  413
```

Well, lookie there. Thomas beats Eddie Yost in OBP and walk rate. Yost rates higher overall, because he could actually hit the ball kind of hard.

###### Chipper is third at third… at least

Remembering that these rankings include only seasons through 2003, let’s look at the top third basemen:

```
Player           POS  Runs   HR  RBI   SB   BA  OBP  SLG  CBR Rank
Mike    Schmidt  3B     84   35   87   18 .267 .381 .530 39.1   15
Eddie   Mathews  3B     87   30   80   11 .269 .387 .505 36.2   17
George  Brett    3B     78   18   77   20 .306 .381 .501 33.6   30
Wade    Boggs    3B     77    5   47    3 .322 .416 .426 31.2   32
Chipper Jones    3B     80   21   76   15 .297 .391 .486 23.0   69
```

Schmidt and Mathews have a decent lead over the other third basemen. But what about the guy who was already the fifth-ranked third baseman of all time in 2003? Well, Schell published a little update to his book that incorporated the 2004 season. In that chapter, he projected that Chipper Jones would finish around 22nd overall, making him the third-best third baseman of all time.

However, Jones continues to defy Father Time and, as I type, he’s batting .415/.475/.683. Is it possible that he will pass Schmidt and Mathews? Make a note: The best-hitting third baseman of all time might be playing in Atlanta right now.

###### How likely it is to catch two foul balls in a row

On to other baseballness. Did you catch the recent story about two guys, standing next to each other, catching two foul balls in a row? Really, what is the probability of that? Well, one of the foul-catchers (see how much of a difference one little hyphen can make?) said, “It’s got to be one in 10 million.” But the article quotes a math professor stating that the probability is more like one in 10,000.

Big diff. Can we do better?

Let’s play with some assumptions. Let’s say that there are 30 foul balls hit into the stands each game, that there are 300 pitches in a game, that 30,000 people attend a game, and that about 33 percent of them sit in an area of the stands with a decent chance of catching a foul ball. The first three figures are all pretty close to the actual numbers, and the last figure of 33 percent seems reasonable to me. And all the threes look neat together.

So the probability of a fan (who sits in the eligible area) catching a foul ball at a game is 30/10,000, or 1 out of every 333 fans (assuming no one catches more than one), or 0.3 percent. The probability of a specific person catching two foul balls at the same game is 0.3 percent times 0.3 percent, or 0.0009 percent—one out of 111,111 times that person goes to a ballgame (assuming he sits in the eligible area each time).
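The per-game arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope check under the assumptions already stated (30 fouls into the stands, 30,000 fans, a third of them within range). The variable names are mine, and squaring the single-catch probability treats the two catches as independent events.

```python
# Per-game check, using the assumptions stated in the text.
FOULS_PER_GAME = 30      # foul balls hit into the stands
ATTENDANCE = 30_000      # fans at the game
ELIGIBLE_DIVISOR = 3     # about a third of fans sit where fouls land

eligible_fans = ATTENDANCE // ELIGIBLE_DIVISOR   # 10,000 fans
p_one_catch = FOULS_PER_GAME / eligible_fans     # chance a given fan catches one
p_two_catches = p_one_catch ** 2                 # assumes the catches are independent

print(f"one catch in a game:   1 in {1 / p_one_catch:,.0f}")
print(f"two catches in a game: 1 in {1 / p_two_catches:,.0f}")
```

The two lines it prints should land on roughly 1 in 333 and 1 in 111,111, the same figures as above.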

Since 10 percent of pitches are foul balls into the stands, the probability of a specific (eligible) person catching a foul ball on a specific pitch is basically .1/10,000, or one in 100,000 fan/pitch instances. If you square that, the probability of a specific fan catching a foul ball on two specific pitches is one in 10 billion.

Now, we’re talking about two fans next to each other, and I’m not quite sure how to handle that math. But, to keep it simple, I’ll cut the fan base in half, or .1/5,000 squared, which is one in 2.5 billion.

Wow. That’s a huge number. The obvious problem here is that some parts of the stands are MUCH more likely to have a foul ball hit to them than others. The tighter you make the foul area, the higher the probability. For instance, if you say that 20 foul balls per game go to areas in which only 10 percent of fans might catch them, you get a probability of “only” one in 506 million.
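As a rough check on the per-pitch figures, here is a minimal Python sketch. The function name `one_in` and its parameters are my own labels. It assumes the two catches are independent and that every eligible fan is equally likely to catch a given foul, which is exactly the simplification questioned above. The last case halves the 3,000 eligible fans to 1,500, since that is the reading that reproduces the 506 million figure.

```python
def one_in(fouls_per_game, pitches_per_game, eligible_fans):
    """Return N such that a given fan catches fouls on two given pitches 1 time in N."""
    p_foul_per_pitch = fouls_per_game / pitches_per_game   # share of pitches fouled into the stands
    p_catch_per_pitch = p_foul_per_pitch / eligible_fans   # one specific fan, one specific pitch
    return 1 / p_catch_per_pitch ** 2                      # two specific pitches, assumed independent

scenarios = {
    "specific fan, 10,000 eligible": one_in(30, 300, 10_000),
    "fan base cut in half (5,000)": one_in(30, 300, 5_000),
    "20 fouls, 10% of fans, halved (1,500)": one_in(20, 300, 1_500),
}

for label, n in scenarios.items():
    print(f"{label}: one in {n:,.0f}")
```

Shrinking the eligible area matters far more than trimming the foul count, because the number of fans enters the result squared.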

I got the opinion of my favorite LA Dodger fan (who also happens to be an econometrician, which is sort of like a mathematical economist). Here’s his response.

But wait, there’s more. Was the game in question a day game on Wednesday against the Mets? Was the pitcher John Maine and the batter James Loney? If so, then I was there, and I was in the section into which the balls were being hit (Loge third-base side). And here’s what you’re missing: Maine’s fastball that day was much quicker than usual, and the Dodger hitters were having trouble getting around on it. Loney is terrific at fouling off balls, and he’s a lefty, so most of his fouls head toward the third-base side. During that at-bat, Loney fouled something like four or five more or less consecutive shots into the stands within 50 feet of my seats!

So the real number is much, much lower than 506 million, which means the real probability is much higher. How low is it? Dunno. All I know is that you’ve got to be careful about applying general probability models to specific instances.

###### Fangraphs is king

Fangraphs now has WPA graphs and play logs of every game from 1974 through 1988. Seriously, how amazing is that? I can get lost in those graphs. Red Sox and Yankee fans probably remember this game:

To even out the karma, here’s a classic game from 1984. The Red Sox were losing 6-3 in the ninth, but run-scoring singles by Jim Rice and Bill Buckner, followed by a three-run home run by Reid Nichols, drove the Sox to a 9-6 victory.

Check out the game graphs yourself. It’s more fun than a barrel of Marcels.