The biggest error in my book

Nuts.

I knew this was going to happen. It’s unavoidable in a way, but that doesn’t mean it bugs me any less. OK – so what in Hades am I talking about?

Well, as some of you out there in reader-land hopefully know, I recently wrote a book that I am quite proud of, Evaluating Baseball’s Managers, 1876-2008. (Buy a copy and find out why Craig Wright of The Diamond Appraised said it’s “A wonderful work of analytical and historical research.”)

Recently I did a little bit of digging through Retrosheet and uncovered a mistake in the book. That in and of itself isn’t too terrible. Heck, at my (utterly dormant) blog for the book I uncovered a slew of errors in it when it first came out. This is different, though. This isn’t just an error; it’s an error that actually screws up a point I made in the book.

Errors: acceptable (to an extent) and otherwise

From my point of view, not all errors are created equal. At the bottom, you have grammatical errors. These are bad and I don’t mean to dismiss them, but at least you can normally figure out what they mean. For example, on page 226, I wrote: “Team owner Wayne Huzienga, upset that local taxpayers refused to grant him he a new stadium deal” Um ... “him he” – that obviously isn’t right. Then again, you can figure it out.

Beyond that, you have some factual errors. This obviously matters as saying something that ain’t right – well that isn’t a very good practice. This, however, doesn’t bother me quite as much as one might suspect. It’s not that I’m pro-error or anything like that. Instead, it relates to why the info is in the book in the first place. In the large majority of instances, I’m not just throwing out facts just for the sake of throwing out facts. I’m usually using the facts to make a larger point.

Let me explain with an example. Randomly flipping open my book for a second – on page 197 I note that Earl Weaver’s Orioles drew 8,131 walks from 1969 and no other AL team had more than 7,500 in that period. Suppose I messed up my math and the Orioles only drew 8,031 walks in that period. Well ... OK. I can live with that – as long as the main point holds true. That’s the key: As long as the actual point I’m making isn’t endangered, I can live with things.

Which leads me to the problem I recently noted. I made a boo-boo. On the face of it, my mistake was rather small and insignificant – except that it had implications, and implications of its implications. And as these things snowball, it does in fact jeopardize a point I made in the book. Somewhere along the line it crossed over from merely annoying and vexing to downright botched and mortifying.

Ultimately, I think I can reconcile what’s in the book with the actual facts, but to do that I have to first explore what went wrong in the book.

Where it went wrong: A factual error and John McNamara

The John McNamara commentary is one of the shorter ones in the book. I make two main points about him: 1) he had pretty bad bullpens, and 2) his record in extra-inning contests was horrible. Specifically, his W-L record in overtime was 94-139, which works out to a dreadful .403 winning percentage, lowest of any manager in the book for whom I had this info.

Well, it would be the lowest record in the book – if it were accurate. It isn’t quite. I recently reckoned that his record was actually 104-133 in regular-season extra-inning games. That .439 winning percentage still stinks, but not as badly as I said in my book.
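For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two winning percentages quoted above. The helper name `win_pct` is just an illustration, and it assumes the usual convention of wins divided by total decisions.

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning percentage: wins divided by total decisions."""
    return wins / (wins + losses)


# McNamara's extra-inning record as printed in the book,
# and as recounted from the Retrosheet game logs.
book = (94, 139)
corrected = (104, 133)

for label, (wins, losses) in (("book", book), ("corrected", corrected)):
    decisions = wins + losses
    print(f"{label}: {wins}-{losses} in {decisions} games -> {win_pct(wins, losses):.3f}")
```

Note that the two records don’t even cover the same number of games (233 decisions versus 237), which fits the idea that the original tally went astray while slicing up partial seasons.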

Before looking at the broader problems, let’s just pause here for a second: How did I make this mistake and what makes me think I got it right now?

In my book, when compiling extra-inning W-L records, I used Baseball-Reference.com’s expanded standings, which include those records. Then I just added them up from there. With McNamara, it was a little trickier because he worked partial seasons six different times. On those occasions, I had to isolate the appropriate parts of the seasons and check W-L records in them alone. My hunch is that the problem emerged here, especially with the 1996 Angels, where he worked two non-consecutive stretches in the season.

At any rate, some time last year I dumped every gamelog from Retrosheet into Excel. This wasn’t for the managers book, but because it could come in handy in a host of various projects (including my recent series on doubleheaders and my SABR presentation not so long ago). Later on, I finally put managers to every game, and recently I finally bothered to check out W-L records for all managers. Upshot: If there’s a difference between my current numbers and those in the book, I trust my current numbers.
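The spreadsheet work itself isn’t shown here, but the general idea (attach a manager to every game, then count extra-inning wins and losses) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the `Game` and `Stint` record layouts, the function names, and the toy teams and managers are all invented, and none of it reflects Retrosheet’s real file format or any real results. The toy data deliberately gives one manager two non-consecutive stretches in a season, the situation that makes season-level standings unreliable.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Game:  # hypothetical, simplified game-log row
    day: date
    home: str
    away: str
    home_runs: int
    away_runs: int
    innings: int


@dataclass(frozen=True)
class Stint:  # hypothetical: one continuous stretch managing one team
    manager: str
    team: str
    start: date
    end: date  # inclusive


def manager_for(team, day, stints):
    """Return whoever managed `team` on `day`, or None if no stint matches."""
    for s in stints:
        if s.team == team and s.start <= day <= s.end:
            return s.manager
    return None


def extra_inning_records(games, stints):
    """Tally (wins, losses) per manager in games that went past nine innings."""
    tallies = defaultdict(lambda: [0, 0])
    for g in games:
        if g.innings <= 9 or g.home_runs == g.away_runs:
            continue  # skip regulation games and ties
        home_won = g.home_runs > g.away_runs
        for team, won in ((g.home, home_won), (g.away, not home_won)):
            mgr = manager_for(team, g.day, stints)
            if mgr is not None:
                tallies[mgr][0 if won else 1] += 1
    return {mgr: tuple(wl) for mgr, wl in tallies.items()}


# Invented toy data: Manager A runs team AAA in two separate stretches.
stints = [
    Stint("Manager A", "AAA", date(2000, 4, 1), date(2000, 6, 30)),
    Stint("Manager B", "AAA", date(2000, 7, 1), date(2000, 7, 31)),
    Stint("Manager A", "AAA", date(2000, 8, 1), date(2000, 9, 30)),
    Stint("Manager C", "BBB", date(2000, 4, 1), date(2000, 9, 30)),
]
games = [
    Game(date(2000, 4, 10), "AAA", "BBB", 5, 4, 11),
    Game(date(2000, 5, 5), "AAA", "BBB", 4, 1, 9),   # regulation, ignored
    Game(date(2000, 7, 15), "BBB", "AAA", 3, 2, 10),
    Game(date(2000, 8, 20), "AAA", "BBB", 1, 2, 12),
]

records = extra_inning_records(games, stints)
for mgr, (wins, losses) in sorted(records.items()):
    print(f"{mgr}: {wins}-{losses}")
```

Because each game is matched to a manager by date, a mid-season firing or a return engagement is handled the same way as a full season, with no manual slicing of standings.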

Getting back to the point, the numbers in the book for McNamara’s W-L record are wrong. So far, this is something I can live with. After all, his record still stank.

The error starts snowballing

However, I wasn’t throwing out his extra-inning record just to fill space. I wanted to make a point. I combined that fact with the other main point made in his commentary: His bullpens sucked. It makes sense that a manager with bad bullpens would do badly beyond the ninth inning, as teams rely on their relievers in almost all of those contests.

The good news (for me) is that the updated extra-inning W-L record doesn’t seriously affect that point. His overtime record, while not as heinous as believed, was still pretty bad, and it still stands to reason that a guy with bad bullpens would do poorly in these games.

So far, this is annoying, but only annoying. Certainly nothing worth wasting an entire column on. But we’re not done yet.

You see, I don’t just point out that McNamara had a bad record in the extra-inning games. I note that my (inaccurate) records gave him the worst record of any manager in the book for whom I had extra-inning info. In fact, he didn’t. He was “only” second worst.

Well, heck, that’s not a big deal. His record in extra-inning games still stinks and his bullpens were bad and it still makes sense to argue a connection between them. So he’s second worst. Everything’s fine, right?

Normally, that would be the end of it. Except there’s one little detail that makes things blow up in my face. You know how McNamara had the second-worst record of anyone I could find? Well, the guy with the worst record in extra-inning games is the manager I’d least expect, and least like, to be slotted last. In fact, his coming in last potentially screws up everything I said about him in the book.

It’s Jimy Williams, formerly of the Blue Jays, Red Sox and Astros. He went 62-83 (.428) in those games.

The REAL problem: The error and Jimy Williams

This is the part that stinks.

If anyone else had the worst record in extra-inning games, I could probably just shrug it off, but with Williams it snowballs into a serious issue I have to contend with. This completely screws up my main point on Williams.

You see, the most striking aspect of his teams was their bullpens. Williams had the best bullpens of any manager in baseball history who lasted at least 10 years. And it’s not even close. He got far more out of them than any manager in history.

I use something called the Tendencies Database in the book to look at team attributes of various managers across all eras. (You can find out how it works here. For now I’ll just say it compares various team attributes for different managers in a way that evens out differences between eras. I only apply it to managers for whom I have at least 10 years of info. Now, to get back to my point ...) Among all managers with at least 10 years of bullpen info, Williams blows away the field. Want to look at ERA+? His relievers are top of the heap, compared to their peers and adjusted across all time. Williams is the king of bullpens.

Now, let’s think about this for a second. The whole point of bringing up extra-inning W-L records with McNamara was to point out that it made sense that a manager with exceptionally bad bullpens would have a correspondingly uninspiring record in extra-inning games. In many ways, that was the whole point of the McNamara section.

Now I have a guy with the best bullpens and the worst extra-inning W-L record. This isn’t good. Something has to give here. If there’s nothing to my assumed correlation between extra-inning W-L records and bullpen quality, the entire John McNamara commentary falls apart. If there is, I really need to explain the Jimy Williams info – and nowhere in the book do I do that. One way or another, the commentary on an entire manager is fundamentally flawed.

Is there any way to reconcile this?

Making sense of this mess

Mercifully, I think the circle can be squared. Some adjustments need to be made, but there’s no need to completely rip out any pages in the book.

Let me start with one point: Jimy Williams or no Jimy Williams, I do think there should be some correlation between the quality of a bullpen and a team’s record in extra-inning games. Even back in the 1970s, when complete games were still pretty common, extra-inning contests routinely featured relievers squaring off against each other, not starters. Nowadays, obviously, relievers almost always pitch those emergency innings. There’s more to winning overtime games than the bullpen (there’s offense and luck, among other items), but relievers are a large chunk of it.

Thus I have no problem with the John McNamara commentary. OK, fine – the numbers are off by a tad on his W-L record in extra-length games. At least the point holds true. Besides, some non-Jimy Williams managers with really good bullpens had very good records in extra-inning games (most notably Ron Gardenhire, who I note in passing has an excellent record in extra-inning games, and then spend much of the section on him praising his bullpens).

That just leaves Jimy Williams, sticking out like a sore thumb. He’s the circle that needs squaring.

There is a way to do that. Let’s go back to what I said earlier: He’s king of the bullpens in the Tendencies Database. With the Tendencies Database, there are two main aspects of the bullpen I looked at. One was adjusted ERA, which as I already noted Williams scored first in with his relievers.

The second element focused on innings: Which teams had the largest percentage of their team’s innings eaten up by their relievers? In other words, if ERA looked at quality of bullpen usage, this looked at quantity.

Well, Williams loved using his bullpen. Twice his ‘pens ate up the largest percentage of the staff’s overall innings of any team in the league. Four times they came in second. Please note, Williams only served 11 years as a team’s primary manager – all in leagues with at least 14 teams. Yet his bullpens ranked in the top two a half-dozen times and among the top four squads nine times.

That isn’t just impressive, that’s the most extreme of any manager I could find. In other words, Williams wasn’t just the quality king with his bullpens, but the quantity king as well. It’s the combination that makes him the true bullpen king: No one else even remotely approaches his combination of reliance upon relievers and high quality work from them.

And, paradoxical as it might sound, I think the above two paragraphs help explain his poor W-L record. He relied more on his relievers than any of his peers – even before regulation ended. If anyone was likely to have used up his best arms before the 10th inning, it was Williams. This was a person who began managing more than 20 years ago, when complete games were still relatively common. He was one of the first managers to not only use but also rely heavily upon superlative middle relievers (such as Mark Eichhorn and Duane Ward) to win ballgames.

My hunch is that there’s also a bit of flukish bad luck behind Williams’ extra-inning record. A few more wins, and he’s just forgettably in the pack of W-L records, albeit toward the end of the pack.

The good news for me is that I can make sense of the corrected info. The McNamara main point stands (if not all the numbers). The Williams main point needs to be modified, but I do believe that having better bullpens normally leads to good records in extra-inning games.

That said, I certainly wish I’d noticed all this 18 months ago, before I’d submitted the manuscript. It would be so much better if I explained the problem in the Jimy Williams section of my book, rather than here.

Lastly, and most importantly, I do wish to apologize to anyone who bought it. Mistakes happen, but as noted – this is the sort of one that really aggravates me.

References & Resources
Retrosheet came in handy for this research.


4 Comments
Matt
13 years ago

What I’m confused about is how you didn’t notice that Jimy Williams had the 2nd worst (pre-corrections) W-L record in extra-inning games.
With the trouble you run into with him, I don’t really see it mattering if he has the worst or 2nd worst record, the trouble still remains: Good bullpen, bad extra inning record.

Steve
13 years ago

As a purchaser of the book, no apologies necessary, Chris.  It’s very well written and cogently argued; given how many numbers you crunched, it would be surprising if there were NOT a few errata to correct from time to time.

That said, there is one thing you might want to do to make some of your arguments (like this one about the correlation between skill managing bullpens and extra-inning W/L pctg) more persuasive:  If there’s a relationship, test it statistically (via a crude correlation coefficient, or, better a multiple regression, which might allow you to see if, e.g., some of Jimy’s deficiencies were so strong that they more than offset his strength with ‘pen mgt on this score).

Even if the test supports your hypothesis, that doesn’t mean you’ve proven something, but it’s evidence in that direction, and it means you DON’T have to explain every random deviation from that rel’p.  And if your test doesn’t support your hypothesis, that doesn’t DISprove it (a point many sabermetricians often don’t appreciate) – just means you haven’t found evidence of support.  And, of course, form your hypotheses before you run your tests – like you did with the big “Birnbaum Database” analysis – so it doesn’t look like you’re developing a new theory to suit every little data point you run across.

Chris J.
13 years ago

Matt,

That’s a good point, and I’m wondering how I missed Williams myself.  My best hunch: it got lost in the partial seasons.  When I checked this originally, I had full seasons, and then went back and modified some of the leaders’ partial seasons to see the exact numbers.  That said, Williams should’ve been among the worst regardless, and I missed that.

The whole extra-inning records checking was among the most annoying and troubling (perhaps the most) part of the book, in part because of fixing for partial seasons, and in part because my computer crashed twice when doing it (fun!).  The whole process was sloppier than it should’ve been, so I can’t say I’m too surprised that my biggest error in the book centered on it.

Steve,

Glad you liked it.  While it might help to do a crude correlation coefficient as you note, I have no idea how to do that.  Phil Birnbaum did the heavy math lifting.  And as noted in my response to Matt, I only had full-season extra-inning records for most guys.

That said, I still think I’d have to explain the Williams situation even if a correlation showed that my thought made sense.  Or, to change the verb around, I’d still want to understand why Williams was such an outlier, and the explanation given in the article is ultimately just an attempt by me to understand what’s going on.

Mike Emeigh
13 years ago

Williams’s EI record, by team:

Toronto 27-29
Boston 27-33
Houston 8-21

Part of the issue is that Williams, because of the way he used his pens, didn’t always have his better relievers available for extra innings, so he had to turn to his lesser arms. For example, in Houston the guy who pitched most often in extra innings was Ricky Stone, probably no better than the third- or fourth-best reliever in the bullpen. In Boston, it was John Wasdin.