Recently I had the honor and pleasure of picking the brains of Sean Forman, the man behind the best site on the internet, Baseball-Reference. So here are his thoughts about himself, his website, and getting into bar fights with firefighters.
Tell us all a little about your background: age, family, where you’re from, etc.
I grew up in Iowa. My dad is a high school football coach, so I’ve always been immersed in sports stats and had an interest in fantasy baseball early on. I’m 35 and have been married to a wonderful woman for eight years. We have one son, Carl, who is almost two. He is not named after Yaz or Carl Gauss.
What did you do before becoming full-time b-ref guy?
I was a math and computer science (mainly applied math type courses) professor at Saint Joseph’s University for six years. Prior to that I was in school for 21 years.
Apparently you like math. Why?
I’m not sure how to answer that one. I suppose it has something to do with creating order and classifications for what is happening in the world. It has always been something that has come pretty easily for me.
How did you first become a baseball fan?
I have always enjoyed the game, and I watched and played it from a very early age. I suppose it was the sport where my general lack of speed and strength would matter the least. Back when I was growing up I would watch football and baseball. Basketball was never really on and hockey was non-existent in the Midwest then.
Did you play at all as a kid? Were you any good?
I grew up in a small town in Iowa with a graduating class of 33, so you have to put everything in that perspective. Our high school varsity baseball team probably was no better than a good Miami-area junior high program, and no one has ever been drafted from our county that I’m aware of. That said, I was MVP of the team my senior year and batted over .400. I played catcher and third base. I also played football, basketball and golf. I played golf at the D-III level at Grinnell.
Growing up, Boggs and Henderson were my favorites. I really focused on OBP myself. I’m a Red Sox fan due to park effects. Every Sunday, I’d look over the batting and pitching leaders and Lynn, Rice, Dewey, Boggs, etc. were always at the top. I just couldn’t figure out how they had such a potent offense, but could never get their pitching straightened out.
What was your introduction to sabermetrics? Why did it appeal to you?
I actually came at it through fantasy baseball. I was a very dedicated fantasy baseball player, and was in three or four very serious leagues at my peak. One league was run by Rick Wilton (now at Baseball HQ) and we actually had an entrance interview to get in. We drafted 40 players down to the minors, so I started analyzing the minors closely, gathering data, etc. That led me to post my ratings on rec.sport.baseball. I spent a lot of time there. This was around the genesis of Baseball Prospectus, which was started by rsbb alumni. I’m actually one of just a few hundred people with every edition of the Prospectus. I then started picking up the Abstracts on eBay, though there is probably a lot in there I haven’t read.
That led to a writing stint with the Big Bad Baseball Annual and some other things like BaseballThinkFactory.org. Jim Furtado and I co-founded that site in 2000, I think. I handled a lot of the technical and design issues, though Jim did a lot of that as well.
Why did you and Jim Furtado break off your partnership?
It was completely cordial. I was running B-R at the same time and doing a full-time job, so I decided something had to go. I essentially walked away to focus more on B-R. We still talk on the phone quite a bit.
What inspired you to start b-ref? What made you think you could pull it off?
I was really just scratching my own itch. TotalBaseball had a site, but it was not very usable. I was a big reader of web design back then. The Lahman Database provided all of the data one would need, and I knew the best way to do it would be to make static pages so they would be fast and indexed by search engines. The basic data in 2000 came from the Lahman Database, which has its genesis in work done by Pete Palmer. Retrosheet obviously provides a lot of the more detailed data we’ve added. Some data has been donated, some I’ve entered and more recently, we’ve invested in the site by buying data we’ve wanted to add. Daily updates are performed using data we buy from a third party.
Once I found a server company that offered 300MB of space for $20/month, I figured I would give it a shot. When I started it, we were right at the peak of the internet bubble. In January 2000, there were a lot of hard-to-use websites out there. I figured a low key highly usable website would work best.
It launched as part of BBBA in Feb. 2000 and then launched as Baseball-Reference.com on April 1, 2000. Traffic was respectable. In April of 2000, we had 152,835 page views. That is pretty close to what we get from 12-4pm on a normal day now. We’ve grown pretty gradually over time, but we now have 500,000 page views a day on a normal day.
Why did you choose the name baseball-reference?
Well, it was a descriptive name and the other ones were taken. In retrospect, something a lot more brandable would have been a better choice, but we are kind of stuck with it now.
There’s currently b-ref, Justin Kubatko’s basketball-ref, and Doug Drinen’s profootball-ref. Plus there’s a sports reference network linking all you guys together. What exactly is the current relationship between these sites? Is sports-ref a parent company for all of you or are you independent or what?
We are pretty close to forming a single company under the sports reference name. Sorting out all of the issues is taking a lot longer than it seems like it should.
What’s the top complaint you get?
Add uniform numbers or player pictures. I suspect most of the people with complaints just go somewhere else.
When did you begin working baseball-reference.com as a full-time job?
I started in May of 2006. My family has been supportive. My wife has been wonderful and I’ve been blessed that she is as patient as she is. I think my parents are pretty much dumbfounded that I can make a living doing this, but they are happy for me.
Not starting the site three years earlier. I shudder to think how much the site might have sold for in 1999 at the peak of the bubble.
What’s your personal favorite feature/aspect of b-ref?
I’m very happy with how the play index features have turned out. I really like all of the little red text tooltips where users can get additional info.
What changes have you been able to make to the website since making it your job?
Basically the play-by-play data was only doable because I was working full-time. That was essentially the first six months of time after starting full-time and then the in-season updates and related materials have been the second six months.
Has there been any increase in traffic this year compared to last season?
There has been a good increase. We are up about 25% or so.
Is there any part of the website or feature you provide you think/thought would get more traffic than it currently does?
The previews haven’t taken off as well as I thought they might. They are probably just too densely packed with info or with the wrong info for a general audience.
How many pages currently are sponsored? Do you sponsor any pages?
I don’t have the total number, but it is substantial. There are a few writers who do sponsor pages on the site. A couple of pages are sponsored in honor of Doug Pappas. My wife always liked Ugueth Urbina because of his initials, so I sponsored that page for her, but that didn’t turn out too well.
How many PI subscriptions do you currently have out?
It has been slow going, but we have passed 600 so far.
Any changes you plan on making in the near future?
I think we’ll have a major amount of minor league data out in a few weeks. We’ll be upgrading the football site significantly as well, and post-season data will be added to the Play Index, so you’ll be able to search for the best post-season starts or the post-season walk-off home runs, or any other manner of post-season info. I’m also hoping to do a major upgrade to the team and player pages before the season ends.
How do you choose what stats to put on your site? How do you walk the line between popular accessibility and your sabermetric background?
I probably err on the side of accessibility. The audience for advanced sabermetrics is influential and growing, but it is definitely limited. I try to add the stats that are going to give people the most information to win their arguments.
Is there anything you’d really like to do right now but can’t?
I’d love to have pictures on the site, more biographical info, and more interactive user features, but I’m limited as to what I can do with the site by time, money and copyright laws.
Your site updates every morning. Do you have to do any manual clicking to make that happen or is it somehow automated?
On most days, it is completely automated. Occasionally, there is a crazy play that gums up the works and requires some manipulation or I break something unintentionally and don’t realize it until the next morning. I’m kind of lazy and can’t stand doing repetitive work, so I try to take as much of that out of the maintenance of the site as possible. It takes a couple hours for everything to get updated.
Do you do any other sabermetric type stuff now, or is it just the website?
I did a study last year on catchers and missed pitches, but typically, I’m just doing one-off things like that.
It’s a great site and you do a fantastic job. Do you ever get sick of people telling you it’s a great site and you do a fantastic job?
I certainly appreciate all of the feedback that I get. There are so many great stats sites out there like Retrosheet with some of their performance and leaders features, The Baseball Cube with their new historical minor league data, and Baseball-Almanac.com with their lists and uniform data, and then the Big Guys like MLB.com and ESPN. I’m just trying mostly to keep up with everyone else.
Early August 2008: how should b-ref look differently than it does now?
I think there will be more integration with the play-by-play database and more customizable features for users. More data as well.
If Bud Selig calls up tomorrow and says he wants to buy the site, name your price – what do you say?
More seriously, it’s certainly possible that we could sell at some point if the right deal came along. We aren’t looking to sell, and I’m very excited about what we are going to add over the next year. We’re up to two employees (myself and Justin), and I think the sites are only going to get better as we spend more and more time on them. So I’m not in any great hurry to cash out, and besides, I think we can build a solid business out of this.
By the way, if someone out there wants to make some money doing LAMP admin work a few hours a month (with an emphasis on the A and M), I’d love to talk to you.
Alright, now for the stupid stuff: Favorite TV show – all-time and current?
The Office is the only thing I have to see each week, and I enjoy 30 Rock as well. As a young kid I loved The A Team and Dukes of Hazzard. Oh and Hogan’s Heroes as well.
Favorite band & song?
I’m a huge Greg Brown fan. He is an Iowan as well, and I probably own all of his CDs and have seen him in concert 7 or 8 times. His album In the Hills of California is probably my favorite, and I particularly enjoy InaBelle Sale, which is just viciously funny. I’m sure I’ll be killed for this on primer with the music aficionados over there, but I enjoy female pop acts like Aimee Mann, Fiona Apple, Jill Sobule, P.J. Harvey, Cowboy Junkies.
Favorite dwarf from the Disney movie. Why that dwarf?
Dustin Pedroia. He has the best OBP of any of them.
Favorite power tool?
I haven’t had time to do any woodworking lately, but probably my table saw.
Last good book you read?
I don’t read fiction generally. I read The Bronx is Burning about a month ago.
What do you like on your pizza?
Half Fresh Tomatoes and Artichokes and Half Pepperoni.
Lawn care avoidance. I play a little squash when I get a chance.
If you were a Simpsons character, which one would you be? Why that one?
Prof. Frink. You’ve always got to carry the one.
You and Jim Furtado get into a bar fight – who wins?
Furtado in a walk. He’s trained to carry people out of burning buildings. I sit on my butt all day.