Questions. They’re important. In fact, they’re fundamental to sabermetrics and baseball research. They drive analysis, create discussion, and make for entertaining writing.
I first recall seeing the subject come up in a piece by Dave Cameron last month, in which he claimed that “a good stat is simply the answer to a question that is commonly asked.”
A month later, I attended the SABR Analytics conference in Phoenix, and there, Bill James spoke about the importance of questions for sabermetrics, saying that “what’s critical is to be able to find a question that has an answer, and you don’t know what that answer is.” Following the conference, I picked up on what I’m sure has always been happening: writers laying out and sometimes answering interesting questions, such as Ben Lindbergh’s 15 Questions I’ve Been Asking Myself Since the SABR Conference and a recent article by Jeff Sullivan in which he led with the assertion that the “core purpose of FanGraphs” is to answer questions.
I love this. I studied philosophy in college. All I did was ask questions. Do we have free will? What is knowledge? Is time continuous? How should we make judgments about moral responsibility? Sure, it’s fun to argue about the answers to these questions, but the true genius philosophers are those who asked the questions in the first place. It’s impossible to make progress or have rational discussions if you don’t even know what you’re arguing about.
OK, back to baseball. With this idea in mind, particularly Cameron’s claim that a good stat should answer a common question, I wondered what questions the most common traditional statistics or metrics are trying to answer. We’re all quick to dismiss pitcher wins and RBI as flawed and inferior to other stats, but before we can dismiss them, we need to know what question they are trying to answer. Once we have a question, we can figure out whether there is a better answer out there.
But finding a question for the answer that is the statistic is not as easy as it seems. As I see it, there are varying “levels” of questions that we can ask. First, there’s the “surface level” question. This is the question that the statistic literally answers, including all contributing factors and exceptions. The surface level question for fWAR, for example, might be, “What is the sum of the player’s batting, base-running, fielding, and positional runs above what a replacement level player would provide, adjusted to a win scale?” (or something of the sort).
Then, we have the “digging deeper” level question. This is still a question that the statistic actually answers, but it gets closer to the idea behind the stat. For fWAR, this might be something like “How many wins did the player contribute to his team above what a replacement level player would contribute?” fWAR still answers this question, but it more appropriately gets at the reason behind the number.
Finally, we have the fundamental question. This is really the core of the statistic: the answer that we all want to reach with the stat, even if the stat itself falls short of answering it. For WAR, the question might be, “How much context-neutral value did the player contribute to his team?” or, to be even broader, “How valuable was the player?”
That fundamental question is, for obvious reasons, the most important one of all. It’s what we really want to know. It’s the reason we care about the statistic, even if it’s different from the question that the stat actually answers.
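To make the surface-level fWAR question concrete, here is a minimal Python sketch of the arithmetic as I worded it above: add up the run components and convert the total to wins. Every number is made up, the function name is hypothetical, and the figure of 10 runs per win is only a rule-of-thumb assumption. The published fWAR calculation includes additional adjustments that are left out here.

```python
# Illustrative only: the inputs are invented, and RUNS_PER_WIN is a
# rule-of-thumb assumption rather than a season-specific value.
RUNS_PER_WIN = 10.0


def wins_above_replacement(batting, base_running, fielding, positional,
                           runs_per_win=RUNS_PER_WIN):
    """Sum the run components named in the surface-level question
    and convert the total from runs to wins."""
    total_runs = batting + base_running + fielding + positional
    return total_runs / runs_per_win


# Hypothetical player: strong bat, neutral on the bases, good glove.
war = wins_above_replacement(batting=30.0, base_running=0.0,
                             fielding=5.0, positional=5.0)
print(f"{war:.1f} wins above replacement")
```

The code answers the surface-level question exactly and nothing more. Whether that number also answers “How valuable was the player?” is a separate matter that no amount of arithmetic settles.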
With that in mind, let’s go through some of the most commonly used traditional stats and see if we can determine the various levels of questions that they answer (or try to answer).
Batting average
On the surface: “How often was the player credited with a hit for every plate appearance that didn’t result in a walk, hit-by-pitch, or sacrifice?”
Pretty straightforward. This question essentially reiterates the equation for batting average, but it’s good to be able to see exactly what it measures.
Digging deeper: “How often did the player get a hit per opportunity to get a hit?”
I struggled with this question, because I wanted to find the right words to express the reason that batting average ignores walks and sacrifices. It seems like a somewhat arbitrary choice, but the idea is to not penalize a player for something that “helps the team.” If plate appearances were in the denominator of batting average instead of at-bats, players would be penalized for walking, which we probably don’t want.
Now, I realize that every plate appearance is an opportunity for a hit. So really, batting average doesn’t literally answer this second question. However, if you think of opportunity as instances in which a player could either get a hit or make an unproductive out and you squint your eyes real tight, you can almost see how the question fits. That being said, perhaps a better way to answer the question would just be to bite the bullet on penalizing walks and use H/PA instead.
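The difference between the two denominators is easy to see with a small Python sketch. The two hitters below are invented, the class and function names are hypothetical, and plate appearances are simplified to at-bats plus walks, hit-by-pitches, and sacrifices (rarer events such as catcher’s interference are ignored).

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical season totals for one hitter (made-up numbers)."""
    hits: int
    at_bats: int
    walks: int
    hit_by_pitch: int
    sacrifices: int  # sacrifice bunts plus sacrifice flies

    @property
    def plate_appearances(self) -> int:
        # Simplified: ignores rare events such as catcher's interference.
        return (self.at_bats + self.walks
                + self.hit_by_pitch + self.sacrifices)


def batting_average(line: BattingLine) -> float:
    """Hits per at-bat: walks, HBP, and sacrifices are not counted."""
    return line.hits / line.at_bats


def hits_per_pa(line: BattingLine) -> float:
    """Hits per plate appearance: every trip to the plate counts."""
    return line.hits / line.plate_appearances


patient = BattingLine(hits=150, at_bats=500, walks=90,
                      hit_by_pitch=5, sacrifices=5)
free_swinger = BattingLine(hits=150, at_bats=500, walks=20,
                           hit_by_pitch=0, sacrifices=0)

for name, line in [("patient", patient), ("free swinger", free_swinger)]:
    print(f"{name}: AVG {batting_average(line):.3f}, "
          f"H/PA {hits_per_pa(line):.3f}")
```

Both hitters bat .300, but H/PA drops the patient hitter to .250 while the free swinger sits at about .288. That is exactly the penalty for walking that batting average was built to avoid.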
The fundamental question: “How often did the hitter make good contact?”
You might think this is a reach. After all, batting average is simply a descriptive statistic on the face of it. It tells us the rate of hits, and that’s it. Yet the way that we use batting average is to describe players who did well at putting themselves in a position for a hit. It tries to measure how well a player makes good contact and gets on base via the batted ball.
I’m not sure that we have a good answer to the above question at the moment. We could use contact percentage, but that doesn’t take into account batted ball types. We could use line drive percentage, but that has classification flaws, and it ignores the many ground balls and fly balls that also turn into hits.
Runs batted in
On the surface: “How many times did a runner score during the plays that resulted from the player’s plate appearance (excluding double plays)?”
Again, not much to add here. This is simply how we measure RBIs.
Digging deeper: “How many runs scored immediately because of the player’s plate appearances?”
This is similar to the first question, but is worded so that the runs are attributed to the player rather than the play.
The fundamental question: “How many runs resulted from the player’s offensive performance?”
And here we have the real question that RBI is trying to answer. The goal of RBI, as I view it, isn’t just to measure runs that came immediately after the play. It’s to measure how many runs the player produced, how many runs the player was responsible for. This is a fantastic question, and one that absolutely should be answered. This question may have already been answered by stats like runs created (RC) or RE24, though it’s up to you to decide if those metrics really answer the above question.
On-base plus slugging (OPS)
On the surface: “What is the sum of the player’s on-base percentage and slugging percentage?”
Digging deeper: “How often did the player get on base and how much power did he have?”
Yes, this question essentially just combines the questions for on-base percentage and slugging percentage. OPS tells us both about the player’s ability to get on base and ability to hit for power, so it’s natural that the question would ask about those two things.
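As a quick illustration of how literally OPS answers its surface-level question, here is a small Python sketch using the standard definitions of on-base percentage and slugging percentage. The season line is invented and the function names are hypothetical.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count for OBP."""
    times_on_base = hits + walks + hit_by_pitch
    return times_on_base / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_plus_slugging(obp, slg):
    """OPS is literally the sum of the two rates."""
    return obp + slg


# Hypothetical season line (made-up numbers): 150 hits in 500 at-bats.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
slg = slugging_percentage(singles=100, doubles=30, triples=2,
                          home_runs=18, at_bats=500)
ops = on_base_plus_slugging(obp, slg)

print(f"OBP {obp:.3f} + SLG {slg:.3f} = OPS {ops:.3f}")
```

Note that the two rates have different denominators, so the sum is not itself a rate of anything. It is just two answers stacked into one number.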
The fundamental question: “How well did the player perform offensively on a rate basis?”
Now that is a question that is worth answering. Another way to put it might be “How good a hitter is this player?” Luckily, this is a question that has been asked and answered many times in sabermetrics. In fact, the question has been answered so well that OPS really has no use anymore, other than that it combines two other common statistics. In that way, maybe the first surface-level question is the only real use of OPS. It tells us the answer to that perfectly, but if you want more, which you probably do, there are better places to look.
This wasn’t groundbreaking analysis. In fact, it wasn’t analysis at all—it was reflection. But reflection can be valuable. Reflection leads to interesting questions, such as the questions that I asked above, and interesting questions lead to interesting answers. It’s important to reflect on statistics that we use often, or not so often, to determine what we’re really trying to answer. Because once we understand the question, we can adjust the stat or find a new one.