In March, I published a Hardball Times article in which I showed the increasing importance of opportunity and income in making it to The Show. However, I did not have data on the actual race of players, so some of my claims relied on guesswork and inference. Fortunately, Mark Armour has been gracious enough to share a data set that he and Dan Levitt created for their fantastic study documenting the decline of black players, enabling me to look more closely at the link between opportunity and race. These data strengthen the case I made in March about opportunity while also calling into question several other theories about the source of racial trends in baseball.
In March, I produced this table showing the relative share of baseball players’ WAR to births in each region by decade:
|WAR/births by decade, region|
In this table, the South Region in the 1940s had a WAR/birth ratio of 1.00. This means that players born in this region between 1940 and 1949 produced a share of all U.S.-born players’ WAR equal to the region’s share of U.S. births during the same decade. Values above 1.00 indicate that the share of WAR from players born in this region was greater than the share of births, while the opposite is true for values below 1.00.
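A minimal sketch of that ratio in Python may make the definition concrete. The function name and the sample figures below are illustrative assumptions, not numbers from the study:

```python
def war_birth_ratio(region_war, total_war, region_births, total_births):
    """Region's share of U.S.-born WAR divided by its share of U.S. births."""
    war_share = region_war / total_war
    birth_share = region_births / total_births
    return war_share / birth_share


# Hypothetical inputs: a region with 30% of the WAR and 30% of the births.
ratio = war_birth_ratio(
    region_war=300.0,
    total_war=1000.0,
    region_births=6_000_000,
    total_births=20_000_000,
)
print(round(ratio, 2))  # a value of 1.0 means WAR share equals birth share
```

A region that produced 30 percent of the WAR from only 20 percent of the births would instead come out at 1.5.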
In this table, it is clear that a greater share of players came from the South over time. Since there are more African-Americans in the South than in any other region, we might have expected there to be more African-American players over time, but instead the opposite is true.
I pointed out that this may not be so surprising when you consider the following map produced by David Leonhardt at The New York Times, based on research by Raj Chetty, Nathaniel Hendren, Emmanuel Saez, and Nicholas Turner. This map shows economic mobility by region within the United States, where red areas represent the least upward mobility:
I further studied the data on WAR by county and determined which factors corresponded to producing more WAR per birth over time. These were the three main takeaways from this study:
- Higher income has become more important over time
- Warmer weather has become more important over time
- Warmer weather is more important in high-income counties (and vice versa)
Now that I have specific data on players’ races, I can look at these results a little more thoroughly.
The primary source of data in this series of articles is the Armour and Levitt proprietary data set that classifies the race of all major league players who played from 1947 through 2012 as white, African-American, Latino or Asian. Since I used all data through 2013 and all players born from 1940 to 1989 in my study, I needed to classify a few 2013 rookies. This is more difficult than you would expect, which has made me even more impressed with Armour and Levitt’s detailed work.
Of course, this provided only the numerator, and I needed the denominator, too. So I used state data on births by race (or people under the age of one) from a couple of different sources. The National Cancer Institute’s “Surveillance, Epidemiology, and End Results Program” has figures only since 1969, which it gathers from census data, as well as supplementary data where needed.
The National Center for Health Statistics provides data on births by state and race up until 1968. Unfortunately, the NCHS data use a somewhat different definition of black than the SEER data, which would have led to a bias. So I assumed that the share of black births in 1968 and 1969 was the same in each state and adjusted the 1940-68 data down (or up in some cases) by that constant. While this is not a perfect solution, the share of births in a given state wouldn’t change so much that it would have much of an effect. The changes in the share of black players are so large by decade that a slight error in the denominator would not have masked those changes.
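The splice between the two sources can be sketched as follows. The text does not say whether the constant was applied multiplicatively or additively, so this sketch assumes a per-state multiplicative factor; the function names and the sample shares are hypothetical:

```python
def splice_factor(nchs_share_1968, seer_share_1969):
    """Scaling constant that makes the 1968 NCHS share match the 1969 SEER share.

    Assumes the true share of black births did not change between 1968 and 1969,
    so any difference reflects the two sources' definitions.
    """
    return seer_share_1969 / nchs_share_1968


def adjust_shares(nchs_shares_by_year, factor):
    """Apply the same constant to every pre-1969 year for one state."""
    return {year: share * factor for year, share in nchs_shares_by_year.items()}


# Hypothetical state: NCHS reports a slightly higher share than SEER.
nchs = {1966: 0.210, 1967: 0.212, 1968: 0.215}
factor = splice_factor(nchs[1968], seer_share_1969=0.205)
adjusted = adjust_shares(nchs, factor)

for year in sorted(adjusted):
    print(year, round(adjusted[year], 4))
```

With a factor below 1 the earlier years are adjusted down; a state where SEER reports the higher share would get a factor above 1 and be adjusted up.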
One other limitation of these data is that I was unable to look at births by race and by county, so there is still some guesswork required at that level. However, the data convinced me that opportunity and income (especially the importance of weather in creating opportunity for wealthier families) are the dominant factor in the decline of African-Americans in baseball over the last few decades.
Variable Studied: Total Black Players, Population-based Expected Total Black Players
In March, the variable that I used was WAR per birth, with individual player WAR capped at 20. Some people were concerned that this might not capture the true trends, but fortunately, using the number of black players relative to all American players had the same results as looking at the share of black players’ WAR relative to all American players’ WAR.
However, this study will use a simpler variable, which is the number of African-American players who were born in a given state or region relative to the expected number of black players born in that state or region. The “Expected blacks,” the number of black players we would expect to be born, is simply the number of players born in a state or region in a given decade multiplied by the share of black children born among all children born in that region at the same time.
For example, among the 218 players born in the South Atlantic Division between 1980 and 1989, we would have expected 59 players to be black because 27 percent of people born in the South Atlantic Division in 1980-89 were black. However, there were only 29 black players born in the South Atlantic Division during those years. A disproportionately small share of players from that region in that time period was black.
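The arithmetic behind that example, as a short sketch using the figures quoted above (the function name is an illustrative assumption):

```python
def expected_black_players(total_players, black_birth_share):
    """Players from a region times the share of that region's births that were black."""
    return total_players * black_birth_share


# South Atlantic Division, players born 1980-89 (figures from the text).
expected = expected_black_players(total_players=218, black_birth_share=0.27)
actual = 29

print(round(expected))              # 59
print(round(actual / expected, 2))  # 0.49
```

A ratio of actual to expected below 1.00 means fewer black players than the population share would predict, in the same spirit as the WAR/birth ratio.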
There are nine Census Divisions, and the following table shows the expected number of black players in a Census Division each decade (based on the total number of players from that Division and the share of black children in that Division). The “E(B) 40s” column, for example, shows that we would have expected 39 players from the South Atlantic Division to be black among all players born in the 1940s, but the “B 40s” shows that only 25 were.
|Total Black Players Relative to Share of Regional Population|
|Census Division (States)||E(B) 40s||B 40s||E(B) 50s||B 50s||E(B) 60s||B 60s||E(B) 70s||B 70s||E(B) 80s||B 80s|
|New England (ME, NH, VT, MA,RI, CT)||1||1||1||1||2||3||2||2||2||1|
|Middle Atlantic (PA, NJ, NY)||8||19||14||12||24||22||22||12||11||4|
|East North Central (WI, MI, OH, IL, IN)||10||15||18||24||25||30||27||18||15||5|
|West North Central (ND, MN, SD, IA, NE, KS, MO)||2||7||3||8||4||7||5||8||4||3|
|South Atlantic (DE, MD, DC, WV, VA, NC, SC, GA, FL)||39||25||43||38||58||75||62||47||59||29|
|East South Central (KY, TN, AL, MS)||18||26||22||34||21||24||27||28||18||15|
|West South Central (TX, OK, AR, LA)||20||39||27||41||32||33||30||19||27||12|
|Mountain (ID, MT, WY, CO, UT, NV, AZ, NM)||1||1||3||1||4||2||4||2|
|Pacific (AK, HI, WA, OR, CA)||7||31||21||69||26||64||28||45||20||25|
The first thing that should jump out when you look at this table is that the decline in black players has occurred everywhere. Overall, based on the share of the population born in each Census Division in the 1960s that made the major leagues, we would have expected 193 black players, but instead 262 black players born in the 1960s made The Show. In each Census Division, the share of black players relative to the population size was roughly equal to what we would have expected, except in the South Atlantic Division and the Pacific Division, where there were considerably more black players than we would expect based on the population.
By the 1980s, the opposite was true. We would have expected 157 black players based on the share of population in each Census Division, but there were only 94. Furthermore, there were fewer black players than we would expect in every region except the Pacific Division, but even in that Division, the share of black players was a far cry from where it had been in the 1960s. In the South Atlantic, there were fewer than half as many black players as you would expect based on the population.
Although the change basically occurred everywhere, the biggest impacts were in the Pacific and South Atlantic. However, in percentage terms, the decline in the East North Central and West South Central Census Divisions was more pronounced.
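One way to read that comparison is to set the raw change in black players between the 1960s and 1980s birth cohorts beside the percentage change. The sketch below uses the “B 60s” and “B 80s” columns from the table; the Mountain Division is left out because its row is incomplete, and comparing these two decades is an assumption about which comparison is intended:

```python
# Black players by Census Division, from the "B 60s" and "B 80s" columns.
black_60s = {
    "New England": 3, "Middle Atlantic": 22, "East North Central": 30,
    "West North Central": 7, "South Atlantic": 75, "East South Central": 24,
    "West South Central": 33, "Pacific": 64,
}
black_80s = {
    "New England": 1, "Middle Atlantic": 4, "East North Central": 5,
    "West North Central": 3, "South Atlantic": 29, "East South Central": 15,
    "West South Central": 12, "Pacific": 25,
}

# Raw drop in players and the same drop as a percentage of the 1960s count.
abs_drop = {d: black_60s[d] - black_80s[d] for d in black_60s}
pct_drop = {d: 100 * abs_drop[d] / black_60s[d] for d in black_60s}

# List divisions from largest to smallest percentage decline.
for division in sorted(pct_drop, key=pct_drop.get, reverse=True):
    print(f"{division:20s} {abs_drop[division]:3d} {pct_drop[division]:5.1f}%")
```

On these counts the South Atlantic (46) and Pacific (39) lose the most players, while the East North Central has the steepest percentage decline.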
Although this table of Census Divisions provides a pretty good picture of the nationwide decline in black players over the last couple of decades, looking at things at the state and county level is illuminating as well. Doing so will further strengthen the case that weather and income have become increasingly important in developing baseball players.
I created a similar table to the Census Divisions table but broken down by individual state, which you can view here. The results are difficult to analyze for some smaller states with few players or few black children born, but in states with more players/larger shares of black children born, you can see the effect take hold.
The decline in black players is largely composed of two coinciding trends. One is that some states with disproportionate shares of black children born saw declining shares of players overall. The other is that some states produced more players over time, but a large share of those players were white.
If the share of U.S.-born major leaguers who were black had been the same in the 1980s as it was in the 1940s-1970s, there would have been 106 more black players born in the 1980s who played in the majors. The biggest gap between expected and actual player turnout was in California, which produced 24 fewer players in the 1980s than its production from the ’40s-’70s would have predicted. In addition, six other states (Alabama, Florida, Ohio, Louisiana, Georgia and Texas) each produced at least five fewer players in the 1980s than would have been expected.
Some of these states have interesting breakouts at the county level, as well. For instance, California’s share of players has remained roughly constant by decade, but more players have come from Southern California over time and fewer from the Bay Area, where the share of black children born was relatively higher.
Orange County produced only two players born in the 1940s but has had 68 players born in the 1970s and 1980s, and it is one of the wealthiest and whitest areas in the country. Only one of the 103 players born in Orange County between 1940 and 1989 was black.
Historically, many California-born players have come from the Bay Area and Los Angeles County, where there are large shares of black residents, but these have given way to wealthier counties and (relative to the Bay Area) mostly warmer counties. The steadily rising share of players from Orange County over time is symbolic of the trend within California and in the nation overall, as many of the most talented and wealthiest youth players have had the opportunity to hone their skills year-round.
We see the same thing in Georgia, whose contribution to major league baseball has nearly quadrupled from just one percent of all players born in the 1940s to 3.8 percent of those born in the 1980s. The share of black players has fallen over the last couple of decades, however. In the 1960s, 13 of 32 Georgia-born players were black, while in the 1980s, only four of 38 were. This was because the growth came from wealthier counties, where disproportionate shares of residents were white.
The research from Chetty et al. on economic mobility showed that the heavily sprawled Atlanta region had all the ingredients of declining opportunity for the poor. This seems to have manifested in baseball, as well. Forsyth, Walton, Douglas, Fulton and Gwinnett Counties have produced more players over time, contributing to the growing share of Georgia players in the league. However, this opportunity has not expanded to other areas. Clayton County is 66 percent black and is the fifth largest in population among all 39 counties in the Metro Atlanta region, but it has not produced any black players.
It’s not just those two states. Hamilton County in Ohio, the county in which Cincinnati is located, produced only one player who was born in the 1980s after it had produced 15 in the four decades prior. The same is true in New York, where Queens and the Bronx have seen sharp drops in the number of black players produced despite being two of the counties with the highest share of black residents.
Across geographic regions in the country, these patterns are consistent with the broader findings that weather and income have become more important over time for the development of baseball players. However, many people probably remain skeptical and believe that black interest in baseball has simply waned over time because there are fewer black superstars for younger children to root for. In tomorrow’s article, I will exploit some of the excellent geographical data to show why I do not believe this theory holds.
References & Resources
- Mark Armour and Dan Levitt dataset on player race used in their article
- J.C. Bradbury’s study on the decline of Black players, including evidence that the NBA and NFL have not seen coinciding increases in Black players
- NCI – SEER (National Cancer Institute – Surveillance, Epidemiology, and End Results Program) for 1969+ data on race
- National Center for Health Statistics (1940-1968)
- My article on income and weather
- Map from New York Times article
- Small Area Income and Poverty Estimates for county-level income data from 2011
- National Oceanic and Atmospheric Administration for average temperatures by state
- Team loyalty lists come from this map
- The article on ages of fans is here