MLB Umps Not so Color Blind?
The analysis of racial bias in MLB umpires' ball-strike calls by UT-Austin professor Daniel Hamermesh and coauthors (Parsons, Sulaeman, Yates) has generated the expected buzz in the popular press and on baseball blogs (Google Search results; Time article). See Hamermesh’s post on Freakonomics and the link to the AER article.
As one would expect given Hamermesh’s reputation and standing, this is not a lightweight effort. It’s creative and uses data that many sports researchers would covet. Valuable research is as much about the questions and attempts to answer them as it is about the answers. This piece has generated, and will continue to generate, follow-on work.
The authors make some substantial, albeit qualified, claims:
The results allow us to think about the deeper question of measuring discrimination generally. If, as we show here, the match to the race/ethnicity of their evaluator affects evaluations of workers, then the measured productivity of the worker will depend on the nature of that match. This difficulty has serious implications for measuring discrimination and is another manifestation of the difficulty of identifying discrimination pointed out by Stephen G. Donald and Daniel S. Hamermesh (2006).
So what have the authors really found? Umps favor, ever so slightly, pitchers of the same race. The effect holds up to adjustments for pitch count, inning, score, and game attendance. It diminishes when technology for monitoring umps is in use. The same effects don’t arise for ump-batter matches. The effect is very slight on a percentage basis, but multiplied over thousands of pitches per season, it adds up to about 5% additional wins for a white pitcher when favored by a white ump, and an ERA about 0.13 lower, relative to baseline values over the 2004-2008 data sample in games without the monitoring technology.
OK. And the big shoe dropping here would be …??? Let’s suppose these figures are right on the mark. The 0.13 improvement in ERA translates into a salary increase of a few thousand dollars when plugged into the typical equations for ERA's effect on wins and wins’ effect on revenue. That’s an impact of a few thousand dollars in a league with average salaries over $3 million.
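For anyone who wants to see the shape of that back-of-the-envelope chain, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name, the parameter names, and every number in the usage lines except the 0.13 ERA figure are hypothetical placeholders, not values taken from the paper or from any particular wins-revenue study.

```python
def salary_effect(era_change, innings_pitched, runs_per_win,
                  dollars_per_win, share_of_innings_affected=1.0):
    """Rough dollar value of an ERA change over one pitcher-season.

    All inputs are supplied by the caller; nothing here is estimated.
    """
    # ERA is earned runs per nine innings, so convert the ERA change
    # into runs over the innings where the effect is assumed to apply.
    runs_saved = -era_change * (innings_pitched / 9.0) * share_of_innings_affected
    # Convert runs into wins, then wins into dollars.
    wins_added = runs_saved / runs_per_win
    return wins_added * dollars_per_win


# Hypothetical inputs: only the -0.13 ERA change comes from the text above.
example = salary_effect(
    era_change=-0.13,                # lower ERA is better, hence negative
    innings_pitched=180,             # placeholder workload
    runs_per_win=10,                 # placeholder conversion rate
    dollars_per_win=1_000_000,       # placeholder value of a win
    share_of_innings_affected=0.05,  # placeholder share of innings affected
)
print(f"Illustrative salary effect: ${example:,.0f}")
```

The point of the sketch is the structure, not the output: the ERA change gets scaled down at every step (innings affected, runs per win, dollars per win), which is why a statistically detectable effect can end up economically small.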
Across a variety of disciplines, many researchers aim to show that subtle biases continue to exist. My reaction tends to be, yes, and the world is still round. It’s not merely a glass half-full versus half-empty difference. It’s a glass filled nearly to the brim, but upon very, very close inspection, we’ve determined that it’s fractionally below the rim, too close to see with the naked eye or even with typical magnification; only with a really high-powered lens have we discovered a measurable distance between the rim and the water level. As economists, we frequently assert and explain how zero is rarely the optimal amount of anything, even undesirable things like pollution. If the standard against which bias is to be measured is zero, then I suppose we haven’t reached it, and this article provides evidence, showing us that, as Jim Buchanan would say, water still runs downhill.
Instead of just accepting these minute results as they are, one can raise questions, and the authors have responded to several. However, I’m interested in an expanded approach: a broader model of umpire decision making. The controls the authors employ are worthwhile, but when such a small effect is found, omitting other controls that may have some not-so-obvious link to race may matter. Maybe reputation (Cy Young winners) or point-of-delivery matters for umps and isn’t fully randomized across race.
Perhaps of more interest: is skin color the only physical characteristic that matters to umps or other people? Rather than a model digging for bias, maybe one should be looking at a model of affinity. What draws people closer? What are the commonalities? When I’m people watching at Disney World, I’m struck by how frequently tall people find tall partners, dark-haired people find dark-haired partners, and so on. Do umps “like” pitchers with characteristics similar to their own? Do they favor tall pitchers? What about heavy-set pitchers? Pitchers who work fast (in terms of time between pitches), who are neatly groomed, …?
What about the pitcher-hitter match? The authors briefly address this, and refer to tests where they took account of hitter-ump matches, but this is a very unsatisfying response. If we are going to call favoritism across pitchers racial bias, what are we going to call the implicit within-race favoritism of the pitcher against the same-race hitter?
Racial “biases” and racial “differences”? As I expressed in my NBA referee-bias comments, there is a curious and frustrating constraint that hangs over economics and other disciplines. Topics that involve “biases,” even very small ones, find quality publication outlets. However, other areas where much more pronounced racial differences exist, such as the huge shift in the racial composition of NBA All-Defensive players, Olympic sprinters, or NFL cornerbacks and safeties, are completely taboo. It would be interesting to see economists of the caliber of the MLB ump paper's authors take on these race-related questions.
(Thanks to one of my longtime friends for calling my attention to the Freakonomics post.)