MLB Umps Not So Color Blind?

The analysis of racial bias in the ball-strike calls of MLB umps by UT-Austin professor Daniel Hamermesh and coauthors (Parsons, Sulaeman, Yates) has generated the expected buzz in the popular press and on baseball blogs (Google Search results; Time article). See Hamermesh’s post on Freakonomics and the link to the AER article.

As one would expect, given Hamermesh’s reputation and standing, this is not a lightweight effort. It’s creative and uses data that many sports researchers would covet. Valuable research is as much about the questions and the attempts to answer them as it is about the answers. This piece has generated, and will continue to generate, follow-on work.

The authors make some substantial, albeit qualified, claims:

The results allow us to think about the deeper question of measuring discrimination generally. If, as we show here, the match to the race/ethnicity of their evaluator affects evaluations of workers, then the measured productivity of the worker will depend on the nature of that match. This difficulty has serious implications for measuring discrimination and is another manifestation of the difficulty of identifying discrimination pointed out by Stephen G. Donald and Daniel S. Hamermesh (2006).

So what have the authors found? Umps favor, ever so slightly, pitchers of the same race. The effect holds up to adjustments for pitch count, inning, score, and game attendance. It diminishes when technology for monitoring umps is in use. The same effects don’t arise for ump-batter matches. The effect is very slight on a percentage basis. Still, multiplied over thousands of pitches per season, it adds up to about 5% additional wins for a white pitcher when favored by a white ump and an ERA about 0.13 lower relative to baseline values, over the 2004-2008 data sample in games without the monitoring technology.

Ok. And the big shoe dropping here would be …??? Let’s suppose these figures are right on the mark. The 0.13 improvement in ERA translates into a salary increase of a few thousand dollars when plugged into the typical equations for ERA’s effect on wins and wins’ impact on revenue. That’s an impact of a few thousand dollars in a league with average salaries over 3 million dollars.

Across various disciplines, many researchers aim to show that subtle biases continue to exist. Yes, my reaction tends to be, and the world is still around. It’s not merely a glass-half-full versus half-empty difference. It’s a glass filled nearly to the brim. Upon very close inspection, we’ve determined that the water is fractionally below the rim, too close to see with the naked eye or even with typical magnification. Only with a high-powered lens have we discovered a measurable distance between the edge and the water level. As economists, we frequently assert and explain how zero is rarely the optimal amount of anything, even undesirable things like pollution. If the standard against which bias is to be measured is zero, then I suppose we haven’t reached it, and this article provides evidence, showing us that, as Jim Buchanan would say, water still runs downhill.

Instead of just accepting these minute results as they are, one can raise questions, and the authors have responded to several. However, I’m interested in an expanded approach: a broader model of umpire decision-making. The authors’ controls are worthwhile, but when such a small effect is found, leaving out other factors that may have some not-so-obvious link to race may matter. Maybe reputation (Cy Young winners) or point-of-delivery matters for umps and isn’t fully randomized across races.

Maybe of more interest: is skin color the only physical characteristic that matters to umps or other people? Rather than a model digging for bias, perhaps one should be looking at a model of affinity. What draws people closer? What are the commonalities? When I’m people watching at Disney World, I’m struck by how frequently tall people find tall partners, dark-haired people find dark-haired partners, etc. Do umps “like” pitchers with characteristics similar to their own? Do they favor tall pitchers? What about heavy-set pitchers? Fast (in terms of time between pitches) pitchers, neatly groomed, …?

What about the pitcher-hitter match? The authors briefly address this and refer to tests where they took account of hitter-ump battles, but this is a very unsatisfying response. If we are going to call favoritism across pitchers racial bias, what are we going to call the implicit within-race favoritism of the pitcher against the same-race hitter?

Racial “biases” and racial “differences”? As I expressed in my NBA referee-bias comments, a curious and frustrating constraint hangs over economics and other disciplines. Topics that involve “biases,” even tiny ones, find quality publication outlets. However, other areas where much more pronounced racial differences exist, such as the considerable shift in the racial composition of NBA All-Defensive players, Olympic sprinters, or NFL cornerbacks and safeties, are entirely taboo. It would be interesting to see economists of the caliber of the MLB ump paper’s authors take on these race-related questions.

(Thanks to one of my longtime friends for calling my attention to the Freakonomics post.)

Author: Brian Goff
