# The Strike Zone as a Probability Density Function

The strike zone in baseball is supposed to be the area above the plate and between the batter’s armpits and knees. Seems simple enough, but John Palmer suggests that it is more complicated than that.

> Several days ago, while watching some very erratic home-plate umpiring, I posted a short item to Facebook quoting myself, “The strike zone is a probability density function.” I had originally made the statement during a play-by-play radio broadcast of a London Tigers baseball game over 20 years ago. The Tigers were a AA minor league team. The radio station manager asked me not to do that again and not to try to explain it on air.

This is an interesting way to think about the strike zone. What John is implying is that the probability that a pitch within the strike zone is called a strike depends on where in the zone it crosses the plate. In particular, John suggests that the closer the pitch is to the center of the zone, the more likely it is to be called a strike. He links to this post at FiveThirtyEightSports, written by Etan Green, that supports John’s claim. The following graph from the article illustrates John’s belief about the strike zone.
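The same idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the zone dimensions, the rounded (elliptical) shape, the logistic falloff, and the `SHARPNESS` value are all made up for illustration and are not taken from the FiveThirtyEight article. Strictly speaking, what it computes is the chance of a strike call at each location rather than a density that integrates to one.

```python
import math

# All constants below are illustrative assumptions, not fitted values.
HALF_WIDTH = 0.71   # feet; about half of a 17-inch plate
ZONE_BOTTOM = 1.5   # feet; assumed height of the bottom of the zone
ZONE_TOP = 3.5      # feet; assumed height of the top of the zone
SHARPNESS = 6.0     # larger values make the edge of the zone crisper


def strike_probability(x, z):
    """Assumed chance that a taken pitch at (x, z) is called a strike.

    x is the horizontal distance from the center of the plate and z is the
    height above the ground, both in feet.
    """
    center_z = (ZONE_TOP + ZONE_BOTTOM) / 2
    half_height = (ZONE_TOP - ZONE_BOTTOM) / 2
    # Scaled distance from the center: 0 in the middle, 1 on the nominal edge.
    r = math.hypot(x / HALF_WIDTH, (z - center_z) / half_height)
    # Logistic falloff: close to 1 in the middle, 0.5 on the nominal edge.
    return 1.0 / (1.0 + math.exp(SHARPNESS * (r - 1.0)))


if __name__ == "__main__":
    pitches = [("middle", 0.0, 2.5), ("edge", HALF_WIDTH, 2.5), ("outside", 1.2, 2.5)]
    for label, x, z in pitches:
        print(f"{label:8s} {strike_probability(x, z):.3f}")
```

In this sketch a pitch down the middle is almost always a strike, a pitch on the nominal edge is a coin flip, and the probability keeps falling as the pitch moves farther out. A real version would estimate the surface from called pitches instead of assuming it.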

Over at Facebook, King Banaian commented:

> (T)here is not just one strike zone. The dimensions are not just spatial but umpire specific. So even your three dimensional depiction of the PDF isn’t quite right, and I question whether summing up along the umpire dimension gives us anything really meaningful about the strike zone.

The strike zone may also be situation-dependent. For example, an umpire may slightly expand his strike zone if he realizes that he called an earlier pitch a ball when he should have called a strike.
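Both of these points, that the zone is umpire-specific and that it shifts with the situation, can be illustrated with a small sketch. The Python below is a toy model under stated assumptions: `r` is a pitch’s distance from the center of the zone, scaled so that 1 is the nominal edge; the three umpires and their `zone_scale` values are hypothetical; and the `expansion` used for a make-up call is an arbitrary illustrative number. None of it is fitted to real pitch data.

```python
import math


def call_probability(r, zone_scale=1.0, sharpness=8.0):
    """Assumed chance of a strike call at scaled distance r from the center.

    zone_scale stretches (>1) or shrinks (<1) one umpire's zone.
    """
    return 1.0 / (1.0 + math.exp(sharpness * (r - zone_scale)))


# Hypothetical umpires, each with a personal zone size.
UMPIRES = {"tight": 0.9, "textbook": 1.0, "generous": 1.1}


def league_average(r):
    """Average the call probability over the hypothetical umpires."""
    probs = [call_probability(r, scale) for scale in UMPIRES.values()]
    return sum(probs) / len(probs)


def make_up_call(r, zone_scale, expansion=0.05):
    """Same umpire, with the zone slightly expanded after a missed strike."""
    return call_probability(r, zone_scale + expansion)


if __name__ == "__main__":
    r = 1.0  # a pitch on the nominal edge of the zone
    for name, scale in UMPIRES.items():
        print(f"{name:9s} {call_probability(r, scale):.3f}")
    print(f"{'average':9s} {league_average(r):.3f}")
    print(f"{'make-up':9s} {make_up_call(r, UMPIRES['textbook']):.3f}")
```

In this toy setup the three umpires disagree noticeably on a pitch at the nominal edge, yet the average over umpires lands near a coin flip. That is one way to read the worry that summing over the umpire dimension hides what any single umpire is doing.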

Lastly, in watching my sons play baseball over the years, I have noticed a definite difference in strike zones between youth baseball and pro baseball. In youth ball, the strike zone tends to move around more. In the pros, the strike zone seems more consistent to my untrained eye.

## Author: Phil Miller

### 3 thoughts on “The Strike Zone as a Probability Density Function”

1. I was watching MLBN last night and one of the commentators (it might have been Bill Ripken) noted as well that once a pitcher starts missing the plate badly, the umpire doesn’t give the benefit of the doubt on subsequent pitches that are “borderline” (“lines have no width,” Ed.). I think there’s some Markov process you could use to measure this if you were interested, but what would be the more generalizable lesson of that?

2. One more thought: a few nights ago I’m watching a game where the starting pitcher is throwing in the 9th inning, ahead by 5 runs, and a torrential rain comes. He gives up a leadoff home run, then gets two outs. The announcers assume the umpire is going to call darn near anything a strike, but no: the batter gets a 3-1 count then singles, and the next guy walks. In a pouring rain. On at least two pitches, the pitcher is glaring at the umpire. A reliever comes in and gets the final out on 2 pitches. What do we suppose was going on there?

3. MLB is generally more consistent, as it must be, and it is far more evaluated with more cameras, more eyeballs, more scrutiny, paid umps, and announcers/commentators.

Overall, I feel that umps tend to call pitches based on the current count and score; they try to even things up a little. If the count is 0-2, they are less likely to call a strike, especially outside the pros.

If one team is up, they may call pitches to either even it out, or to just show mercy and get it over with.

It is not just pitches, but every human judgement call. Were they safe or out? Is it a bunt or a dead ball? Did they round first, or not?

Same with statisticians. Was it a throwing error, or a bad catch?

I know that professional umps/refs go to school, review questionable calls, get called out by players, coaches, leagues, fans, commentators, etc. While I’m sure there is some bias, is there bias in the judgement of bias?

Lies, damned lies and statistics.