American Pharoah is the odds-on favorite to win Saturday's G1 Belmont Stakes, but the last 12 Triple Crown contenders failed to win the Belmont despite similar odds. Paul von Hippel, assistant professor at the University of Texas and a former data scientist, uncovers an overlooked variable that has reduced the favorites' Belmont chances in recent decades. What are the implications for American Pharoah?
American Pharoah, who has won both the Kentucky Derby and the Preakness Stakes, is now given better than even odds to win the Belmont Stakes and the Triple Crown. But those odds are pretty hard to believe. After all, since Affirmed in 1978, 12 horses have started the Belmont after winning the Derby and Preakness, and most of those horses were given even odds or better. If the odds were an accurate prediction of the race, then about 6 of those horses should have won. But none did. If the odds were fair, the odds against all 12 favorites losing would be more than 40,000-to-1.
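The arithmetic behind that claim can be sketched in a few lines of Python. This is only a simplified baseline: it assumes every one of the 12 favorites went off at exactly even odds and that the races were independent. The actual posted odds are not reproduced here; several favorites were at better than even, which would make a clean sweep of losses less likely than this baseline suggests.

```python
from fractions import Fraction

def prob_all_lose(win_probs):
    """Probability that every favorite loses, assuming independent races."""
    p = Fraction(1)
    for w in win_probs:
        p *= 1 - Fraction(w)
    return p

# Simplifying assumption (not the real posted odds): exactly even odds,
# i.e. a win probability of 1/2, for each of the 12 favorites.
even = [Fraction(1, 2)] * 12

expected_wins = sum(even)                       # how many should have won
p_lose_all = prob_all_lose(even)                # chance that none of them won
odds_against = (1 - p_lose_all) / p_lose_all    # expressed as x-to-1 against

print(expected_wins)  # 6
print(p_lose_all)     # 1/4096
print(odds_against)   # 4095
```

Even under this conservative assumption, 12 straight losses is a roughly 4,000-to-1 event. Shortening each favorite's price pushes the figure much higher.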
So the odds haven’t been accurate. The betting public has consistently overestimated the favorite’s chances of winning the Triple Crown, and despite 36 years of disappointment, the public shows no signs of wising up.
Why is the general betting public so incorrigibly optimistic about Triple Crown bids? One possibility is media coverage. “The hype of having a chance to watch a Triple Crown winner is every bit as important as actually having it happen,” trainer Ken McPeek said after the 2002 Belmont Stakes, when his Sarava dashed War Emblem’s hopes for a Triple Crown. The Belmont television audience for California Chrome’s Triple Crown bid in 2014 swelled to 21 million – three times as many viewers as watched the race the year before. Casual observers may bet the favorite because the media has trumpeted his chances, or because their ticket would be a keepsake if he happens to win the Triple Crown. Either way, an influx of money from enthusiastic but less informed gamblers may have helped to inflate the odds on past favorites well beyond what they realistically deserved.
You can’t count on the betting markets to accurately predict American Pharoah’s chances. So what should his odds be?
To answer that, we have to ask why the Belmont has grown so hard to win.
Why has the Belmont grown so hard to win?
The difficulty that modern favorites have in completing the Triple Crown is puzzling because it wasn’t that hard once. Back in the 1970s, three out of five horses that won the Derby and Preakness went on to win the Belmont. It was even easier in the 1930s and ’40s, when seven of eight Derby-Preakness winners completed the Triple Crown. The only precedent for the modern drought is 1949-1972, when seven horses won both the Derby and Preakness but none of them won the Belmont.
Why has it become so hard for Derby and Preakness winners to pick up the Belmont? Why was it easier in the 1930s, ’40s, and ’70s, but hard in the 1950s and ’60s?
Two explanations are usually offered at this point. One is that the Belmont, at 1½ miles, is the longest race in the Triple Crown series. The other is that, after the Preakness, horses get just three weeks’ rest before the Belmont.
Those challenges are real, but they’re not new. The Belmont went to 1½ miles in 1926, and that didn’t stop seven horses from winning the Triple Crown in the 1930s and ’40s. The Belmont has been run on three weeks’ rest since 1969, and that didn’t stop three horses from winning the Triple Crown in the 1970s.
Maybe Thoroughbreds have less endurance now than they did in the 1970s. Maybe they need more rest. Maybe stables that don’t win the Derby have gotten smarter about skipping the Preakness and spoiling the Belmont.
Or maybe another explanation has been overlooked. Race photos suggest a simple possibility.
Field size
Here’s Affirmed nosing out Alydar (left) to win the 1978 Belmont Stakes and the most recent Triple Crown. If you look carefully in the background, you’ll notice that there are only five horses on the track.
Now here’s last year’s Belmont, where California Chrome could have won the Triple Crown but finished fourth. Look at the starting gate. There are eleven horses.
These pictures are part of a trend. Here’s a scatterplot of the number of starters in every Belmont for the past 100 years.
The average field was two horses smaller in the 1970s than it is today, and two to three horses smaller in the 1930s-’40s than in the 1970s. What’s more, in years with a Triple Crown winner, the field was even smaller than was typical for the time. Among the 11 Triple Crown winners, two (including the first, Sir Barton) won the Belmont in a field of three. Two won in a field of four, and three won in a field of five. No Triple Crown contender has ever been upset in a Belmont field smaller than six, and none has ever succeeded in a field larger than eight.
Size of field doesn’t just help to explain the drought we’re in now. It also helps to explain the drought of 1949-1972. During that period, out of seven horses that won the Derby and Preakness, only one got to face a Belmont field that was smaller than eight. The greatest challenge occurred in 1971 when Cañonero II, after winning the Derby and Preakness, had to run the Belmont in a field of 13 horses. This was just two years before Secretariat won the Belmont as the favorite in a field of five. Secretariat was a better horse than Cañonero II, but he also had less competition.
Why does size of field matter? It’s much harder to finish first of 11 than first of five. In a field of 11 there are 10 possible winners besides the favorite, and it’s easier for any horse, including the favorite, to get bumped or trapped. Jockeys can try to avoid traffic by going outside or out front -- but those options have risks of their own.
There have been 30 years when a single horse won the Derby and Preakness and entered the Belmont.[i] In 11 of those years, typically in small fields, the horse won the Triple Crown; in the other 19 years, typically in large fields, he was upset. Using data from those 30 years, this graph shows the relationship[ii] between the size of the Belmont field and the probability of an upset.
With a three-horse field -- for example in 1919 or 1943, when Sir Barton and Count Fleet won the Triple Crown -- the estimated probability of an upset is just 0 to 28 percent. With a five-horse field -- for example in 1973 or 1978, the years when Secretariat and Affirmed won -- the chances of an upset are 2 to 57 percent. If that 57 percent figure seems high to you, remember: Affirmed almost lost.
You might imagine the predictions would improve if they used factors beyond size of field. Surprisingly, other factors don’t help much. One thing that people like about American Pharoah is that he won the Preakness by seven lengths. But he only won the Derby by a single length, and it turns out that winning margins aren’t very predictive. Once size of field is accounted for, knowing the Derby and Preakness winning margins doesn’t make the predictions significantly more accurate.
Post position doesn’t improve predictions either. American Pharoah won’t be assigned a post position until Wednesday, June 3, but once he has one, his chances won’t be any clearer. More horses have won the Belmont from low post positions than from high ones, but that could easily be because of field size. Even in the smallest fields, there are always horses in post positions 1, 2, and 3. But only a large field will have a horse at post position 10.
Another factor to consider is weeks of rest. Since 1969 horses have had three weeks’ rest between the Preakness and Belmont. Before 1969 they had four weeks’ rest in most years, though there were years before 1969 with three weeks’ rest, or even two.[iii] But of course fields were smaller before 1969, too. Once size of field is accounted for, knowing weeks of rest barely improves the predictions of Belmont upsets. And of course weeks of rest can’t explain why Triple Crown success was more common in the 1970s than it is today. The Triple Crown champions of the 1970s were running on three weeks’ rest, too. But they had smaller fields.
Fair odds on American Pharoah
A week ago, the 2015 Belmont had 11 probable entrants. No horse has completed the Triple Crown in a Belmont field that large, and the chances of an upset were 75 to 91 percent. But in the last few days, withdrawals have reduced the race to a probable field of eight. That drops the chances of an upset to 53 to 87 percent. In other words, fair odds on American Pharoah would be somewhere between 11-10 and 7-1.
That estimate of fair odds is still longer than what bettors are being offered. According to sportsbook.ag, American Pharoah is at odds of 5-6 at the time of writing, implying that he has a 55 percent chance of completing the Triple Crown and a 45 percent chance of being upset. The general public may still be on the optimistic side about American Pharoah, but a shrinking field has brought the probabilities closer to justifying their optimism.
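For readers who want to check the conversions between upset probabilities and betting odds, here is a minimal Python sketch. The function names are illustrative, and the second function ignores any bookmaker margin, which is a simplifying assumption.

```python
def fair_odds_against(p_upset):
    """Odds against the favorite, as 'x-to-1', given the chance he is upset."""
    return p_upset / (1 - p_upset)

def implied_win_prob(a, b):
    """Win probability implied by quoted odds of a-b (profit a per b staked).

    Assumption: no bookmaker margin is built into the price.
    """
    return b / (a + b)

# Upset range quoted above for a probable field of eight: 53 to 87 percent.
low = fair_odds_against(0.53)    # about 1.13-to-1, close to 11-10
high = fair_odds_against(0.87)   # about 6.7-to-1, close to 7-1

# The quoted price of 5-6 on American Pharoah.
quoted = implied_win_prob(5, 6)  # about 0.545, i.e. roughly 55 percent

print(round(low, 2), round(high, 2), round(quoted, 3))
```

The quoted 5-6 works out to less than 1-to-1 against, shorter than even the low end of the fair range.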
--------
[i] The data don’t include 1932, 1936, and 2012, when the horse that won the Derby and Preakness didn’t start the Belmont. They also don’t include 1968, when the same horse “won” the Derby and Preakness, but only because the first-place Derby finisher was disqualified.
[ii] For those who like to know these things: the best fitting prediction model was p = min(1 - 1/starters, p*), where logit(p*) = a + b logit(1/starters).
[iii] In the years when the same horse won the Derby and Preakness, those horses had 2 weeks before the Belmont in 1961 and 1966, three weeks in 1930, 1944, 1946, 1958, 1964, and 1969 on, and four weeks in other years before 1969.
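For those who want to experiment with the model in note [ii], here is a minimal Python sketch of its functional form. The fitted values of a and b are not reported in this article, so the coefficients used below are hypothetical placeholders chosen only to show how the formula behaves. They are not the estimates behind the percentages quoted above.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse of logit: maps any real number back to a probability."""
    return 1 / (1 + math.exp(-x))

def upset_probability(starters, a, b):
    """p = min(1 - 1/starters, p*), with logit(p*) = a + b * logit(1/starters)."""
    if starters < 2:
        raise ValueError("need at least two starters")
    cap = 1 - 1 / starters  # upset chance if every horse were equally likely to win
    p_star = inv_logit(a + b * logit(1 / starters))
    return min(cap, p_star)

# Hypothetical coefficients, for illustration only (NOT the fitted values).
A_DEMO, B_DEMO = -1.0, -1.0

for n in (3, 5, 8, 11):
    print(n, round(upset_probability(n, A_DEMO, B_DEMO), 3))
```

One property of the formula can be checked directly: with a = 0 and b = -1, p* equals 1 - 1/starters, so the model reduces to treating the favorite as no better than any other horse. The min() cap keeps the predicted upset chance from ever exceeding that level.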
---------
Paul von Hippel is an assistant professor at the University of Texas, where he works in the LBJ School of Public Affairs and is affiliated with the Department of Statistics and Data Science. Before coming to Texas, he developed predictive models to reduce financial risk in the banking industry. He enjoys racing, and finance, and squeezing new insights from data.