Can Artificial Intelligence (ChatGPT) Predict the Kentucky Derby Winner?

Ashley Anderson

May 6th, 2023

Can A.I. Predict the Winner in the Kentucky Derby?

  • A.I. cannot simply predict the future
  • You can teach A.I. to build a predictive model for the Kentucky Derby
  • A.I. can narrow down the most likely Kentucky Derby winner
  • A.I. cannot predict the winner, but it can help you figure out win probability

In 2022, the 148th running of the Kentucky Derby (G1) resulted in a shocking upset win by 80-1 longshot Rich Strike that even an advanced artificial intelligence program would be hard-pressed to predict.

Hardly anyone had the Keen Ice son on their radar, especially as he drew into the 20-horse field just the day before, after Ethereal Road scratched. But with the ideal pace setup for the deep closer and a near-perfect trip by jockey Sonny Leon, Rich Strike crossed the finish line first, three-quarters of a length in front of post-time favorite Epicenter, who finished second.

Those lucky bettors who pegged Rich Strike for a win were more likely to have wagered on the horse because they were fond of his name or felt like betting the longest shot in the field “just for fun.” Finding an expert handicapper who genuinely believed the horse had a chance to prevail was virtually impossible.

Massive longshots have rarely won the Derby over the course of the event’s history, so how can you pinpoint value and find a diamond in the rough like Rich Strike?

With the advancement of artificial intelligence, we decided to test out the theory that A.I. could forecast the outcome of the Kentucky Derby.

Employing the A.I. chatbot ChatGPT, we used the 2005 Kentucky Derby as our test subject to see if the program could predict another longshot winner — 50-1 Giacomo — as the victor of that year’s Derby.

*Note: ChatGPT's knowledge of events is limited to September 2021 and earlier, so we were not able to ask it about the 2022 Kentucky Derby.

Below you will find what we uncovered when determining whether A.I. can predict the winner of the Kentucky Derby.

A.I. cannot simply predict the future  

Open up ChatGPT and ask it to project the winner of the Derby, and you’ll instantly receive a message denying your request.

As the A.I. program will tell you, it cannot predict future events or provide information beyond its knowledge cut-off date of 2021.

A fussy ChatGPT reply.

But do not be deterred. There’s a way to work around this.

You can teach A.I. to build a predictive model for the Kentucky Derby

As ChatGPT explains, there are many factors that affect the outcome of the race, including the skill of the jockey, track condition, health of the horse, and weather.

So, to test whether certain factors could predict the winner of a previous Derby, I began working backward with ChatGPT. I fed it various data points and had it create a model to test which horses had the highest probability of claiming the first leg of the 2005 Triple Crown.

I began by asking ChatGPT to recite the full field from the 2005 Kentucky Derby, including their post positions.

Here’s where I ran into a new obstacle. The bot spit back the incorrect post position order for that particular race, so I had to supply it with the correct data set.

I next asked ChatGPT to tell me the names of the trainers and jockeys associated with each horse at the time of the Derby, on May 7, 2005.

Again, I ran into some inaccuracies and had to give the A.I. the correct connections associated with a few of the horses in the field.

From there, I asked ChatGPT to tell me the historical win percentages of each post position and to create a table with this new information, along with each horse, jockey, trainer, and post position.

I then asked ChatGPT to name the race record of each horse leading up to May 7, 2005.

I also asked it what the most important factor was in predicting a Derby winner, to which it replied “previous performances in graded stakes races.”

Here, I asked ChatGPT to recite each horse’s past performances in graded stakes. Unfortunately, a few of the horses had incomplete data and the program did not list their full history in graded stakes races leading up to the 2005 Kentucky Derby.

This is another set of data that ChatGPT would need to be given so that it has the most comprehensive and accurate information with which to build a predictive model.

Finally, I asked ChatGPT to provide the Brisnet Speed figures each horse recorded leading up to the Kentucky Derby in 2005.

I ultimately asked the A.I. program to create a model that would predict the winner of the 2005 Kentucky Derby using each horse’s top Brisnet Speed figure, graded stakes performances, and trainer and jockey information. ChatGPT then produced Python code to run a simulation and identify a prospective winner.
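For readers curious what such a simulation might look like, here is a minimal sketch. It is not the code ChatGPT produced. The horse names, numbers, weights, and the `noise` parameter are all hypothetical placeholders chosen for illustration, and the approach (a weighted rating plus random race-day variation) is only one of many ways to set this up.

```python
import random
from collections import Counter

# Hypothetical placeholder field. These names and numbers are invented
# for illustration and are NOT the real 2005 Kentucky Derby data.
FIELD = {
    "Horse A": {"speed": 108, "graded_wins": 2, "connections": 0.20},
    "Horse B": {"speed": 104, "graded_wins": 1, "connections": 0.15},
    "Horse C": {"speed": 99, "graded_wins": 0, "connections": 0.10},
    "Horse D": {"speed": 96, "graded_wins": 1, "connections": 0.12},
}


def rating(horse):
    """Combine the three factors into one score (weights are assumed)."""
    return (
        horse["speed"]                 # top speed figure
        + 2.0 * horse["graded_wins"]   # graded stakes performances
        + 10.0 * horse["connections"]  # trainer/jockey win rate
    )


def simulate_race(field, rng, noise=8.0):
    """Run one race: each horse's rating plus random race-day variation."""
    scores = {name: rating(h) + rng.gauss(0, noise) for name, h in field.items()}
    return max(scores, key=scores.get)


def win_probabilities(field, runs, seed=0):
    """Simulate `runs` races and return each horse's share of the wins."""
    rng = random.Random(seed)
    wins = Counter(simulate_race(field, rng) for _ in range(runs))
    return {name: wins[name] / runs for name in field}


if __name__ == "__main__":
    probs = win_probabilities(FIELD, 10_000)
    for name, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {p:.2%}")
```

The larger the `noise` value, the more often a lower-rated horse wins a simulated race, which is how a longshot can still collect a meaningful share of wins.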

Coding in Python, ingesting data, and running predictive simulation models.

A.I. can narrow down the most likely Kentucky Derby winner

Once I had the model, I asked ChatGPT to run a number of simulations, beginning with 100, then expanding to 1,000, and ultimately 10,000 — the maximum number this version of ChatGPT could run.
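Why bother with more simulations? As a rough guide, the sampling noise in a simulated win percentage shrinks with the square root of the number of runs. The sketch below assumes each simulated race is an independent draw from the same fixed model, and the 10% win probability is a hypothetical figure used only to show the scale of the effect.

```python
import math


def standard_error(p, runs):
    """Approximate sampling error of a win share estimated from `runs` races."""
    return math.sqrt(p * (1.0 - p) / runs)


# Hypothetical horse with a true 10% chance of winning under the model.
for runs in (100, 1_000, 10_000):
    se = standard_error(0.10, runs)
    print(f"{runs:>6} runs: about +/- {se:.2%}")
```

Under these assumptions, 100 runs leave an uncertainty of roughly three percentage points, while 10,000 runs bring it down to about a third of a point.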

To my surprise, upon running 1,000 simulations, one name in particular stuck out.

While the post-time favorite, Bellamy Road (who finished seventh in the Kentucky Derby following his 17 1/2-length triumph in the Wood Memorial), was named the most likely winner of the 2005 Derby, 50-1 longshot Giacomo recorded 105 wins in the 1,000 simulations, a 10.5% win rate.

Also of note, Afleet Alex, sent off at 4.50-1 in the Derby, scored 209 wins in the simulation. Afleet Alex went on to finish third in the Derby and then won both the Preakness and Belmont.

ChatGPT AI used uploaded data and a custom model to provide 2005 Kentucky Derby results.

When running 10,000 simulations, Bellamy Road remained the most likely winner, while Giacomo's highest probability of winning was 6.22%, which ranked him as the fifth most likely winner.

That’s still a much higher win probability than his 50-1 odds implied at post time.
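To put a number on that gap, odds of X-1 can be converted into a break-even win probability of 1 / (X + 1). The short sketch below does that arithmetic for Giacomo. It is a simplification that ignores the track's takeout, and the function name is our own.

```python
def implied_probability(odds_to_one):
    """Convert odds of X-1 into the break-even win probability."""
    return 1.0 / (odds_to_one + 1.0)


giacomo_implied = implied_probability(50)  # 50-1 at post time
giacomo_model = 0.0622                     # 6.22% from the 10,000 simulations

print(f"Implied by odds: {giacomo_implied:.2%}")
print(f"Model estimate:  {giacomo_model:.2%}")
print(f"Ratio:           {giacomo_model / giacomo_implied:.1f}x")
```

At 50-1, the odds imply a win chance just under 2%, so the model rated Giacomo roughly three times more likely to win than the betting public did.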

While A.I. did not correctly predict Giacomo as the most likely winner, it did indicate he had a much better chance of winning than the betting public believed.

A.I. cannot predict the winner, but it can help you figure out win probability

Thus, A.I. may in fact help you pinpoint a winner of the Kentucky Derby, but it comes with a caveat. If you’re hoping to simply ask ChatGPT, or another A.I. program, to predict the outcome of the upcoming Derby, you’ll be met with resistance.

You’ll only get what you put into the program, so you’ll be required to feed the A.I. the necessary data in order for it to produce a predictive model; and even then, you may uncover errors in the system.

This also underscores another issue with A.I. that even an expert handicapper faces — knowing which data is most important to predicting the winner of the Kentucky Derby.

Our model utilizes past performances in graded stakes, Brisnet Speed figures, trainer and jockey info, and post position success. But other horseplayers may want to create a model focusing on margin of victory in races leading up to the Derby; most recent Brisnet Speed figures as opposed to best Speed figures ahead of the race; or farthest distance run before May 7, 2005.

All of this is to say, you will need to put as much, if not more, time into building a predictive model through A.I. as you would if thoroughly handicapping the Kentucky Derby program.

And in the end, all of your efforts may still lead you to back a horse that is ultimately outrun by an unexpected rival who happened to get the ideal pace setup and a near-perfect trip from their jockey on Derby Day.

Science Museum Robots (Photo courtesy of Wikimedia Commons)