Trainers’ Intentions – Jockey Bookings
By Henry Young on Friday, May 20th, 2022Trainers tell us a lot about their runners’ chances of winning through factors such as race selection and jockey bookings. For example, trainers will sometimes run horses on unsuitable going, or at an unsuitable course or over an unsuitable distance with the intention of obtaining a handicap advantage which they plan to cash in on with a future race. I see this happening a lot in May-July, presumably with a view to winning the larger prizes available later in the flat season.
Information about trainers’ intentions can also be inferred from their jockey bookings. Most trainers have a few favoured jockeys who do the majority of their winning, with the remainder necessarily below average. This shows up in the trainer/jockey combo stats which are easily computed using Smartform, as we will see shortly. For example, Luke Morris – legendary as one of the hardest working jockeys with 16,700 races ridden since 2010 – has ridden most races for Sir Mark Prescott (1,849) at a strike rate of 19%. While he has higher strike rates with a few other trainers, these carry far less statistical significance. But you’ll often see him riding outsiders for a wide range of other trainers. The take away is never lay the Prescott/Morris combo !
When I assess a race, I find myself asking “Why is this jockey here today?”. If you assess not the race but each jockey’s bookings on the day, it give an interesting alternative perspective, clearly differentiating between regular or preferred trainer bookings and “filler” bookings. This latter variety of booking is normally associated with lower strike rates and therefore can be used as a factor to consider when analyzing a race, particularly for lay candidates.
Before diving into how to do this analysis, let me give you an example from Redcar on 18th April 2022 as I write this. While analyzing the 14:49 race, I became interested in one particular runner as a lay candidate and asked the question “Why is this jockey here today?”. In the following table you can see each of the jockey’s engagements for that day with the trainer/jockey combo strike rate (for all races at all courses since 2010):
+---------------------+--------+----------------+-------------+-------------------+-----------+----------+-------+
| scheduled_time | course | runner_name | jockey_name | trainer_name | RaceCount | WinCount | SR_TJ |
+---------------------+--------+----------------+-------------+-------------------+-----------+----------+-------+
| 2022-04-18 13:39:00 | Redcar | Mahanakhon | D Allan | T D Easterby | 4225 | 499 | 11.8 |
| 2022-04-18 14:14:00 | Redcar | Mr Moonstorm | D Allan | T D Easterby | 4225 | 499 | 11.8 |
| 2022-04-18 14:49:00 | Redcar | The Golden Cue | D Allan | Steph Hollinshead | 14 | 1 | 7.1 |
| 2022-04-18 15:24:00 | Redcar | Ugo Gregory | D Allan | T D Easterby | 4225 | 499 | 11.8 |
| 2022-04-18 15:59:00 | Redcar | Hyperfocus | D Allan | T D Easterby | 4225 | 499 | 11.8 |
| 2022-04-18 16:34:00 | Redcar | The Grey Wolf | D Allan | T D Easterby | 4225 | 499 | 11.8 |
| 2022-04-18 17:44:00 | Redcar | Eruption | D Allan | T D Easterby | 4225 | 499 | 11.8 |
+---------------------+--------+----------------+-------------+-------------------+-----------+----------+-------+
The answer is clearly that D Allan was at Redcar to ride for Tim Easterby and had taken the booking for Steph Hollinshead as a “filler”. “The Golden Cue” subsequently finished 17th out of 20 runners.
Performing this analysis on a daily basis is a two stage process. First, we need to calculate trainer/jockey combo strike rates. Then we use that data in combination with the daily races data to list each of a specific jockey’s bookings for the day. Let’s start with the last part to warm up our SQL mojo:
SELECT
scheduled_time,
course,
name as runner_name,
jockey_name,
trainer_name
FROM
daily_races_beta
JOIN
daily_runners_beta
USING (race_id)
WHERE
meeting_date = "2022-04-18" AND
jockey_name = "D Allan"
ORDER BY scheduled_time;
This gets us the first 5 columns of the table above, without the strike rate data:
+---------------------+--------+----------------+-------------+-------------------+
| scheduled_time | course | runner_name | jockey_name | trainer_name |
+---------------------+--------+----------------+-------------+-------------------+
| 2022-04-18 13:39:00 | Redcar | Mahanakhon | D Allan | T D Easterby |
| 2022-04-18 14:14:00 | Redcar | Mr Moonstorm | D Allan | T D Easterby |
| 2022-04-18 14:49:00 | Redcar | The Golden Cue | D Allan | Steph Hollinshead |
| 2022-04-18 15:24:00 | Redcar | Ugo Gregory | D Allan | T D Easterby |
| 2022-04-18 15:59:00 | Redcar | Hyperfocus | D Allan | T D Easterby |
| 2022-04-18 16:34:00 | Redcar | The Grey Wolf | D Allan | T D Easterby |
| 2022-04-18 17:44:00 | Redcar | Eruption | D Allan | T D Easterby |
+---------------------+--------+----------------+-------------+-------------------+
There are two approaches to generating the trainer/jockey strike rate data – using a sub-query or creating a new table of all trainer/jockey combo strike rates. I normally take the latter approach because I use this type of data many times over during the day. I therefore have this setup as a stored procedure which is run daily after the database update. The following SQL is executed once to (re)define the stored procedure called refresh_trainer_jockeys
:
delimiter //
DROP PROCEDURE IF EXISTS refresh_trainer_jockeys//
CREATE
DEFINER='smartform'@'localhost'
PROCEDURE refresh_trainer_jockeys()
COMMENT 'Regenerates the x_trainer_jockeys table'
BEGIN
DROP TABLE IF EXISTS x_trainer_jockeys;
CREATE TABLE x_trainer_jockeys (
trainer_name varchar(80),
trainer_id int(11),
jockey_name varchar(80),
jockey_id int(11),
RaceCount int(4),
WinCount int(4),
WinPct decimal(4,1));
INSERT INTO x_trainer_jockeys
SELECT
sub.trainer_name,
sub.trainer_id,
sub.jockey_name,
sub.jockey_id,
sub.RaceCount,
sub.WinCount,
percentage(sub.RaceCount, sub.WinCount, 20, 1) AS WinPct
FROM (
SELECT
ANY_VALUE(trainer_name) AS trainer_name,
trainer_id,
ANY_VALUE(jockey_name) AS jockey_name,
jockey_id,
COUNT(*) AS RaceCount,
COUNT(IF(finish_position = 1, 1, NULL)) AS WinCount
FROM
historic_races_beta
JOIN
historic_runners_beta
USING (race_id)
WHERE
YEAR(meeting_date) >= 2010 AND
trainer_id IS NOT NULL AND
jockey_id IS NOT NULL AND
NOT (unfinished <=> "Non-Runner")
GROUP BY jockey_id, trainer_id) AS sub
ORDER BY sub.trainer_name, sub.jockey_name;
END//
delimiter ;
When the stored procedure refresh_trainer_jockeys
is run, a new table x_trainer_jockeys
is created. This new table is very useful for queries such as:
SELECT * FROM x_trainer_jockeys WHERE trainer_name = "W J Haggas" ORDER BY WinPct DESC LIMIT 20;
which results in the strike rates for William Haggas’ top 20 jockeys (for all races at all courses since 2010):
+--------------+------------+---------------+-----------+-----------+----------+--------+
| trainer_name | trainer_id | jockey_name | jockey_id | RaceCount | WinCount | WinPct |
+--------------+------------+---------------+-----------+-----------+----------+--------+
| W J Haggas | 1804 | S Sanders | 4355 | 70 | 25 | 35.7 |
| W J Haggas | 1804 | B A Curtis | 45085 | 109 | 35 | 32.1 |
| W J Haggas | 1804 | L Dettori | 2589 | 90 | 28 | 31.1 |
| W J Haggas | 1804 | D Tudhope | 27114 | 225 | 69 | 30.7 |
| W J Haggas | 1804 | T P Queally | 18614 | 49 | 15 | 30.6 |
| W J Haggas | 1804 | G Gibbons | 18303 | 80 | 24 | 30.0 |
| W J Haggas | 1804 | A Kirby | 31725 | 28 | 8 | 28.6 |
| W J Haggas | 1804 | R Kingscote | 31611 | 79 | 22 | 27.8 |
| W J Haggas | 1804 | Jim Crowley | 43145 | 270 | 74 | 27.4 |
| W J Haggas | 1804 | S De Sousa | 41905 | 84 | 23 | 27.4 |
| W J Haggas | 1804 | R L Moore | 18854 | 360 | 91 | 25.3 |
| W J Haggas | 1804 | P Hanagan | 16510 | 355 | 88 | 24.8 |
| W J Haggas | 1804 | A J Farragher | 1165464 | 53 | 13 | 24.5 |
| W J Haggas | 1804 | J Fanning | 2616 | 128 | 31 | 24.2 |
| W J Haggas | 1804 | J Doyle | 31684 | 455 | 105 | 23.1 |
| W J Haggas | 1804 | Oisin Murphy | 1151765 | 66 | 15 | 22.7 |
| W J Haggas | 1804 | M Harley | 74041 | 76 | 17 | 22.4 |
| W J Haggas | 1804 | Tom Marquand | 1157192 | 705 | 157 | 22.3 |
| W J Haggas | 1804 | R Hughes | 1764 | 81 | 18 | 22.2 |
| W J Haggas | 1804 | P Cosgrave | 17562 | 564 | 121 | 21.5 |
+--------------+------------+---------------+-----------+-----------+----------+--------+
We’re now ready to use x_trainer_jockeys
to complete the answer to our question “Why is this jockey here today?”:
SELECT
scheduled_time,
course,
name as runner_name,
daily_runners_beta.jockey_name,
daily_runners_beta.trainer_name,
coalesce(RaceCount, 0) as RaceCount,
coalesce(WinCount, 0) as WinCount,
coalesce(WinPct, 0.0) as SR_TJ
FROM
daily_races_beta
JOIN
daily_runners_beta
USING (race_id)
LEFT JOIN
x_trainer_jockeys
USING (trainer_id,jockey_id)
WHERE
meeting_date = "2022-04-18" AND
daily_runners_beta.jockey_name = "D Allan"
ORDER BY scheduled_time;
Which comes full circle, giving us the table at the top of the blog !