Adding data to Smartform
By Henry Young on Friday, December 23rd, 2022Using Smartform typically involves querying and aggregating data by various criteria. For example you may wish to look into trainer strike rate by course and race type, in effect regenerating the same data that used to be painstakingly manually compiled by Ken Turrell in his annual publication Trainers4Courses:
Ken no longer produces his book, but the key data can be generated from Smartform through a single SQL query:
SELECT
sub.course,
sub.race_type,
sub.race_type_id,
sub.trainer_name,
sub.trainer_id,
sub.RaceCount,
sub.WinCount,
percentage(sub.RaceCount, sub.WinCount, 30, 1) AS WinPct
FROM (
SELECT
course,
ANY_VALUE(race_type) AS race_type,
race_type_id,
ANY_VALUE(trainer_name) AS trainer_name,
trainer_id,
COUNT(*) AS RaceCount,
COUNT(IF(finish_position = 1, 1, NULL)) AS WinCount
FROM
historic_races_beta
JOIN
historic_runners_beta USING (race_id)
GROUP BY trainer_id, course, race_type_id) AS sub
ORDER BY sub.course, sub.race_type_id, sub.trainer_name;
If you feed the results into a new table, which requires some additional syntactic sugar along the lines of a previous blog post, you can then pull trainer strike rates per course into your custom race cards. Doing so requires a SQL table join between the new table, daily_races_beta
and daily_runners_beta
, joining on course
and race_type_id
. But when you try to do that table join, you’ll discover that race_type_id
is not available in daily_races_beta
(or its forerunner daily_races
). The same information is implied by a combination of race_type
and track_type
. The easiest option is to add this column into the original Smartform table and populate it yourself.
When adding a new column to an existing Smartform table, there is the obvious concern that this may be overwritten by the daily updates, and certainly remain unpopulated by those updates. The answer is to add the column with DEFAULT NULL
and then run a script daily after the updater has completed. That script should add the new column if required, and populate any NULL
values. Next we look into the details of how to achieve this.
The first thing we should do is to create a stored function that generates a value for race_type_id
based on race_type
and track_type
. Let’s call that function get_race_type_id
.
delimiter //
DROP FUNCTION IF EXISTS get_race_type_id//
CREATE FUNCTION get_race_type_id(race_type varchar(30), track_type varchar(30)) RETURNS smallint
COMMENT 'convert race_type/track_type strings to race_type_id'
DETERMINISTIC
NO SQL
BEGIN
RETURN
CASE race_type
WHEN 'Flat' THEN if (track_type = 'Turf', 12, 15)
WHEN 'Hurdle' THEN 14
WHEN 'Chase' THEN 13
WHEN 'National Hunt Flat' THEN 1
WHEN 'N_H_Flat' THEN 1
WHEN 'All Weather Flat' THEN 15
END;
END//
delimiter ;
You can see that the function is obviously handling some peculiarities of how the data appears in Smartform. With this function defined we can proceed to define the SQL stored procedure which can be called daily after the Smartform updater – a process that can easily be automated by using a scheduled script according to whichever operating system you’re using:
-- Check if the column 'race_type_id' exists in table 'daily_races_beta'
-- Add the missing column if necessary
-- Populate values based on 'race_type' and 'track_type'
delimiter //
DROP PROCEDURE IF EXISTS fix_race_type_id//
CREATE PROCEDURE fix_race_type_id()
COMMENT 'Adds race_type_id to daily_races_beta if necessary and populates NULL values'
BEGIN
IF NOT EXISTS (
SELECT * FROM information_schema.columns
WHERE column_name = 'race_type_id' AND
table_name = 'daily_races_beta' AND
table_schema = 'smartform' )
THEN
ALTER TABLE smartform.daily_races_beta
ADD COLUMN race_type_id SMALLINT UNSIGNED DEFAULT NULL;
END IF;
SET SQL_SAFE_UPDATES = 0;
UPDATE daily_races_beta
SET race_type_id = get_race_type_id(race_type, track_type)
WHERE race_type_id IS NULL;
SET SQL_SAFE_UPDATES = 1;
END//
delimiter ;
The most complicated aspect of the above is the hoop you have to jump through in MySQL to discover if a column exists – querying information_schema.columns
. Do we really need to do this every time ? No – but it’s a useful fail-safe in case the database has to be reloaded from bulk download at any stage.
In conclusion, adding new columns to existing Smartform tables is sometimes the most convenient solution to a requirement and can be done safely if you follow a few simple rules.
Error Code: 1305
FUNCTION smartform.percentage does not exist
above error generated when run in SQL editor.
Searched blog and MySql website – no percentage function exists.
Sorry if this appears a stupid question.
Well spotted – I left out the definition of that function (below):
delimiter //
CREATE FUNCTION percentage(tot INT, val INT, min INT, dp INT) RETURNS decimal(4,1)
COMMENT ‘return percentage or null if tot less than min’
DETERMINISTIC
NO SQL
BEGIN
RETURN round(if(tot >= min, (val / tot) * 100, NULL), dp);
END//
delimiter ;
Note that the idea here is to exclude aggregations that are not statistically significant, which in this case I take to me less than 30 applicable races. This ends up being more useful for laying than backing. You may choose to reduce that threshold or even set it to zero if your end objective differs.