Archive for December, 2022

Adding data to Smartform

Friday, December 23rd, 2022

Using Smartform typically involves querying and aggregating data by various criteria. For example you may wish to look into trainer strike rate by course and race type, in effect regenerating the same data that used to be painstakingly manually compiled by Ken Turrell in his annual publication Trainers4Courses:

Ken no longer produces his book, but the key data can be generated from Smartform through a single SQL query:

SELECT
	sub.course,
	sub.race_type,
	sub.race_type_id,
	sub.trainer_name,
	sub.trainer_id,
	sub.RaceCount,
	sub.WinCount,
	percentage(sub.RaceCount, sub.WinCount, 30, 1) AS WinPct
FROM (
	SELECT
		course,
		ANY_VALUE(race_type) AS race_type,
		race_type_id,
		ANY_VALUE(trainer_name) AS trainer_name,
		trainer_id,
		COUNT(*) AS RaceCount,
		COUNT(IF(finish_position = 1, 1, NULL)) AS WinCount
	FROM
		historic_races_beta
	JOIN
		historic_runners_beta USING (race_id)
	GROUP BY trainer_id, course, race_type_id) AS sub
ORDER BY sub.course, sub.race_type_id, sub.trainer_name;

If you feed the results into a new table, which requires some additional syntactic sugar along the lines of a previous blog post, you can then pull trainer strike rates per course into your custom race cards. Doing so requires a SQL table join between the new table, daily_races_beta and daily_runners_beta, joining on course and race_type_id. But when you try to do that table join, you’ll discover that race_type_id is not available in daily_races_beta (or its forerunner daily_races). The same information is implied by a combination of race_type and track_type. The easiest option is to add this column into the original Smartform table and populate it yourself.

When adding a new column to an existing Smartform table, there is the obvious concern that this may be overwritten by the daily updates, and certainly remain unpopulated by those updates. The answer is to add the column with DEFAULT NULL and then run a script daily after the updater has completed. That script should add the new column if required, and populate any NULL values. Next we look into the details of how to achieve this.

The first thing we should do is to create a stored function that generates a value for race_type_id based on race_type and track_type. Let’s call that function get_race_type_id.

delimiter //

DROP FUNCTION IF EXISTS get_race_type_id//

CREATE FUNCTION get_race_type_id(race_type varchar(30), track_type varchar(30)) RETURNS smallint
        COMMENT 'convert race_type/track_type strings to race_type_id'
        DETERMINISTIC
        NO SQL
BEGIN
        RETURN 
                CASE race_type
                        WHEN 'Flat'                     THEN if (track_type = 'Turf', 12, 15)
                        WHEN 'Hurdle'                   THEN 14
                        WHEN 'Chase'                    THEN 13
                        WHEN 'National Hunt Flat'       THEN 1
                        WHEN 'N_H_Flat'                 THEN 1
                        WHEN 'All Weather Flat'         THEN 15
                END;
END//

delimiter ;

You can see that the function is obviously handling some peculiarities of how the data appears in Smartform. With this function defined we can proceed to define the SQL stored procedure which can be called daily after the Smartform updater – a process that can easily be automated by using a scheduled script according to whichever operating system you’re using:

-- Check if the column 'race_type_id' exists in table 'daily_races_beta'
-- Add the missing column if necessary
-- Populate values based on 'race_type' and 'track_type'

delimiter //

DROP PROCEDURE IF EXISTS fix_race_type_id//

CREATE PROCEDURE fix_race_type_id()

	COMMENT 'Adds race_type_id to daily_races_beta if necessary and populates NULL values'

BEGIN

	IF NOT EXISTS (
		SELECT * FROM information_schema.columns
		WHERE 	column_name	= 'race_type_id'	AND
			table_name	= 'daily_races_beta'	AND
			table_schema	= 'smartform'		)
	THEN
		ALTER TABLE smartform.daily_races_beta
		ADD COLUMN race_type_id SMALLINT UNSIGNED DEFAULT NULL;
	END IF;

	SET SQL_SAFE_UPDATES = 0;

	UPDATE daily_races_beta
		SET race_type_id = get_race_type_id(race_type, track_type)
		WHERE race_type_id IS NULL;

	SET SQL_SAFE_UPDATES = 1;

END//

delimiter ;

The most complicated aspect of the above is the hoop you have to jump through in MySQL to discover if a column exists – querying information_schema.columns. Do we really need to do this every time ? No – but it’s a useful fail-safe in case the database has to be reloaded from bulk download at any stage.

In conclusion, adding new columns to existing Smartform tables is sometimes the most convenient solution to a requirement and can be done safely if you follow a few simple rules.