Posts Tagged ‘SQL’

Comparing horses from different sources – the solution

Friday, May 7th, 2010

In yesterday’s post we discussed the problem of using information from different data sources for research and automated betting, where the name of the horse differs according to the data source.

The most common problems are incorrect capitalization within horse names (eg.  Sea the Stars instead of Sea The Stars) and omission, misplacement and other misdemeanors with apostrophes.

Various programming solutions are presented to this problem in Automatic Exchange Betting, but there is an even simpler solution where one of the information sources is the Smartform Racing Database.  Firstly, we know the runner names in Smartform are correct, so we can use this as our master source.   Secondly, Smartform uses straightforward SQL, which provides for many basic operations on character strings, such as conversion to lower case and pattern matching (so that we have access to search and replace functions).

This means that we can easily convert a horse name in Smartform to an equivalent name which has no capitals, no whitespace and no apostrophes.  If we do the same transformation on the name from the target data source, our correct name will match our incorrect name, and we can start to use information from both data sources in our betting strategy.  If we want to keep the correct name, we just select the correct name to be displayed but match the information on the transformed names.

If you’re not familiar with these functions in MySQL, you can download a copy and test this sort of functionality out easily without selecting any database, as in the queries below:

#Sea The Stars would normally be a variable in the database, so the query would not need quotations around the name.
>select lower("SEA THE STARS");

This will produce the name sea the stars.

Or use the replace functions to produce a name without white space (which also applies to apostrophes):

#The replace function takes three arguments separated by commas - the string to transform, the elements to replace, and the string to replace it with, as in:
>select replace("SEA THE STARS", ' ', '');

which produces the name SEATHESTARS

#Put the above functions together within one statement to produce a horse name that can be matched against another without issues:
>select replace("SEA THE STARS", ' ', '');

So at last we get seathestars.

If you’re unfamiliar with SQL, the syntax can take a little getting used to, but on the whole is a gentler introduction than learning a programming language – and allows you to achieve an awful lot when it comes to horseracing analysis.

Performing the same operation on the target horse name in another database table let’s us match data up between horses using a table join without leaving the database.  Returning to our original example from yesterday, this means, for example, we could match any form or forecast odds data in Smartform with any market data available in Betfair.  Of course, automatically creating an additional database table of Betfair prices does some programming, though re-usable step by step code is provided in Automatic Exchange Betting for exactly this job.

Analysing in-running comments

Saturday, May 1st, 2010

In the May edition of  Racing Ahead, Betwise take an in-depth look at analysing in- race comments in order to spot profitable betting angles – using the Smartform Racing Database.

Lots of handicappers will look up previous in-race comments for horses that they are interested in betting on.  However, using these comments is not a recognized starting point in form analysis or standardized as a way of comparing form between one horse and another.

Each race is a unique event, after all, so the story of one race is different from the story of another, and the abilities of the horses will vary.  Any number of race by race factors will also affect the way a race may be run – such as the race conditions, the going, the draw, pace in the race, how the jockeys decided to ride their mounts, how the trainers and owners instructed each jockey, to name a few.  Therefore an argument could be made that comments can’t be compared meaningfully across different events, still less as a means of measuring horses of different abilities.

Leaving aside these concerns, the sheer magnitude of the task should be enough to deter any further manual investigation.  A modest sprint handicap of 12 runners where each runner has had an average of 20 previous runs would be 240 comments to examine for one race alone, with no standard model to work towards.

So, in the Racing Ahead article we  discuss the results of analysis achieved using the flexibility and power of a programmable computer database which includes full in-race comments for each runner.  In total, we examined over 7 years’ of  in-running comments from Smartform for different race types in UK and Irish Flat racing – over 492,000 comments in total, representing over 45,000 individual races, for over 48,000 different runners.