Betwise Blog
Betwise news, analysis and automatic betting info

Comparing horses from different sources – the solution

By colin on Friday, May 7th, 2010

In yesterday’s post we discussed the problem of using information from different data sources for research and automated betting, where the name of the horse differs according to the data source.

The most common problems are incorrect capitalization within horse names (e.g. Sea the Stars instead of Sea The Stars) and omission, misplacement and other misdemeanors with apostrophes.

Various programming solutions to this problem are presented in Automatic Exchange Betting, but there is an even simpler solution where one of the information sources is the Smartform Racing Database.  Firstly, we know the runner names in Smartform are correct, so we can use this as our master source.  Secondly, Smartform uses straightforward SQL, which provides many basic operations on character strings, such as conversion to lower case and pattern matching (so that we have access to search and replace functions).

This means that we can easily convert a horse name in Smartform to an equivalent name which has no capitals, no whitespace and no apostrophes.  If we do the same transformation on the name from the target data source, our correct name will match our incorrect name, and we can start to use information from both data sources in our betting strategy.  If we want to keep the correct name, we just select the correct name to be displayed but match the information on the transformed names.

If you’re not familiar with these functions in MySQL, you can download a copy and test this sort of functionality out easily without selecting any database, as in the queries below:

# The name would normally come from a column in the database, so the query would not need quotation marks around it.
select lower("SEA THE STARS");

This will produce the name sea the stars.

Or use the replace function to produce a name without white space (the same approach also removes apostrophes):

# The replace function takes three arguments separated by commas: the string to transform, the substring to find, and the string to replace it with, as in:
select replace("SEA THE STARS", ' ', '');

which produces the name SEATHESTARS

# Put the above functions together within one statement to produce a horse name that can be matched against another without issues:
select lower(replace("SEA THE STARS", ' ', ''));

So at last we get seathestars.
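Apostrophes can be handled the same way by nesting a second replace. The query below is a minimal sketch rather than anything taken from Smartform itself: it uses two spellings of one horse from yesterday's post, and it assumes the apostrophe is stored as a plain ' character and that MySQL is running in its default mode, where double quotes delimit strings.

```sql
# Strip spaces, then strip apostrophes, then convert to lower case.
# The first literal is the registered spelling, the second is the Betfair spelling.
select lower(replace(replace("Mioche d'Estruval", ' ', ''), "'", '')) as smartform_key,
       lower(replace(replace("Mioche DEstruval", ' ', ''), "'", '')) as betfair_key;
```

Both columns come back as miochedestruval, so the two spellings can be matched on the transformed value.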

If you’re unfamiliar with SQL, the syntax can take a little getting used to, but on the whole is a gentler introduction than learning a programming language – and allows you to achieve an awful lot when it comes to horseracing analysis.

Performing the same operation on the target horse name in another database table lets us match data up between horses using a table join, without leaving the database.  Returning to our original example from yesterday, this means we could match any form or forecast odds data in Smartform with any market data available in Betfair.  Of course, automatically creating an additional database table of Betfair prices does require some programming, though re-usable step-by-step code is provided in Automatic Exchange Betting for exactly this job.
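As a rough sketch of such a join, the statements below build two small temporary tables and match them on the transformed name. The table names (sf_runners, bf_prices) and column names are hypothetical stand-ins, not the real Smartform schema or the tables built in the book. The prices are invented placeholders, while the horse names, courses and race times are the examples from yesterday's post.

```sql
# Hypothetical stand-in for a Smartform runners table (names spelt correctly).
create temporary table sf_runners (name varchar(64), course varchar(32), off_time varchar(8));
# Hypothetical stand-in for a table of prices collected from Betfair.
create temporary table bf_prices (selection_name varchar(64), back_price decimal(6,2));

insert into sf_runners values
  ("Raddy 'ell Pauline", 'Chester', '4.30'),
  ("Mioche d'Estruval", 'Newton Abbot', '5.25'),
  ("What's Occurrin", 'Wetherby', '6.50'),
  ("Mandy's Princess", 'Chester', '3.55');

# Placeholder prices for illustration only, not real market data.
insert into bf_prices values
  ('Raddy ell Pauline', 5.00),
  ('Mioche DEstruval', 6.00),
  ('Whats Occurrin', 7.00),
  ('Mandys Princess', 8.00);

# Display the correct Smartform name, but join on the transformed names.
select sf.name, sf.course, sf.off_time, bf.back_price
from sf_runners sf
join bf_prices bf
  on lower(replace(replace(sf.name, ' ', ''), "'", ''))
   = lower(replace(replace(bf.selection_name, ' ', ''), "'", ''));
```

All four runners pair up with their Betfair rows. A join on a computed expression like this cannot use an ordinary index on the name column, so on large tables it may be worth storing the transformed name in its own column and joining on that.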


Comparing horses from different sources – the problem

By colin on Thursday, May 6th, 2010

A recurring problem in developing automated betting strategies is accounting for differences in horse names from different data sources, when in fact each source is referring to the same animal.

We discuss the logic behind betting strategies that use different sources in Automatic Exchange Betting.  In summary, an automated betting strategy may require various inputs that are only available from multiple data sources, just as a manual betting strategy does.  For example, one data source may contain a horse’s form, another may contain current exchange or bookmaker prices for the horse, and another a news feed we want to scan for information on a specific horse before betting.

This problem isn’t limited to programming betting robots; it also applies to basic research, for example collecting and retrieving Betfair market prices for any given horse name, when the horse name you want to fetch prices for does not come from Betfair to start with.

In fact, the Betfair case is the most frequent issue that we deal with in automatic betting.    Take a few examples from today’s racing:

Raddy ‘ell Pauline runs in the 4.30 at Chester, Mioche d’Estruval runs in the 5.25 at Newton Abbot, What’s Occurrin runs in the 6.50 at Wetherby, and Mandy’s Princess runs in the 3.55 at Chester.   These horses are listed in Betfair as Raddy ell Pauline, Mioche DEstruval, Whats Occurrin, and Mandys Princess, respectively.

Spot the problem?   In most cases, Betfair simply omits the apostrophe from horses’ names as a matter of policy.   Occasionally there are also capitalization problems, as with the Betfair rendition of Mioche d’Estruval above.

Let’s imagine that these four horses came from an automated selection list produced by Smartform (which lists all the horse names correctly, ie. as they were registered by their owners).   We now want our betting robot to use the Betfair API to retrieve prices for each horse, and if those prices meet a certain minimum, we want to bet on each horse.

Unfortunately, if we simply present the correct horse names to our betting program we will be in trouble – the Betfair API won’t recognize them.   We won’t get the prices we asked for, and we won’t be able to bet on these horses, or do anything else with the Betfair API for these runners, unless we take some action first.
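The mismatch is easy to reproduce with a plain string comparison. The query below is only an illustration in MySQL, not a call to the Betfair API, and it assumes MySQL's default mode, where double quotes delimit strings.

```sql
# A comparison returns 1 when the strings are equal and 0 when they are not.
# The registered name and the Betfair rendition differ by one apostrophe.
select "What's Occurrin" = "Whats Occurrin" as exact_match;
```

The result is 0: the two names are treated as different runners. MySQL's default collations ignore case, so a capitalization difference on its own may still compare as equal inside the database, but a case-sensitive comparison in a betting program would not be so forgiving.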

Fortunately there are a number of simple approaches to resolving this, the easiest of which can be done within Smartform without resorting to using a programming language at all – more on this tomorrow.


Ratings, the 1000 Guineas and Seta

By colin on Sunday, May 2nd, 2010

One of the strengths of the Betwise approach is using performance driven ratings models for a sustained betting edge over the long haul.  But on an individual race basis, it is just as useful to know the limitations of conventional models.

The 1000 Guineas today is a classic case (pun realised) in point.

Many of the leading contenders come to the Fillies’ Classic today unraced since their 2 year old careers.  Yet recent form is generally  important in races where we look to rely on past performance.  Of those that are unraced this season, Seta, Pollenator and Hibaayeb are three particularly interesting contenders – particularly interesting because their level of 2 year old form was already high and each of their trainers knows exactly what it takes to get a horse ready to win a classic first time up.

The contenders who have raced this season are all bound to improve at a rapid rate from their debut runs (as befits a 3 year old thoroughbred), and each one of their trainers will have had today’s race in mind to bring them to peak fitness, rather than their trial races on which we tend to rely for evaluating previous performance.  Add these factors together, and you have a big puzzle about improvement which is hard to solve.

To make matters worse, the scant form there is to go on comes at varying trips, often short of a mile, on varying ground conditions.  By all accounts, the current ground conditions at Newmarket are on the soft side of good, though it will be interesting to listen to the shrewder jockeys after the first race and consult the times – ground conditions may have a big influence.

Given that this is such a big puzzle, it’s perhaps a race to avoid from a betting point of view – we want to have a very good idea of our exact edge when betting, not a very good idea that we are facing a big puzzle.  However, it’s impossible to resist trying to solve a big puzzle, even if it is possible to resist betting on the outcome of it.

So, having been a little unfair in suggesting that it was time to throw the form book out of the window, here’s a well considered guess using the data to hand.  Having seen so little to date, it’s conceivable that many of the contenders could win, but on ratings achieved over known past (and recent) form, the French filly Special Duty and the Mick Channon trained Music Hill come out very well.  Seta does not come out well. Read more…


Analysing in-running comments

By colin on Saturday, May 1st, 2010

In the May edition of Racing Ahead, Betwise take an in-depth look at analysing in-race comments in order to spot profitable betting angles – using the Smartform Racing Database.

Lots of handicappers will look up previous in-race comments for horses that they are interested in betting on.  However, using these comments is not a recognized starting point in form analysis or standardized as a way of comparing form between one horse and another.

Each race is a unique event, after all, so the story of one race is different from the story of another, and the abilities of the horses will vary.  Any number of race by race factors will also affect the way a race may be run – such as the race conditions, the going, the draw, pace in the race, how the jockeys decided to ride their mounts, how the trainers and owners instructed each jockey, to name a few.  Therefore an argument could be made that comments can’t be compared meaningfully across different events, still less as a means of measuring horses of different abilities.

Leaving aside these concerns, the sheer magnitude of the task should be enough to deter any further manual investigation.  A modest sprint handicap of 12 runners where each runner has had an average of 20 previous runs would be 240 comments to examine for one race alone, with no standard model to work towards.

So, in the Racing Ahead article we discuss the results of analysis achieved using the flexibility and power of a programmable computer database which includes full in-race comments for each runner.  We examined over 7 years of in-running comments from Smartform for different race types in UK and Irish Flat racing – over 492,000 comments in total, representing over 45,000 individual races, for over 48,000 different runners.

Read more…


Finding winners automatically

By colin on Thursday, April 29th, 2010

Automating the betting process was possible for some time before the emergence of the Betfair API and writing Automatic Exchange Betting, but making the process reliable was a challenge.  Betfair’s API created a robust way to programmatically access market data and place bets via the exchange (as opposed to a web scraping approach), but there was still no good way to automate the selection of bets themselves.  This required unreliable and/or manual processes to either export data from one of the traditional interactive racing databases, such as Raceform Interactive, or to write robots to scrape the web from online form sources (which was unsatisfactory for various reasons – grey area of site usage, changing page formats, incomplete data, to name a few).

So, to make the selections part of automating the betting process more robust, Betwise created the Smartform database before publication of the book by licensing copious racing data for Members’ personal use back in 2007, designing it for automated updates from original sources, and making it as easy as possible to create and run programs to do just about any aspect of form analysis and output selections for automated betting; no manual ‘data exports’ necessary.

Along with the Betfair API, the service completed the DIY betting automation picture.   For sure, programming is not everyone’s cup of tea, but if a bettor has a manual betting process that can be well described, it is a good candidate for automation since any good betting strategy, automated or otherwise, begins with data.

An example I used in a magazine article just before the book was published illustrated just how simple the principles of automated betting can be.  We looked at a straightforward case that can be considered at one particular racecourse to show how even analysis of a single variable could be turned into a useful automated strategy for certain types of races.  For a more general approach to all races, of course, it makes sense to look at more sophisticated models for predicting performance which use multiple factors.

Read more…
