Chester May Festival – the draw revisited
By colin on Wednesday, May 12th, 2010
The draw bias at Chester racecourse, particularly over sprint distances, is a favourite example trend, simply because it is so pronounced when analyzed quantitatively.
Blindly backing any horse drawn in stall 1 and stall 2 in sprint races (defined as being over 5 and 6 furlongs) in larger fields (over 10 runners) has produced a consistent profit at starting price (and even more at Betfair SP) over the past few years. Even in smaller fields and over longer distances, the first starting point for analyzing races at Chester should be the draw, though it’s always worth revisiting any assumption, especially when there is new information.
Last week at Chester, only 2 races over sprint distances qualified as larger fields (10+ runners). The runners drawn in stalls 1 and 2 fared as follows:
+---------------------+----------------+-------+------+-------------+--------+
| scheduled_time      | runner         | stall | SP   | num_runners | result |
+---------------------+----------------+-------+------+-------------+--------+
| 2010-05-05 15:15:00 | Look Busy      |     2 | 9.00 |          13 |      4 |
| 2010-05-05 15:15:00 | Royal Intruder |     1 | 8.00 |          13 |      7 |
| 2010-05-06 16:30:00 | Tasmeem        |     1 | 9.00 |          11 |      5 |
| 2010-05-06 16:30:00 | Rule Of Nature |     2 | 3.00 |          11 |      2 |
+---------------------+----------------+-------+------+-------------+--------+
The best result any one of these could manage was second – and that was for the Michael Stoute trained Rule of Nature, which went off at a short price indeed and should have had a lot more going for its chances than the draw alone.
If we extend our survey to 7 furlong races, we find one more qualifying race, which produces a winner, as follows (distances are in yards, so 1540 is 7 furlongs):
+----------+---------------------+-----------------+-------+-------+-------------+--------+
| distance | scheduled_time      | runner          | stall | SP    | num_runners | result |
+----------+---------------------+-----------------+-------+-------+-------------+--------+
|     1100 | 2010-05-05 15:15:00 | Look Busy       |     2 |  9.00 |          13 |      4 |
|     1100 | 2010-05-05 15:15:00 | Royal Intruder  |     1 |  8.00 |          13 |      7 |
|     1320 | 2010-05-06 16:30:00 | Tasmeem         |     1 |  9.00 |          11 |      5 |
|     1320 | 2010-05-06 16:30:00 | Rule Of Nature  |     2 |  3.00 |          11 |      2 |
|     1540 | 2010-05-07 16:30:00 | Lucky Numbers   |     1 |  5.50 |          12 |      3 |
|     1540 | 2010-05-07 16:30:00 | Dance And Dance |     2 | 11.00 |          12 |      1 |
+----------+---------------------+-----------------+-------+-------+-------------+--------+
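As a quick check on what blind backing of stalls 1 and 2 would have returned across these six runners, the figures in the table can be totalled directly in MySQL. This is only a sketch: it assumes the SP column holds decimal odds (stake included) and that one unit is staked on each runner, neither of which is stated in the table itself.

```sql
-- Level-stakes check on the six stall 1 and 2 runners tabulated above.
-- Assumptions: SP is decimal odds (stake included); 1 unit staked per runner.
SELECT distance,
       COUNT(*)                        AS bets,
       SUM(result = 1)                 AS winners,
       SUM(IF(result = 1, SP - 1, -1)) AS profit
FROM (
    SELECT 1100 AS distance, 9.00 AS SP, 4 AS result
    UNION ALL SELECT 1100,  8.00, 7
    UNION ALL SELECT 1320,  9.00, 5
    UNION ALL SELECT 1320,  3.00, 2
    UNION ALL SELECT 1540,  5.50, 3
    UNION ALL SELECT 1540, 11.00, 1
) AS qualifiers
GROUP BY distance
ORDER BY distance;
```

Under those assumptions, each sprint distance shows two losing bets (2 units down apiece), while the 1540-yard race shows 9 units up, which is 5 units up over the six bets.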
Just the one winner from the extra 7 furlong race produces sufficient returns, even at SP, to cover blind faith in the draw advantage alone, but clearly more analysis is needed, even at Chester.
Whilst it is obvious to anyone who has seen the Roodee that the draw gives a significant edge to any runner racing towards the inside rail, other factors besides the draw help runners to get to the inside rail and to secure that advantage. Not least is the ability of a horse to break and lead early. Whilst the effects of the draw are important, the proportion of front runners who win sprint races is equally compelling, as we discuss in our analysis of front runners in this month’s Racing Ahead. Combine a front runner at Chester with any stall position that gives it the ability to cross to the rail early, and you have a powerful combination to give that horse a winning edge, especially if the horses drawn on the inside are less capable front runners.
So what did win the two larger field sprints at Chester last week?
+---------------------+-------------+-------+-------+-------------+--------+
| scheduled_time      | winner      | stall | SP    | num_runners | result |
+---------------------+-------------+-------+-------+-------------+--------+
| 2010-05-05 15:15:00 | Masamah     |     3 | 10.00 |          13 |      1 |
| 2010-05-06 16:30:00 | Horseradish |     6 |  3.75 |          11 |      1 |
+---------------------+-------------+-------+-------+-------------+--------+
Masamah still had an excellent draw in stall 3, a history of running from the front, and indeed ran as follows:
made all, ridden over 1f out, stayed on well final furlong
a running style that fits the hypothesis well.
In the case of Horseradish, he raced on softer ground than normal, and was able to track the leaders and still win, as follows:
tracked leaders, headway to lead over 1f out, ridden and stayed on well final furlong
Clearly relative ability will always enable horses to win races whatever their draw, though being drawn 6 of 11 on softer ground was not a huge disadvantage.
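The two running comments quoted above also hint at how in-running text can be turned into something a query can use. The sketch below is a deliberately crude illustration, not the method Betwise uses: it simply flags a comment that begins with "made all", and the column names horse, in_running and made_all are invented for the example.

```sql
-- Crude front-runner flag from in-running comments (illustrative only).
-- LIKE returns 1 when the comment starts with 'made all', otherwise 0.
SELECT horse,
       in_running LIKE 'made all%' AS made_all
FROM (
    SELECT 'Masamah' AS horse,
           'made all, ridden over 1f out, stayed on well final furlong' AS in_running
    UNION ALL
    SELECT 'Horseradish',
           'tracked leaders, headway to lead over 1f out, ridden and stayed on well final furlong'
) AS winners;
```

Masamah is flagged and Horseradish is not, which matches the reading of the two comments given above. A real analysis would need many more phrases than this single pattern.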
We’ll be using the Smartform database to produce more analysis of this sort over the coming months, combining both in-running styles and draw analysis.
Comparing horses from different sources – the solution
By colin on Friday, May 7th, 2010
In yesterday’s post we discussed the problem of using information from different data sources for research and automated betting, where the name of the horse differs according to the data source.
The most common problems are incorrect capitalization within horse names (eg. Sea the Stars instead of Sea The Stars) and omission, misplacement and other misdemeanors with apostrophes.
Various programming solutions are presented to this problem in Automatic Exchange Betting, but there is an even simpler solution where one of the information sources is the Smartform Racing Database. Firstly, we know the runner names in Smartform are correct, so we can use this as our master source. Secondly, Smartform uses straightforward SQL, which provides for many basic operations on character strings, such as conversion to lower case and pattern matching (so that we have access to search and replace functions).
This means that we can easily convert a horse name in Smartform to an equivalent name which has no capitals, no whitespace and no apostrophes. If we do the same transformation on the name from the target data source, our correct name will match our incorrect name, and we can start to use information from both data sources in our betting strategy. If we want to keep the correct name, we just select the correct name to be displayed but match the information on the transformed names.
If you’re not familiar with these functions in MySQL, you can download a copy and test this sort of functionality out easily without selecting any database, as in the queries below:
# Sea The Stars would normally be a value held in a database column, so the query would not need quotation marks around the name.
select lower("SEA THE STARS");
This will produce the name sea the stars.
Or use the replace function to produce a name without white space (the same approach works for apostrophes):
# The replace function takes three arguments separated by commas: the string to transform, the characters to replace, and the string to replace them with, as in:
select replace("SEA THE STARS", ' ', '');
which produces the name SEATHESTARS
# Put the above functions together within one statement to produce a horse name that can be matched against another without issues:
select lower(replace("SEA THE STARS", ' ', ''));
So at last we get seathestars.
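Apostrophes can be stripped in the same pass by nesting a second replace inside the first. The sketch below uses one of the names from yesterday’s post and its Betfair rendition; it assumes the apostrophe is stored as a plain ASCII character and that MySQL is running in its default mode, where double quotes delimit strings. The aliases smartform_key and betfair_key are made up for the example.

```sql
-- Strip spaces and apostrophes, then lower-case, for both versions of a name.
-- Assumes ASCII apostrophes and MySQL's default mode (double-quoted strings).
select lower(replace(replace("Mandy's Princess", ' ', ''), "'", '')) as smartform_key,
       lower(replace(replace("Mandys Princess",  ' ', ''), "'", '')) as betfair_key;
```

Both columns come out as mandysprincess, so the two spellings can be matched on the transformed value.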
If you’re unfamiliar with SQL, the syntax can take a little getting used to, but on the whole is a gentler introduction than learning a programming language – and allows you to achieve an awful lot when it comes to horseracing analysis.
Performing the same operation on the target horse name in another database table lets us match data between horses using a table join, without leaving the database. Returning to our original example from yesterday, this means, for example, that we could match any form or forecast odds data in Smartform with any market data available in Betfair. Of course, automatically creating an additional database table of Betfair prices does require some programming, though re-usable step by step code is provided in Automatic Exchange Betting for exactly this job.
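A join on the transformed names might look something like the sketch below. The two temporary tables and their name columns are hypothetical stand-ins (they are not real Smartform or Betfair table names), loaded with the four runners from yesterday’s post in their correct and Betfair spellings; plain ASCII apostrophes and MySQL's default quoting mode are assumed.

```sql
-- Hypothetical tables for illustration; table and column names are assumptions.
CREATE TEMPORARY TABLE smartform_names (name VARCHAR(64));
CREATE TEMPORARY TABLE betfair_names   (name VARCHAR(64));

INSERT INTO smartform_names VALUES
    ("Raddy 'ell Pauline"), ("Mioche d'Estruval"),
    ("What's Occurrin"), ("Mandy's Princess");
INSERT INTO betfair_names VALUES
    ("Raddy ell Pauline"), ("Mioche DEstruval"),
    ("Whats Occurrin"), ("Mandys Princess");

-- Match on the transformed names, but display the original spellings.
SELECT s.name AS smartform_name,
       b.name AS betfair_name
FROM smartform_names AS s
JOIN betfair_names   AS b
  ON lower(replace(replace(s.name, ' ', ''), "'", ''))
   = lower(replace(replace(b.name, ' ', ''), "'", ''));
```

Each of the four Smartform names pairs with its Betfair counterpart. In practice the Betfair side would carry prices or market identifiers as extra columns, which the same join would bring along.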
Comparing horses from different sources – the problem
By colin on Thursday, May 6th, 2010
A recurring problem in developing automated betting strategies is accounting for differences in horse names from different data sources, when in fact each source is referring to the same animal.
We discuss the logic behind betting strategies that use different sources in Automatic Exchange Betting. In summary, an automated betting strategy may require various inputs that are only available from multiple data sources, just as a manual betting strategy does. For example, one data source may contain a horse’s form, another may contain current exchange or bookmaker prices for the horse, and another a news feed we want to scan for information on a specific horse before betting.
This problem isn’t limited to programming betting robots. It also applies to basic research, for example collecting and retrieving Betfair market prices for any given horse name, when the horse name you want to fetch prices for does not come from Betfair to start with.
In fact, the Betfair case is the most frequent issue that we deal with in automatic betting. Take a few examples from today’s racing:
Raddy ‘ell Pauline runs in the 4.30 at Chester, Mioche d’Estruval runs in the 5.25 at Newton Abbot, What’s Occurrin runs in the 6.50 at Wetherby, and Mandy’s Princess runs in the 3.55 at Chester. These horses are listed in Betfair as Raddy ell Pauline, Mioche DEstruval, Whats Occurrin, and Mandys Princess, respectively.
Spot the problem? In most cases, Betfair simply misses the apostrophe from horses’ names as a matter of policy. Occasionally there are also capitalization problems, as with the Betfair rendition of Mioche d’Estruval above.
Let’s imagine that these four horses came from an automated selection list produced by Smartform (which lists all the horse names correctly, ie. as they were registered by their owners). We now want our betting robot to use the Betfair API to retrieve prices for each horse, and if those prices meet a certain minimum, we want to bet on each horse.
Unfortunately, if we simply present the correct horse names to our betting program we will be in trouble – the Betfair API won’t recognize them. We’ll get neither the prices we asked for, nor will we be able to bet on these horses – or do anything else with the Betfair API for these runners unless we take some action first.
Fortunately there are a number of simple approaches to resolving this, the easiest of which can be done within Smartform without resorting to using a programming language at all – more on this tomorrow.
Ratings, the 1000 Guineas and Seta
By colin on Sunday, May 2nd, 2010
One of the strengths of the Betwise approach is using performance driven ratings models for a sustained betting edge over the long haul. But on an individual race basis, it is just as useful to know the limitations of conventional models.
The 1000 Guineas today is a classic case (pun realised) in point.
Many of the leading contenders come to the Fillies’ Classic today unraced since their 2 year old careers. Yet recent form is generally important in races where we look to rely on past performance. Of those that are unraced this season, Seta, Pollenator and Hibaayeb are three particularly interesting contenders – particularly interesting because their level of 2 year old form was already high and each of their trainers knows exactly what it takes to get a horse ready to win a classic first time up.
The contenders who have raced this season are all bound to improve at a rapid rate from their debut runs (as befits a 3 year old thoroughbred), and each one of their trainers will have had today’s race in mind to bring them to peak fitness, rather than their trial races on which we tend to rely for evaluating previous performance. Add these factors together, and you have a big puzzle about improvement which is hard to solve.
To make matters worse, the scant form there is to go on comes at varying trips, often short of a mile, on varying ground conditions. By all accounts, the current ground conditions at Newmarket are on the soft side of good, though it will be interesting to listen to the shrewder jockeys after the first race and consult the times – ground conditions may have a big influence.
Given that this is such a big puzzle, it’s perhaps a race to avoid from a betting point of view – we want to have a very good idea of our exact edge when betting, not a very good idea that we are facing a big puzzle. However, it’s impossible to resist trying to solve a big puzzle, even if it is possible to resist betting on the outcome of it.
So, having been a little unfair in suggesting that it was time to throw the form book out of the window, here’s a well considered guess using the data to hand. With so little seen to date, it’s conceivable that any of several contenders could win, but on ratings achieved over known past (and recent) form, the French filly Special Duty and the Mick Channon trained Music Hill come out very well. Seta does not come out well.
Analysing in-running comments
By colin on Saturday, May 1st, 2010
In the May edition of Racing Ahead, Betwise take an in-depth look at analysing in-race comments in order to spot profitable betting angles – using the Smartform Racing Database.
Lots of handicappers will look up previous in-race comments for horses that they are interested in betting on. However, using these comments is not a recognized starting point in form analysis or standardized as a way of comparing form between one horse and another.
Each race is a unique event, after all, so the story of one race is different from the story of another, and the abilities of the horses will vary. Any number of race by race factors will also affect the way a race may be run – such as the race conditions, the going, the draw, pace in the race, how the jockeys decided to ride their mounts, how the trainers and owners instructed each jockey, to name a few. Therefore an argument could be made that comments can’t be compared meaningfully across different events, still less as a means of measuring horses of different abilities.
Leaving aside these concerns, the sheer magnitude of the task should be enough to deter any further manual investigation. A modest sprint handicap of 12 runners where each runner has had an average of 20 previous runs would be 240 comments to examine for one race alone, with no standard model to work towards.
So, in the Racing Ahead article we discuss the results of analysis achieved using the flexibility and power of a programmable computer database which includes full in-race comments for each runner. We examined over 7 years of in-running comments from Smartform for different race types in UK and Irish Flat racing: over 492,000 comments in total, representing over 45,000 individual races and over 48,000 different runners.
Finding winners automatically
By colin on Thursday, April 29th, 2010
Automating the betting process was possible for some time before the emergence of the Betfair API and writing Automatic Exchange Betting, but making the process reliable was a challenge. Betfair’s API created a robust way to programmatically access market data and place bets via the exchange (as opposed to a web scraping approach), but there was still no good way to automate the selection of bets themselves. This required unreliable and/or manual processes to either export data from one of the traditional interactive racing databases, such as Raceform Interactive, or to write robots to scrape the web from online form sources (which was unsatisfactory for various reasons – grey area of site usage, changing page formats, incomplete data, to name a few).
So, to make the selections part of automating the betting process more robust, Betwise created the Smartform database before publication of the book by licensing copious racing data for Members’ personal use back in 2007, designing it for automated updates from original sources, and making it as easy as possible to create and run programs to do just about any aspect of form analysis and output selections for automated betting; no manual ‘data exports’ necessary.
Along with the Betfair API, the service completed the DIY betting automation picture. For sure, programming is not everyone’s cup of tea, but if a bettor has a manual betting process that can be well described, it is a good candidate for automation since any good betting strategy, automated or otherwise, begins with data.
An example I used in a magazine article just before the book was published illustrated just how simple the principles of automated betting can be. We looked at a straightforward case that can be considered at one particular racecourse to show how even analysis of a single variable could be turned into a useful automated strategy for certain types of races. For a more general approach to all races, of course, it makes sense to look at more sophisticated models for predicting performance which use multiple factors.
Draw and Pace: Chester 2.55, Saturday
By colin on Saturday, May 22nd, 2010
Today’s Lambs Navy Rum Handicap at Chester is an inauspicious affair – a class 5 handicap worth £4047 to the winner.
However, it’s of particular interest to us because the race throws up a chance to apply the analysis we’ve been looking at recently with regard to draw and pace in sprint races.
Chester is our favourite example for draw bias, and we’ve done lots of research which shows the continued profitability of backing stalls 1 and 2 blind in larger fields over 5 furlongs. In fact, it’s a trend you can make an automatic profit with over the long run, since it is usually underbet – though of course the prices that make this true can change in future.
What will not change is the natural advantage handed to horses drawn low. As we saw earlier in the month at the May festival, an analysis of front runners can also help determine who will get to the rail early and stay there. Speedy sorts can overcome the natural advantage of stalls 1 and 2 by beating those runners to the rail – as in the case of Masamah who made all to win from stall 3 in a 5 furlong sprint at the May festival – though we won’t bother to look at anything drawn higher than halfway in this 12-runner field, so we discount anything higher than 6.
Betwise use a front runner prediction method derived from in-running comments in the Smartform database, that we will be describing in detail in the June edition of Racing Ahead. Applying this method to the 2.55 today, our top 6 (in order) for those most likely to break early and lead from the front are:
Of those drawn in stalls 1 to 6, only Memphis Man (drawn 2) and Radiator Rooney (drawn 6) do not make the cut as previous front runners that rank as likely to lead, though no doubt their jockeys will (or should) do everything to encourage them. Red Rosanna and Baby Queen, despite being in our top 6 ranked front runners, will be discounted since they are both drawn higher than 6. That leaves us with a shortlist of 4 in the 12-runner field (Not My Choice, Harry Up, Legal Eagle and Sir Geoffrey) before looking at any individual horse’s recent form or ability.
We still think that draw is the most important factor at Chester (meaning we’d be reluctant to go against Not My Choice in stall 1), though the top ranked front running ability of Harry Up may be enough to overcome his poorer draw in stall 5. At this point in the game it’s time for individual choice and weighing up each horse’s potential to win against its price (at prices of 7.6 and 8.8 for the two horses mentioned, you can make a reasonable argument for value already). Whatever the individual bettor’s view, discounting over 60% of the field makes that task much easier.
Tags: Chester, draw advantage, front runners, Harry Up, in race comments, Not My Choice, pace, Smartform