Posts Tagged ‘Smartform’

Draw analysis at Sandown today

Saturday, July 3rd, 2010

As per last Saturday’s post we shift our attention in July to assessing the effect of the draw in upcoming races.  To do this we are using an automated model created with Smartform, applying the principles laid out in the Betwise article in this month’s Racing Ahead.

Focus today is on the first race at Sandown, which requires analysis of the 5 furlong straight course in the middle of the track.  As with front runners, draw bias tends to show up in results most strongly at races under a mile.  Although there are notable exceptions to this rule (as with the case of the Ebor draw bias over 1 mile 6 furlongs at York), it usually pays to concentrate on sprint distances.

Sandown exposes a weakness in some draw analysis you can find, which is to concentrate on strike rate per stall, without factoring in stall positions.  You can read a lot more about this in our article, but sticking with the case of Sandown, as any regular Sandown racegoer knows, the stalls are usually positioned with the highest number against the far rail and represent a position – though not necessarily a stall! – that has held a continuing advantage over the years.  The actual stall number drawn against the far rail varies according to the size of the field, making strike rate per stall statistics more or less redundant.  The way to overcome this is to use historic data to map the advantage of the position on the track which each stall occupied (though even here, there are always problems presented by rail movements).

Our Smartform model above maps the previous advantage of each course position onto all today’s stall numbers, as if you are looking overhead at the race about to start – stall 7 is drawn against the far rail.  The height of the bar represents stall advantage, with anything over 1 indicating higher than expected winners and under 1 a negative expectation.  (Stall 5 should be empty today due to the withdrawal of Wi Dud – draw 4 may be shifted one up as a result).
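The idea of rating positions rather than stall numbers can be sketched in a few lines of Python. This is only an illustration, not the Smartform model itself: the past results are invented, the function names are hypothetical, and it assumes the highest stall is always against the far rail, with no rail movements or empty stalls.

```python
from collections import defaultdict

# Hypothetical past results as (field_size, winning_stall) pairs.
# These figures are invented purely for illustration.
past_races = [(7, 7), (9, 9), (8, 5), (7, 4), (10, 10), (9, 6), (8, 8), (7, 1)]


def position_from_rail(stall, field_size):
    """0 = against the far rail (highest stall), 1 = next to it, and so on."""
    return field_size - stall


def advantage_by_position(races):
    """Ratio of actual to expected winners for each position from the rail."""
    actual = defaultdict(float)
    expected = defaultdict(float)
    for field_size, winning_stall in races:
        # With no draw bias, each runner has a 1/field_size chance of winning.
        for stall in range(1, field_size + 1):
            expected[position_from_rail(stall, field_size)] += 1.0 / field_size
        actual[position_from_rail(winning_stall, field_size)] += 1.0
    return {pos: actual[pos] / expected[pos] for pos in expected}


def map_to_todays_stalls(advantage, field_size):
    """Project the position-based ratios onto today's stall numbers."""
    return {
        stall: advantage.get(position_from_rail(stall, field_size))
        for stall in range(1, field_size + 1)
    }


advantage = advantage_by_position(past_races)
todays_stalls = map_to_todays_stalls(advantage, field_size=7)
for stall, ratio in sorted(todays_stalls.items()):
    print(f"stall {stall}: {ratio:.2f}")
```

A ratio over 1 means the position produced more winners than chance alone would suggest, and under 1 fewer. With a real history in place of the invented one, the same mapping works for any field size.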

What conclusions to draw from this? Though we can see immediately that the draw advantage in small fields does not exclude the possibility of any runner winning, there is a distinct negative from being drawn in stalls 1, 2 and 3.  Combined with front runner analysis, we can also see which, if any, of these contenders are able to break early to secure a better position (front runner analysis for this race is posted in the Betwise Members’ area and freely available).  Going back to the commonly held belief that the highest stall is the place to be at Sandown, we can see that whilst this is borne out by our Smartform model, it is very marginal for small fields, and there is, for example, almost as big an advantage being drawn in stall 4 today.  So is Bould Mover, in stall 4, better value than Triple Aspect, the hot favourite drawn on the far rail in stall 7?  Does Reignier, in stall 3, represent value at 14/1, given that stall 3 still produces a reasonable number of winners in small fields?  Of course, answering these questions relies on far more than knowing the runners’ stall position – we need to know more about the ability of the horses concerned, but at least we can now put the effect of the draw in its place.

From front runners to the draw – Chester races today

Saturday, June 26th, 2010

Today is the last of our series of Saturday posts looking at probable front runners using the Smartform model that we outlined in this month’s edition of Racing Ahead.  Not to say that we won’t be looking at front runners again, but not in every post 😉  Selected front runner ratings will also start appearing in the Betwise Members’ area (for free) from next Saturday – this analysis has predicted a few good winners as well as front runners in each of the past 5 Saturdays.

Our upcoming July article in Racing Ahead looks at using Smartform to calculate the effects of the draw at every track.  In fact, as we have discussed before, an advantageous draw combined with a front running style is a powerful winning combination, nowhere more so than at Chester, one of today’s meetings.

Not that we need a sophisticated model to assess the effects of the draw at Chester – low is best, period.  In sprint races with larger fields, the advantage is even more pronounced, since many of the runners are forced to race wide.

At Chester today there are 3 races under a mile, 2 sprints at 5 furlongs and one at the awkward “sprint” distance of 7 furlongs.  The first two races have smaller field sizes, so it’s possible that the plum running positions on the inside rail may be more easily occupied by horses that are not drawn lowest.  Where there are smaller field sizes, it always raises an interesting question of how much importance we should attach to the draw versus running style in terms of predicting leaders.  The last race is 10 runners over the minimum distance – typically horses drawn 1 and 2 (today this is Ryan Style and Hoh Hoh Hoh) in such races can carry all before them – even though these do not come in our top 3 predicted front runners judged on pure running style alone.  For all today’s races, you can be the judge – below are the front runner ratings coupled with the draw position of every horse for each of the sprint races at Chester:

RACE: 2010-06-26, Chester, 14:05:00, 1116 Yds

HORSE, % LEADING CHANCE, DRAW, SMARTFORM FORECAST SP:
Lord Avon, 0.31, 5, 7/2
Coconut Ice, 0.21, 7, 7/1
Fred Willetts, 0.21, 4, 10/1
Scarlet Rocks, 0.14, 9, 11/4
Triple Agent, 0.14, 6, 25/1
Lexi’s Hero, 0.00, 3, 7/1
Leiba Leiba, 0.00, 2, 8/1
The Thrill Is Gone, 0.00, 1, 5/1

RACE: 2010-06-26, Chester, 15:45:00, 1542 Yds

HORSE, % LEADING CHANCE, DRAW, SMARTFORM FORECAST SP:
Rule Breaker, 0.41, 1, 7/2
Cansili Star, 0.23, 2, 11/8
Below Zero, 0.17, 7, 12/1
William Morgan, 0.15, 4, 10/1
Tiradito, 0.04, 6, 14/1
Layla’s Hero, 0.00, 5, 6/1
Hunting Tartan, 0.00, 3, 7/1

RACE: 2010-06-26, Chester, 16:20:00, 1116 Yds

HORSE, % LEADING CHANCE, DRAW, SMARTFORM FORECAST SP:
Falasteen, 0.32, 7, 8/1
Bertoliver, 0.23, 4, 5/2
Lost In Paris, 0.17, 9, 4/1
Hoh Hoh Hoh, 0.11, 2, 16/1
Ryan Style, 0.08, 1, 6/1
Grissom, 0.04, 10, 14/1
Lucky Dan, 0.03, 8, 6/1
Memphis Man, 0.01, 5, 10/1
Green Park, 0.01, 3, 14/1
Dancing Red Devil, 0.00, 6, 25/1

Making the running in the Golden Jubilee and the Wokingham

Saturday, June 19th, 2010

Today’s big sprint races at Ascot over 6 furlongs present an interesting challenge for our Smartform front runner ratings.

The first thing to notice about both races is the enormous field size, with 24 runners in the Golden Jubilee due off at 3.50, and 27 runners in the Wokingham, due off at 4.25.  When there are so many runners lined up across the track, not only can sprint races look like a cavalry charge, but the chance of successfully predicting the relative behaviour of any one contender is of course lower.  Fortunately, the prices on offer are that much bigger, too.

Despite the highly competitive nature of the race, the Golden Jubilee throws up quite some discrepancy between front running attributes.  In such a big field, we would expect the variation between the top ranked contenders to be marginal, but this is not the case.  Here are the rankings for the top 3 – showing the relative percentage chance of each leading in the first half of the race, followed by its draw.

Sayif, 0.30, 6
War Artist, 0.16, 23
Showcasing, 0.11, 7

So, Sayif is almost twice as likely to lead as the nearest contender.  However, it’s not so straightforward with these rankings, since both Sayif and Showcasing are also horses that exhibit some lagging tendencies in some of their previous races, earning them a high score on both fronts (the lagging percentages are not shown here).  War Artist does not score any lagging points, but falls some way behind the raw score of Sayif.

With Sayif drawn towards the stands rail, and War Artist drawn on the far rail, they would be the two picks, with the stands rail pick slightly favoured, also at a bigger price of 48.0 on Betfair at the time of writing.  With a good chance that Sayif will race prominently (and even if he does not, he is a quality colt with a winning chance), a price of 48.0 in a liquid market presents definite back to lay possibilities.
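For readers new to back to lay trades, the arithmetic can be sketched as follows. The back price of 48.0 is the one quoted above; the in-running lay price of 24.0 and the stake of 10 are assumptions chosen only to make the sums easy, and exchange commission is ignored.

```python
def lay_stake_for_equal_profit(back_stake, back_odds, lay_odds):
    """Lay stake that gives the same profit whether or not the horse wins."""
    return back_stake * back_odds / lay_odds


def back_to_lay_outcomes(back_stake, back_odds, lay_odds):
    """Return (lay_stake, profit_if_wins, profit_if_loses), before commission."""
    lay_stake = lay_stake_for_equal_profit(back_stake, back_odds, lay_odds)
    # Horse wins: the back bet pays out, the lay bet costs its liability.
    profit_if_wins = back_stake * (back_odds - 1) - lay_stake * (lay_odds - 1)
    # Horse loses: the back stake is lost, the lay stake is kept.
    profit_if_loses = lay_stake - back_stake
    return lay_stake, profit_if_wins, profit_if_loses


# Back 10 units at 48.0 before the off, then lay at an assumed 24.0 in running.
lay_stake, if_wins, if_loses = back_to_lay_outcomes(10.0, 48.0, 24.0)
print(lay_stake, if_wins, if_loses)  # 20.0 10.0 10.0
```

If the price never shortens in running, the lay is never matched and the back stake is simply at risk as a normal win bet.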

On to the Wokingham at 4.25, there is less disparity between the leading contenders in terms of ratings – here is how they fall:

Masamah, 0.10, 7
Edge Closer, 0.08, 1
Evens And Odds, 0.07, 16

Masamah is passed over in terms of converting an early lead to a winning advantage, since it has shown most promise to date over 5 furlongs.  However, we think it is likely to race prominently and will show well towards the stands rail for the first few furlongs, so may show some odds reduction in running from its current price of 85.0.  The other two contenders are preferred in terms of horses that may race prominently and convert that edge into a winning chance.  Edge Closer’s chance is probably reflected in its odds of 23.0 for the time being, but Evens And Odds, ridden by William Buick, who guided last week’s front running pick, Burning Thread, to win from the front in the big sprint at Sandown, is an interesting contender who may race prominently at a big price, currently at 44.0 on Betfair.

Who will make the running in the big sprint at Sandown?

Saturday, June 12th, 2010

Today’s front runner analysis using our Smartform model focuses on the richest race on the card at Sandown – the listed Scurry Stakes over 5 furlongs, due off at 3.30 and worth over 22K to the winner alone.

Without further ado, here are the results of today’s leader analysis for this race (percentage ranked chance of leading early, followed by stall positions):

Burning Thread, 0.27, 9
Above Limits, 0.21, 2
Red Avalanche, 0.16, 6
Duchess Dora, 0.12, 3
Reignier, 0.12, 1
Lady Brickhouse, 0.06, 5
Tawaabb, 0.06, 8
Duplicity, 0.00, 7
Diamond Johnny G, 0.00, 4

The usual caveats apply to the raw numbers – there is no measure of ability, suitability to conditions, or any individual measure of form (other than analysis of the previous running style of each contender) used in the production of the ratings.

Another caveat is that today’s race includes runners with little historic form, being limited to 3 year old competitors.  This last observation applies especially to our top ranked leading contender  – Burning Thread.  He’s had only 3 runs in total and was slowly away on the first of them, meaning he also scores as a potential lagger.  However, we’re prepared to forgive his debut run last year, since his last 2 outings show him to be a useful, pacey sort.

As we look at the next factor of interest from a pace perspective – the draw – another positive for Burning Thread emerges, in that he is drawn in stall 9 (of 9).  Traditionally the draw at Sandown on the 5 furlong track in the middle of the course favours those drawn against the far running rail.  Relying on a so-called “known” draw bias can be suspect (unless the reason for such bias is hard to counter, as in our favoured example of the inside rail at Chester), especially as clerks of the course may seek to mitigate such advantage on straight courses through watering policies and the like.  In such cases it pays to look at recent evidence in the draw, something we’ll be focusing on as a topic in its own right over the coming months.  For the case in hand at Sandown today, we will assume that the rail draw is no negative, and may well provide extra assistance, despite the field size being relatively small.

Let’s say Burning Thread takes a prominent racing position from the rail draw – is he good enough to win?  That is more doubtful.  His ability ratings are not the best in the field, and in this class he may face stiff competition in the closing stages.  If the favourite, Duchess Dora, is close up (as befits her running style), he will be in danger.  There may be a back to lay opportunity, as his price currently stands at 5.3 on Betfair (as of the time of writing, at 9 am), which should be shorter if he is a genuine contender within the final furlong.  At a much bigger price, 12.0, Red Avalanche is also interesting in stall 6 – but we have to take on trust his comparative ability as a 3 yr old since he has been off the track for 245 days.  Whilst he also raced keenly as a juvenile, we cannot really predict what his running style may be without more recent evidence.  All in all, this is a tricky affair with so many unknowns to factor – but, of course, that’s one of the things that makes racing fun.

Derby Day Front Runners

Saturday, June 5th, 2010

With the most interesting race of the day, the Epsom Derby, set to go off at 4 pm, we turn our attention to front runner analysis in a couple of the earlier sprints being run before the big race.  This continues June’s Saturday analysis theme using our Smartform model as outlined in this month’s Racing Ahead article and previous posts.

First up, the Epsom 2.10.  Not much evidence to go on, since this is a listed race for 2 year olds, most of whom have raced only once or twice.  As a race type it ranks amongst the most unreliable for the ratings.  Two reasons for this – first, a higher class, non-handicap race means a potentially wide ability gulf between the runners, rendering our predictions based on previous races redundant.  Second, each runner has had little chance to establish a real profile.  However, from the evidence we have seen, there may be an angle in the ratings.  Here are the top two (percentage leader prediction, followed by draw):

Dubawi Gold, 0.24, 8
Where’s Romeo, 0.24, 2

A tie for top ranked front runner does not look too promising, but the third rated, Singapore Lilly, rates only 12% likely to lead and likelier to start slowly, leaving Premier Clarets rated fourth, also at 12%.  So as a starting point, we can say there is close to a 50% chance that either Dubawi Gold or Where’s Romeo will break and lead early.  Of the two, Where’s Romeo has raced twice over the minimum trip and led, whereas Dubawi Gold has raced once over 6 furlongs and led – so, Where’s Romeo’s early speed may be stronger.  Also, Where’s Romeo is drawn towards the inside rail, which is generally an advantage over 6 furlongs at Epsom.  Last but not least, there is better price margin in Where’s Romeo’s price for an in-running play.  Currently available at 8.2 on Betfair, we’d expect it to trade much lower if he has been leading after the first few furlongs.  Too many form and ability unknowns to try and call the winner, however.

Next up, the top two from the Musselburgh 2.35.  All the caveats on race type mentioned above apply, since it is also a 2 year old race with little previous form to go on.

Excel Bolt, 0.43, 4
Misty Morn, 0.22, 7

The ratings speak for themselves on this one: we should not look outside these two in order to try and predict the front runner.  However, Misty Morn has a rag’s chance and is rated just as likely to start slowly.  Excel Bolt has one run to his name and is also the favourite for the race.  Not much margin for an in-running play, with the price already at 2.12 on Betfair.  However, if Excel Bolt breaks from the front over 5 furlongs at Musselburgh, he will take all the beating.

Last but not least, we should say a word about the most valuable sprint of the day, the so-called Dash or Epsom 3.15.  Here are the top two:

Le Toreador, 0.11, 3
Glamorous Spirit, 0.11, 1

We’ve left this to last, because these ratings are not the strongest. Basically the field is full of high class sprinters, most of whom are capable of breaking well. However, there are some nice prices on our top two, so whilst it could not be a strong fancy, Glamorous Spirit also has a hitherto spotless lagger record, and is therefore worthy of further consideration at 38.0 at the time of writing.

Front Runner analysis: 4.15 Beverley today

Saturday, May 29th, 2010

As promised in this month’s Racing Ahead article, Betwise are previewing different races here every week to shortlist each contender’s probability of being the front runner in the race.

The first question a good cynic should ask is: What’s the point of trying to predict front runners?  Here are a few reasons:

  1. In races under a mile, any runner that is prominent early has a c. 30% chance of winning the race.
  2. It is often possible to back probable front runners before the race and lay them off in running at a profit.
  3. Predicting the likeliest front runners is key to pace handicapping, and working out draw advantage.

All our front runner rankings are produced automatically, using the Smartform Database, looking into every runner’s previous history and assessing their running styles against each other for the race in question.  We’ve found that the method has the best record of success in smaller fields over sprint distances – so we’re picking out the 4.15 at Beverley as today’s race.

Below is the list of likely slow starters (LAGGERS), followed by the list of faster starters (LEADERS) that we would expect to front run today.  The first figure after the horse’s name shows the chance that we think that runner has of leading (or being a lagger) today; the second figure indicates its stall position today.  So, for the lists below, we think that Fullandby and Fitz Flyer each have an approx. 30% chance of starting slowest today, and we think it is 46% likely that Masta Plasta will be prominent and/or lead from the start.


LAGGERS:
Fullandby, 0.31, 4
Fitz Flyer, 0.27, 3
Look Busy, 0.20, 1
Kaldoun Kingdom, 0.17, 6
Tombi, 0.06, 2
Masta Plasta, 0.00, 5
LEADERS:
Masta Plasta, 0.46, 5
Tombi, 0.28, 2
Look Busy, 0.21, 1
Fullandby, 0.05, 4
Kaldoun Kingdom, 0.00, 6
Fitz Flyer, 0.00, 3
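Since some horses score on both lists, it can help to read the two figures side by side. The sketch below simply re-arranges the numbers listed above into one line per horse; the variable names are our own and nothing here is part of how the ratings are produced.

```python
# Figures copied from the LAGGERS and LEADERS lists above.
laggers = {
    "Fullandby": 0.31, "Fitz Flyer": 0.27, "Look Busy": 0.20,
    "Kaldoun Kingdom": 0.17, "Tombi": 0.06, "Masta Plasta": 0.00,
}
leaders = {
    "Masta Plasta": 0.46, "Tombi": 0.28, "Look Busy": 0.21,
    "Fullandby": 0.05, "Kaldoun Kingdom": 0.00, "Fitz Flyer": 0.00,
}
draw = {
    "Fullandby": 4, "Fitz Flyer": 3, "Look Busy": 1,
    "Kaldoun Kingdom": 6, "Tombi": 2, "Masta Plasta": 5,
}

# One row per horse, ordered by chance of leading (highest first).
ranking = sorted(leaders, key=leaders.get, reverse=True)
print("HORSE, LEADER, LAGGER, DRAW:")
for horse in ranking:
    print(f"{horse}, {leaders[horse]:.2f}, {laggers[horse]:.2f}, {draw[horse]}")
```

Read this way, Look Busy stands out as scoring about equally on both counts, whereas Masta Plasta has a high leader figure and no lagger score at all.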

As with using any tool for race analysis, the analysis does not stop on one rating, and it’s important to interpret these figures in the context of today’s race.  Concentrating purely on who will front run for a second, we also note that our second ranked front runner, Tombi, is wearing first time cheekpieces, which it’s possible may bring about earlier speed.

On the race itself, it’s a decent quality race, as we can tell from the Class (2) and prize money on offer, so running styles alone are unlikely to tell us which horse will win (in a lower class race where all horses are exposed, stealing a lead on an average field can be a bigger advantage).  On this score, Masta Plasta is well rated, but has less potential for improvement than the rest of these, at age 7.  Plus, he has not won since 2008.  Furthermore, a 46% chance of being the front runner still means a 54% chance against.  Even so, he’s our likeliest contender, some way ahead of the rest, and very unlikely to start slowly.  Therefore, at 6.0 + he is worth considering as a back to lay bet (but not on this evidence alone to win) since he should be well in contention until the closing stages of the race.

Lots more work can be done on the analysis of this or any other race using these ratings – bringing into play speed figures and relative ability for example.  One of the nice features of Betwise’s leader/lagger ratings is that you can use and interpret them as you wish as an input for your own analysis, whatever betting angle you are looking at – from laying slow starters, backing to lay front runners, to predicting pace and win strategies.

Draw and Pace: Chester 2.55, Saturday

Saturday, May 22nd, 2010

Today’s Lambs Navy Rum Handicap at Chester is an inauspicious affair – a class 5 handicap worth £4047 to the winner.

However, it’s of particular interest to us because the race throws up a chance to apply the analysis we’ve been looking at recently with regard to draw and pace in sprint races.

Chester is our favourite example for draw bias, and we’ve done lots of research which shows the continued profitability of backing stalls 1 and 2 blind in larger fields over 5 furlongs.   In fact, it’s a trend you can make an automatic profit with over the long run, since it is usually underbet – though of course the prices that make this true can change in future.

What will not change is the natural advantage handed to horses drawn low.   As we saw earlier in the month at the May festival, an analysis of front runners can also help determine who will get to the rail early and stay there.   Speedy sorts can overcome the natural advantage of stalls 1 and 2 by beating those runners to the rail – as in the case of Masamah who made all to win from stall 3 in a 5 furlong sprint at the May festival – though we won’t bother to look at anything drawn higher than halfway in this 12-runner field, so we discount anything higher than 6.

Betwise use a front runner prediction method derived from in-running comments in the Smartform database, that we will be describing in detail in the June edition of Racing Ahead.   Applying this method to the 2.55 today, our top 6 (in order) for those most likely to break early and lead from the front are:

HORSE (IN FRONT RUNNER RANKING ORDER), DRAW TODAY, BETFAIR PRICE*:
Harry Up, 5, 8.80
Legal Eagle, 3, 5.70
Sir Geoffrey, 4, 3.25
Red Rosanna, 11, 50.00
Not My Choice, 1, 7.60
Baby Queen, 8, 40.00
*Price at time of blog post

Of those drawn in stalls 1 to 6, only Memphis Man (drawn 2) and Radiator Rooney (drawn 6) do not make the cut as previous front runners that rank as likely to lead, though no doubt their jockeys will (or should) do everything to encourage them.  Red Rosanna and Baby Queen, despite being in our top 6 ranked front runners, will be discounted since they are both drawn higher than 6.  Which leaves us with a shortlist of 4 – Not My Choice, Harry Up, Legal Eagle and Sir Geoffrey in the 12 runner field, before looking at any individual horse’s recent form or ability.

We still think that draw is the most important factor at Chester (meaning we’d be reluctant to go against Not My Choice in stall 1), though the top ranked front running ability of Harry Up may be enough to overcome his poorer draw in stall 5.   At this point in the game it’s time for individual choice and weighing up each horse’s potential to win against its price (at prices of 7.6 and 8.8 for the two horses mentioned, you can make a reasonable argument for value already).   Whatever the individual bettor’s view, discounting over 60% of the field makes that task much easier.
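The shortlisting described above is simple enough to express as a filter. The sketch below uses the rankings, draws and prices quoted in this post; the rule of keeping only runners drawn in the lower half of the field is the one we applied by hand, and the variable names are ours.

```python
# Top six predicted front runners, in ranking order, with today's draw and
# the Betfair price quoted in the post.
front_runners = [
    ("Harry Up", 5, 8.80),
    ("Legal Eagle", 3, 5.70),
    ("Sir Geoffrey", 4, 3.25),
    ("Red Rosanna", 11, 50.00),
    ("Not My Choice", 1, 7.60),
    ("Baby Queen", 8, 40.00),
]
FIELD_SIZE = 12
MAX_DRAW = FIELD_SIZE // 2  # ignore anything drawn higher than halfway

# Keep ranked front runners drawn low enough to reach the rail.
shortlist = [(horse, stall, price) for horse, stall, price in front_runners
             if stall <= MAX_DRAW]
discounted = 1 - len(shortlist) / FIELD_SIZE

for horse, stall, price in shortlist:
    print(f"{horse}, {stall}, {price:.2f}")
print(f"Share of field discounted: {discounted:.0%}")  # 67%
```

The shortlist keeps its front runner ranking order, so Harry Up still heads it even though Not My Choice has the best draw.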

Chester May Festival – the draw revisited

Wednesday, May 12th, 2010

The draw bias at Chester racecourse, particularly over sprint distances, is a favourite example trend, simply because it is so pronounced when analyzed quantitatively.

Blindly backing any horse drawn in stall 1 and stall 2 in sprint races (defined as being over 5 and 6 furlongs) in larger fields (over 10 runners) has produced a consistent profit at starting price (and even more at Betfair SP) over the past few years.  Even in smaller fields and over longer distances, the first starting point for analyzing races at Chester should be the draw, though it’s always worth revisiting any assumption, especially when there is new information.

Last week at Chester, there were only 2 races qualifying with a larger number of runners (10+) over minimum distances.  The runners drawn in stalls 1 and 2 fared as follows:

+---------------------+----------------+-------+------+-------------+--------+
| scheduled_time      | runner         | stall | SP   | num_runners | result |
+---------------------+----------------+-------+------+-------------+--------+
| 2010-05-05 15:15:00 | Look Busy      |     2 | 9.00 |          13 |      4 |
| 2010-05-05 15:15:00 | Royal Intruder |     1 | 8.00 |          13 |      7 |
| 2010-05-06 16:30:00 | Tasmeem        |     1 | 9.00 |          11 |      5 |
| 2010-05-06 16:30:00 | Rule Of Nature |     2 | 3.00 |          11 |      2 |
+---------------------+----------------+-------+------+-------------+--------+

The best result any one of these could manage was second – and that was for the Michael Stoute trained Rule of Nature, which went off at a short price indeed and should have had a lot more going for its chances than the draw alone.

If we extend our survey to 7 furlong races, we find one more qualifying race, which produces a winner, as follows:

+----------+---------------------+-----------------+-------+-------+-------------+--------+
| distance | scheduled_time      | runner          | stall | SP    | num_runners | result |
+----------+---------------------+-----------------+-------+-------+-------------+--------+
| 1100     | 2010-05-05 15:15:00 | Look Busy       | 2     | 9.00  | 13          | 4      |
| 1100     | 2010-05-05 15:15:00 | Royal Intruder  | 1     | 8.00  | 13          | 7      |
| 1320     | 2010-05-06 16:30:00 | Tasmeem         | 1     | 9.00  | 11          | 5      |
| 1320     | 2010-05-06 16:30:00 | Rule Of Nature  | 2     | 3.00  | 11          | 2      |
| 1540     | 2010-05-07 16:30:00 | Lucky Numbers   | 1     | 5.50  | 12          | 3      |
| 1540     | 2010-05-07 16:30:00 | Dance And Dance | 2     | 11.00 | 12          | 1      |
+----------+---------------------+-----------------+-------+-------+-------------+--------+
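The returns from backing these runners blind can be checked with a few lines of Python. This is a rough sketch that assumes the SP column holds decimal odds including the stake and that every bet is a 1 point level stake; both are our assumptions for the illustration rather than anything stated in the table.

```python
# (horse, SP, finishing position) for the runners drawn in stalls 1 and 2,
# copied from the table above. SP is assumed to be decimal odds.
qualifiers = [
    ("Look Busy", 9.00, 4),
    ("Royal Intruder", 8.00, 7),
    ("Tasmeem", 9.00, 5),
    ("Rule Of Nature", 3.00, 2),
    ("Lucky Numbers", 5.50, 3),
    ("Dance And Dance", 11.00, 1),
]


def level_stakes_profit(bets, stake=1.0):
    """Profit from backing every runner to win at the same stake."""
    profit = 0.0
    for _horse, sp, position in bets:
        if position == 1:
            profit += stake * (sp - 1)  # winnings, stake returned
        else:
            profit -= stake             # losing bet
    return profit


sprints_only = level_stakes_profit(qualifiers[:4])
with_seven_furlongs = level_stakes_profit(qualifiers)
print(sprints_only)         # -4.0
print(with_seven_furlongs)  # 5.0
```

On these assumptions the two sprints alone lose 4 points, while adding the 7 furlong race turns the week into a 5 point profit.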

Just the one winner from the extra 7 furlong race produces sufficient returns, even at SP, to cover blind faith in the draw advantage alone, but clearly more analysis is needed, even at Chester.  Whilst it is obvious to anyone who has seen the Roodee that the draw advantage gives a significant edge to any runner racing on or towards the inside rail, there are other factors as well as the draw at work to enable runners to get to the inside rail – and to secure that advantage.  Not least is the ability of a horse to break and lead early.  Whilst the effects of the draw are important, the proportion of front runners who win sprint races is equally compelling, as we discuss in our analysis of front runners in this month’s Racing Ahead.  Combine a front runner at Chester with any stall position that gives it the ability to cross to the rail early, and you have a powerful combination to give that horse a winning edge – especially, in the case of stall position, if the horses drawn on the inside are less capable front runners.  So what did win the two larger field sprints at Chester last week?

+---------------------+-------------+-------+-------+-------------+--------+
| scheduled_time      | winner      | stall | SP    | num_runners | result |
+---------------------+-------------+-------+-------+-------------+--------+
| 2010-05-05 15:15:00 | Masamah     |     3 | 10.00 |          13 |      1 |
| 2010-05-06 16:30:00 | Horseradish |     6 |  3.75 |          11 |      1 |
+---------------------+-------------+-------+-------+-------------+--------+

Masamah still had an excellent draw in stall 3, a history of running from the front, and indeed ran as follows:
made all, ridden over 1f out, stayed on well final furlong

a running style that fits the hypothesis well.

In the case of Horseradish, he raced on softer ground than normal, and was able to track the leaders and still win, as follows:
tracked leaders, headway to lead over 1f out, ridden and stayed on well final furlong

Clearly relative ability will always enable horses to win races whatever their draw, though being drawn 6 of 11 on softer ground was not a huge disadvantage.

We’ll be using the Smartform database to produce more analysis of this sort over the coming months, which combines both in-running styles and draw analysis.

Comparing horses from different sources – the solution

Friday, May 7th, 2010

In yesterday’s post we discussed the problem of using information from different data sources for research and automated betting, where the name of the horse differs according to the data source.

The most common problems are incorrect capitalization within horse names (eg.  Sea the Stars instead of Sea The Stars) and omission, misplacement and other misdemeanors with apostrophes.

Various programming solutions to this problem are presented in Automatic Exchange Betting, but there is an even simpler solution where one of the information sources is the Smartform Racing Database.  Firstly, we know the runner names in Smartform are correct, so we can use this as our master source.  Secondly, Smartform uses straightforward SQL, which provides for many basic operations on character strings, such as conversion to lower case and pattern matching (so that we have access to search and replace functions).

This means that we can easily convert a horse name in Smartform to an equivalent name which has no capitals, no whitespace and no apostrophes.  If we do the same transformation on the name from the target data source, our correct name will match our incorrect name, and we can start to use information from both data sources in our betting strategy.  If we want to keep the correct name, we just select the correct name to be displayed but match the information on the transformed names.

If you’re not familiar with these functions in MySQL, you can download a copy and test this sort of functionality out easily without selecting any database, as in the queries below:

#Sea The Stars would normally be a column value in the database, so the query would not need quotation marks around the name.
select lower("SEA THE STARS");

This will produce the name sea the stars.

Or use the replace function to produce a name without white space (the same approach also removes apostrophes):

#The replace function takes three arguments separated by commas - the string to transform, the characters to replace, and the string to replace them with, as in:
select replace("SEA THE STARS", ' ', '');

which produces the name SEATHESTARS

#Put the above functions together within one statement to produce a horse name that can be matched against another without issues:
select replace(lower("SEA THE STARS"), ' ', '');

So at last we get seathestars.

If you’re unfamiliar with SQL, the syntax can take a little getting used to, but on the whole is a gentler introduction than learning a programming language – and allows you to achieve an awful lot when it comes to horseracing analysis.

Performing the same operation on the target horse name in another database table lets us match data up between horses using a table join without leaving the database.  Returning to our original example from yesterday, this means, for example, we could match any form or forecast odds data in Smartform with any market data available in Betfair.  Of course, automatically creating an additional database table of Betfair prices does require some programming, though re-usable step by step code is provided in Automatic Exchange Betting for exactly this job.
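For anyone who prefers to do the matching outside the database, the same transformation is easy to sketch in Python. This is an illustration only, not the code from Automatic Exchange Betting: the function name is ours, and the Betfair prices are made-up placeholders. The horse names are the examples from yesterday's post.

```python
def normalise(name):
    """Lower-case a horse name and strip all whitespace and apostrophes."""
    name = "".join(name.lower().split())
    # Remove straight and curly apostrophes alike.
    for apostrophe in ("'", "\u2018", "\u2019"):
        name = name.replace(apostrophe, "")
    return name


# Correct names, as they would come from the master source.
smartform_names = [
    "Raddy 'ell Pauline",
    "Mioche d'Estruval",
    "What's Occurrin",
    "Mandy's Princess",
]

# Names as listed by Betfair, with placeholder prices (not real data).
betfair_prices = {
    "Raddy ell Pauline": 5.0,
    "Mioche DEstruval": 7.0,
    "Whats Occurrin": 3.5,
    "Mandys Princess": 12.0,
}

# Index the Betfair data by transformed name, then look up each correct name.
by_key = {normalise(name): price for name, price in betfair_prices.items()}
matched = {name: by_key.get(normalise(name)) for name in smartform_names}

for name, price in matched.items():
    print(f"{name}: {price}")
```

As with the SQL version, the correct name is kept for display and only the transformed name is used for matching.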

Comparing horses from different sources – the problem

Thursday, May 6th, 2010

A recurring problem in developing automated betting strategies is accounting for differences in horse names from different data sources, when in fact each source is referring to the same animal.

We discuss the logic behind betting strategies that use different sources in Automatic Exchange Betting.  In summary, an automated betting strategy may require various inputs that are only available from multiple data sources – just as a manual betting strategy does.  For example, one data source may contain a horse’s form, another may contain current exchange or bookmaker prices for the horse, and another a news feed we want to scan for information on a specific horse before betting.

This problem isn’t just limited to programming betting robots, it also applies to basic research – for example collecting and retrieving Betfair market prices for any given horse name, when the horse name you want to fetch prices for does not come from Betfair to start with.

In fact, the Betfair case is the most frequent issue that we deal with in automatic betting.    Take a few examples from today’s racing:

Raddy ‘ell Pauline runs in the 4.30 at Chester, Mioche d’Estruval runs in the 5.25 at Newton Abbot, What’s Occurrin runs in the 6.50 at Wetherby, and Mandy’s Princess runs in the 3.55 at Chester.   These horses are listed in Betfair as Raddy ell Pauline, Mioche DEstruval, Whats Occurrin, and Mandys Princess, respectively.

Spot the problem?   In most cases, Betfair simply misses the apostrophe from horses’ names as a matter of policy.   Occasionally there are also capitalization problems, as with the Betfair rendition of Mioche d’Estruval above.

Let’s imagine that these four horses came from an automated selection list produced by Smartform (which lists all the horse names correctly, ie. as they were registered by their owners).   We now want our betting robot to use the Betfair API to retrieve prices for each horse, and if those prices meet a certain minimum, we want to bet on each horse.

Unfortunately, if we simply present the correct horse names to our betting program we will be in trouble – the Betfair API won’t recognize them.   We’ll get neither the prices we asked for, nor will we be able to bet on these horses – or do anything else with the Betfair API for these runners unless we take some action first.

Fortunately there are a number of simple approaches to resolving this, the easiest of which can be done within Smartform without resorting to using a programming language at all – more on this tomorrow.