Archive for April, 2019
Querying From Smartform Using Python
Friday, April 26th, 2019Python is now one of the most commonly used programming languages – at the time of writing 4th in popularity according to the TIOBE index. It’s also a popular choice for data manipulation and data science, with plenty of packages such as Pandas for preparing data and Scikit-learn for machine learning meaning that – like R – it can be an ideal environment to use for analysing horseracing data and building prediction models. Here, we discuss first steps to start using Python with Smartform (the MySQL horseracing database from Betwise) in order to connect to the database, run queries, and start using the data within the Python environment.
Installing ‘PyMySQL’
In order to query into a MySQL database directly from Python, the PyMySQL package needs to be installed. More can be read about the package requirements here:
This needs to be done outside of the IPython Shell using the ‘pip’ command. In a command prompt (if using Windows) or bash prompt (if using macOS) use
$ pip install pymysql
More information on how to install packages for python can be found here:
Importing ‘PyMySQL’
After doing so, the pymysql package then needs to be imported into the IPython Shell, by running the following code:
import pymysql
Now you have the necessary package installed to make contact between Python and the MySQL database.
Establishing a Connection
In order to connect the IPython Shell to the MySQL database you will need to know your MySQL database credentials. These details would have been inputted by the user when creating the MySQL database.
These following details are: host name, user name, password and database name (e.g. ‘smartform’). These details need to be inputted into the strings below.
Note: The password has been filled with asterisks for security reasons but do enter your actual password here
# Inputting database credentials
connection = pymysql.connect(host='localhost', user='root', passwd ='********', database = 'smartform')
If all of the credentials are correct, the connection should be established and the code should run without an error message. If for whatever reason this code doesn’t work, make sure you have entered the correct details and you have imported the ‘pymysql’ package correctly.
Creating a Cursor
To be able to make queries from the MySQL database, a cursor needs to be created. The cursor is effectively a control structure that enables traversal over the records in a database. This can be done by simply running the following code.
cursor = connection.cursor()
Making a Query
Write out your desired query as a string, as you would normally write out a SQL query.
The following query is an example of how to return all of the unique runners (and their associated runner names) with an OR > 160 and then order the runners’ names alphabetically.
query = '''SELECT DISTINCT runner_id, name
FROM historic_runners
WHERE official_rating > 160
ORDER BY name ASC
'''
Then call this query into the ‘cursor.execute’ function (this will return the number of records in the query as an output).
cursor.execute(query)
Then use ‘cursor.fetchall()’ to retrieve all of the data entries corresponding to this query, calling it into a variable.
rows = cursor.fetchall()
Use a ‘for loop’ to print and inspect the query output.
for row in rows[:10]:
print(row)
(514205, 'Afsoun')
(2048799, 'Agrapart')
(1435705, 'Al Ferof')
(547061, 'Albertas Run')
(1692968, 'Alelchi Inois')
(160435, 'Alexander Banquet')
(229855, 'Allegedly Red')
(2037368, 'Altior')
(447515, 'Andreas')
(2104005, 'Anibale Fly')