If you have used betting exchange markets before, you might have noticed that you can place bets at the starting price. This means that the bet is matched at the start of the event at a price derived from the liquidity available in the market. Examples are BSP (Betfair Starting Price) on Betfair and XSP (Exchange Starting Price) on BETDAQ. If you want to level up your sports betting game, working with historical starting prices can help you in different ways. In this article I will discuss what you can do with historical exchange starting prices and show how you can process and analyse them.
Use Cases for Historical Starting Price Data
You might wonder what you can do with historical data from a sports betting exchange. I can think of two use cases where historical starting prices can help you to have more success with sports betting:
- Strategy Development and Back Testing
When you have an idea for a betting system, you can test it on historical data to see whether it would have been profitable in the past. To do so you download, process and analyse the data and calculate the profit the betting or trading system would have made.
- Monitoring of Betting Bots
You can use historical data as a reference for the live execution of your betting or trading bot. Especially if your betting automation was developed based on a back test, you want to make sure that the betting pattern of the bot aligns with the back test on historical data. Re-running the back test on new data and comparing the result with the bets your bot actually placed will show whether there is a systematic error.
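The back testing idea can be sketched in a few lines of Python. This is only a minimal illustration under assumptions of my own: the Runner class, the max_price rule and the 5% commission are hypothetical, and commission is applied per winning bet, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    selection: str
    bsp: float  # decimal starting price
    won: bool


def back_profit(runner, stake=1.0, commission=0.05):
    """Profit of a back bet matched at the starting price.

    Assumption: commission is charged on the winnings of each winning bet.
    """
    if runner.won:
        return stake * (runner.bsp - 1.0) * (1.0 - commission)
    return -stake


def backtest(runners, max_price=3.0, stake=1.0, commission=0.05):
    """Hypothetical system: back every runner whose starting price is at most max_price."""
    bets = [r for r in runners if r.bsp <= max_price]
    profit = sum(back_profit(r, stake, commission) for r in bets)
    staked = stake * len(bets)
    return {
        "bets": len(bets),
        "profit": profit,
        "roi": profit / staked if staked else 0.0,
    }
```

For monitoring, the same function can be re-run on the days your bot was live and the resulting number of bets and profit compared with what the bot actually did.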
Data Sources for Historical Starting Price Data
When scanning the web for historical sports betting data you might come across various offers where you can buy past data. However, there are also a couple of data sources that publish data for free and some of them include starting price data.
Starting Price Data on promo.betfair.com
On promo.betfair.com/betfairsp/prices/ Betfair publishes data that contains starting price information for the following betting markets:
- Horse Racing and Greyhound Racing
- Win and Place market
- UK (United Kingdom), IRE (Ireland), RSA (South Africa), USA (United States of America), AUS (Australia)
The files follow a certain naming convention and look like the following examples:
When you open one of these csv files you will find that every row corresponds to a selection in a race. Each row contains metadata such as the event, selection, starting price and the amount that was traded.
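Reading such a file row by row could look like the sketch below. The column names and the sample rows are made up for illustration and are not taken from an actual file, so check the header of the file you downloaded and adjust them.

```python
import csv
import io

# Made-up sample with assumed column names; real files may use different headers.
SAMPLE_CSV = """EVENT_ID,EVENT_NAME,SELECTION_NAME,WIN_LOSE,BSP,TRADED_VOL
1001,Example Race 1,Runner A,1,3.5,1200.50
1001,Example Race 1,Runner B,0,2.1,3400.00
1001,Example Race 1,Runner C,0,,0
"""


def read_selections(text):
    """Return one dict per selection, skipping rows without a starting price."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        bsp = (raw.get("BSP") or "").strip()
        if not bsp:
            continue  # no starting price recorded for this selection
        rows.append({
            "event_id": raw["EVENT_ID"],
            "event": raw["EVENT_NAME"],
            "selection": raw["SELECTION_NAME"],
            "won": raw["WIN_LOSE"] == "1",
            "bsp": float(bsp),
            "traded": float(raw["TRADED_VOL"] or 0),
        })
    return rows
```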
Starting Price Data via Betfair Historical Data
Betfair offers historical exchange data on historicdata.betfair.com. As a member you can download historical price data from the exchange. I have covered how you can use Python to work with Betfair historical data before, but it is worth mentioning here that this data also contains starting prices.
In the following screenshot you can see the BSP in the historical data file.
Unfortunately I was not able to find any free data source for historical XSP data from BETDAQ.
Please let me know if you know of any other data source that offers starting price data!
Analysing Starting Price Data with Excel / VBA
Given that some of the data sources provide csv files, you could think about using software such as Excel to analyse the data. However, even if you focus on one sport in one country only (e.g. horse racing UK) you will have to deal with around 365 csv files per year. To calculate something as simple as the average BSP, I would have to download all the files, copy the values into one sheet and then apply an Excel formula. To me this seems neither convenient nor scalable.
I am not saying that it is impossible to process and analyse sports betting data such as the BSP with Excel and/or VBA. I am not familiar with all of Excel's functionality, and it might be that you can handle this amount of data with Excel and some VBA automation. However, the BSP data from the source mentioned above is also not 100% clean, which might be challenging to handle in Excel.
Data Processing Pipeline for Starting Price Data
Ideally I would like to automate the data processing so that I can focus on data analysis. I would like to have a script that automatically downloads the data, cleans it and saves it to a database in a predefined format that I can easily use for data analysis. The pipeline that I am using looks like the following:
In a Jupyter Notebook I have some Python code that does the processing for me. First I set a date range, select desired sports and markets and then I trigger the processing pipeline. The csv files are loaded into memory one after another using the requests package.
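A stripped-down version of such a pipeline might look like the following sketch. It is not my actual notebook code: it uses urllib from the standard library in place of requests, SQLite as a stand-in database, and assumed column names. The url_for argument is a hypothetical function that you supply to turn a date into the file URL, because the naming convention depends on the sport and market you select.

```python
import csv
import io
import sqlite3
from datetime import date, timedelta
from urllib.request import urlopen


def date_range(start, end):
    """Yield every date from start to end (inclusive)."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def fetch_csv(url):
    """Download one csv file into memory and return it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def clean_rows(text, day):
    """Parse csv text and keep only complete rows with a plausible starting price."""
    for raw in csv.DictReader(io.StringIO(text)):
        try:
            # Column names are assumptions; adjust them to the real file header.
            row = (day.isoformat(), raw["EVENT_ID"], raw["SELECTION_NAME"],
                   float(raw["BSP"]), int(raw["WIN_LOSE"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip incomplete or malformed rows
        if row[3] > 1.0:  # decimal odds must be greater than 1
            yield row


def run_pipeline(start, end, url_for, conn, fetch=fetch_csv):
    """Download, clean and store one file per day between start and end."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS starting_prices "
        "(event_date TEXT, event_id TEXT, selection TEXT, bsp REAL, won INTEGER)"
    )
    for day in date_range(start, end):
        text = fetch(url_for(day))
        conn.executemany(
            "INSERT INTO starting_prices VALUES (?, ?, ?, ?, ?)",
            clean_rows(text, day),
        )
    conn.commit()
```

Passing fetch as an argument keeps the download step replaceable, which makes it easy to test the cleaning and storing steps with a local file or a hard-coded string.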
Analysing Starting Price Data with SQL
Now that the data is saved in a simple SQL table, I can use SQL statements to analyse the sports betting data. You can use different interfaces for this, for instance pgAdmin if you are using PostgreSQL.
Another alternative is to use a SQL client in your favourite programming language. Most of my scripts currently use SQLAlchemy with Python.
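As a small illustration of what such a query can look like, the sketch below calculates the average starting price and a level stake profit per day. It uses Python's built-in sqlite3 module with an in-memory database instead of SQLAlchemy and PostgreSQL, and the table layout and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE starting_prices "
    "(event_date TEXT, selection TEXT, bsp REAL, won INTEGER)"
)
# Invented example rows, not real results.
conn.executemany(
    "INSERT INTO starting_prices VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", "Runner A", 2.0, 1),
        ("2024-01-01", "Runner B", 4.0, 0),
        ("2024-01-02", "Runner C", 3.0, 0),
        ("2024-01-02", "Runner D", 5.0, 1),
    ],
)

# Average BSP and profit of backing every runner with one unit (before commission).
QUERY = """
SELECT event_date,
       COUNT(*) AS runners,
       AVG(bsp) AS avg_bsp,
       SUM(CASE WHEN won = 1 THEN bsp - 1 ELSE -1 END) AS level_stake_profit
FROM starting_prices
GROUP BY event_date
ORDER BY event_date
"""

results = conn.execute(QUERY).fetchall()
for event_date, runners, avg_bsp, profit in results:
    print(event_date, runners, avg_bsp, profit)
```

The same SQL statement should work largely unchanged on PostgreSQL, since it only uses standard aggregate functions.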
Analysing Starting Price Data with SQL and Python
If you want to develop more complex betting systems, you might want to combine SQL with Python. You could load the raw data from the database into memory and use Python for more complex transformations. For instance, you can train a machine learning model on top of the data, similar to this strategy.
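One example of a transformation that is easier in Python than in SQL is a simple calibration table: group runners by the probability implied by the starting price (1 / BSP) and compare it with how often they actually won. The sketch below assumes you have already loaded (bsp, won) pairs from the database. The function name and the number of buckets are my own choices for illustration.

```python
from collections import defaultdict


def calibration_table(rows, n_buckets=10):
    """Compare implied probability (1 / BSP) with the actual strike rate per bucket.

    rows is an iterable of (bsp, won) pairs with decimal odds and a 0/1 result.
    """
    totals = defaultdict(lambda: {"runners": 0, "winners": 0, "implied_sum": 0.0})
    for bsp, won in rows:
        implied = 1.0 / bsp
        # Bucket index, capped so that a probability of 1.0 falls into the last bucket.
        key = min(int(implied * n_buckets), n_buckets - 1)
        bucket = totals[key]
        bucket["runners"] += 1
        bucket["winners"] += 1 if won else 0
        bucket["implied_sum"] += implied

    table = []
    for key in sorted(totals):
        bucket = totals[key]
        table.append({
            "bucket_start": key / n_buckets,
            "runners": bucket["runners"],
            "avg_implied": bucket["implied_sum"] / bucket["runners"],
            "strike_rate": bucket["winners"] / bucket["runners"],
        })
    return table
```

Buckets where the strike rate differs clearly from the average implied probability, on a large enough sample, could be a starting point for a betting system or a feature for a model.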