I’ve recently expanded my hierarchical Bayesian football (aka soccer) prediction framework to predict the results of Australian Rules Football (AFL) matches. I have no personal interest in AFL; instead, I got involved through an email sent to a statistics mailing list advertising a competition held by Monash University in Melbourne. Sensing an opportunity to quickly adapt my soccer prediction method to AFL results and to compare my technique to others, I decided to get involved.
I’ve always been curious to know if any of the 4 major European leagues (Serie A, Bundesliga, Premiership, La Liga) are more predictable than others. La Liga certainly has a reputation as being dull and predictable, although this is due to the sheer dominance of Barcelona and Real Madrid in recent years. I’ve increased my database of football matches in order to improve my football prediction bot this summer, and so now have sufficient data to investigate.
This post summarises Predictaball’s performance in the 2015-2016 season. I’ll look at overall performance, accuracy per week, how it fared in terms of making profit, and finally the annual comparison with Lawro.
Compared to last year, when it achieved 48% overall, Predictaball has fared less well this season with 43%. This isn’t particularly surprising, since this season has been full of upsets to say the least, with Leicester beating out the traditional top four for the title, and Spurs doing their best to break the monopoly (despite failing in typical Spurs fashion).
I’ve tinkered around with Predictaball a bit recently in an effort to increase its accuracy, with the overall goal of beating Paul Merson and Lawro so that I can claim ‘human competitiveness’. I’ve mentioned in previous posts that I envisage 2 potential ways to achieve this.
1. Include more player data
2. Incorporate bookies’ odds

Adding more player data (such as a variable for each player indicating whether they are in the squad or not) would allow the model to account for situations where a player who is strongly associated with the team winning is injured. For an example, see City’s abysmal record when Kompany isn’t playing.
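As a rough illustration of the per-player squad variable idea, here is a minimal Python sketch. The function name `squad_indicators`, the `in_squad_` feature prefix, and the example roster are all hypothetical and chosen only for illustration; this is not how Predictaball itself is implemented.

```python
def squad_indicators(squad, roster):
    """Return one 0/1 feature per roster player: 1 if they are in the matchday squad."""
    in_squad = set(squad)
    return {f"in_squad_{player}": int(player in in_squad) for player in roster}


# Hypothetical roster and matchday squad, for illustration only
roster = ["Kompany", "Aguero", "Silva"]
features = squad_indicators(["Aguero", "Silva"], roster)
print(features)
```

Each match would then carry one such indicator per player alongside the existing predictors, so the model could in principle learn how a given player's absence relates to results.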
It’s been a while since I’ve posted anything, as I’ve spent my summer in a thesis-related haze. I’m starting to come out of it now, so expect more frequent updates, particularly as I work my way through the backlog of ideas I’ve been meaning to write about.
I’ll start by assessing Predictaball’s performance last season. Just to summarise, this was a classification task attempting to predict the outcome (W/L/D) of every Premier League match from the end of September onwards.
The majority of my work involves machine learning using biologically inspired techniques, focusing on classification problems. I run my algorithms on benchmark datasets to test their validity and the effect of various parameters, and then these are used in real-life medical applications. Trials can take a long time to prepare, and the data collection process can be somewhat challenging. The group I’m involved with researches neurodegenerative diseases, particularly Parkinson’s disease.