Blog

This isn’t yet another blog article on why static site generators suit small blogs better than fully dynamic sites, although I was very tempted to write it that way. Instead, it focuses on the merits of automating the build process and provides a reference for setting this up for Hugo sites hosted on Amazon S3, in the hope that I can save someone some time in the future.


I’ve just had some of my PhD research on adapting Echo State Networks (ESNs) for identifying Parkinson’s disease published. The work describes considerations to be made when applying ESNs to classification problems, with a case study of using them to differentiate between Parkinson’s disease patients and healthy subjects based on a longitudinal positional data source. This post will briefly summarise the work, but in case you’re interested, the published version is available at the publisher’s website, and I’ve uploaded a preprint here.


This post continues on from the mid-season review of the Elo system and looks at my Bayesian football prediction model, Predictaball, up to and including matchday 20 of the Premier League (29th December). I’ll go over the overall predictive accuracy and compare my model to others, including bookies, expected goals (xG), and a compilation of football models.

Overall accuracy

So far, across the top 4 European leagues, there have been 696 matches with 379 (54%) of these outcomes being correctly predicted.


This is going to be the first of 2 posts looking at the mid-season performance of my football prediction and rating system, Predictaball. In this post I’m going to focus on the Elo rating system.

Premier League standings

I’ll first look at how the teams in the Premiership stand, both in terms of their Elo rating and their accumulated points, as displayed in the table below, ordered by Elo. Over-performing teams, defined as those ranked at least 3 places higher on points than on Elo, are coloured in green, while under-performing teams, the opposite, are highlighted in red.
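The over- and under-performing rule described above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the code behind Predictaball: the function name, the label strings, and the team names in the usage example are hypothetical, and it assumes that rank 1 is the best position in both orderings.

```python
def classify_teams(elo_rank, points_rank, threshold=3):
    """Label each team by comparing its points rank with its Elo rank.

    Both arguments map team name -> rank, where 1 is best (an assumption).
    A team is over-performing if it sits at least `threshold` places higher
    on points than on Elo, and under-performing in the opposite case.
    """
    labels = {}
    for team, elo in elo_rank.items():
        # Positive difference: the team ranks higher on points than on Elo.
        difference = elo - points_rank[team]
        if difference >= threshold:
            labels[team] = "over-performing"
        elif difference <= -threshold:
            labels[team] = "under-performing"
        else:
            labels[team] = "as expected"
    return labels


if __name__ == "__main__":
    # Hypothetical ranks, purely for illustration.
    elo = {"Team A": 5, "Team B": 2, "Team C": 3}
    points = {"Team A": 1, "Team B": 6, "Team C": 3}
    for team, label in classify_teams(elo, points).items():
        print(team, label)
```

In a table, the first label would correspond to the green rows and the second to the red rows.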


I’ve just released a new package onto CRAN and while it doesn’t perform any complex calculations or fit a statistical niche, it may be one of the most useful everyday libraries I’ll write. In short, epitab provides a framework for building descriptive tables by extending contingency tables with additional functionality. I initially developed it for my work in epidemiology, as I kept coming across situations where I wanted to programmatically generate tables containing various descriptive statistics to facilitate reproducible research, but I could not find any existing software that met my requirements.


My football prediction model has previously relied upon a Bayesian approach to quantify a team’s skill level, by modelling it as a random intercept in a hierarchical model of the outcome of a match. While this model performed very well (62% accuracy last season), I was never fully satisfied, since this measure of skill is an average across the last ten seasons that I had data for rather than being updated to reflect the time-varying nature of form.


The last post showed that using a fully Bayesian multi-level model of the match outcomes helped Predictaball achieve a 58% overall prediction accuracy on the four European leagues, up 8% from last season. This post will describe the betting system I used to try to profit by identifying value bets in the offered odds.

Betting system

Before delving into the profit analysis, I’ll first quickly summarise the staking model I used, since I haven’t mentioned it anywhere before.


And so we come to the end of another season of football, and more importantly, Predictaball! This season has seen several large updates that I was meaning to detail at the start of the season, but life got in the way. The predictive model is now fully Bayesian; I’ve added a betting system that identifies value bets; and I’ve expanded it to include the 3 other main European leagues: La Liga, Serie A, and the Bundesliga. Rather than detailing these new aspects as well as summarising the season’s performance in one massive blog, I’ll split this into two parts.


Camel Up is a deceptively simple board game in which the aim is to predict the outcome of a camel race. I’ll quickly try to explain the game now, although it’s always hard to explain a board game without an actual demonstration. The camel movement is randomly generated from dice rolls as follows. Five dice, one coloured for each of the five camels and each labelled with the numbers 1-3 twice, are placed into a container (decorated as a pyramid, since the game is set in Egypt), which is then shaken.
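As a rough sketch of the dice described above, the following Python simulates one shake of the pyramid. The colour names and the function name are hypothetical, and the code assumes that the dice leave the pyramid one at a time in a random order, which goes beyond what this excerpt states.

```python
import random

# Each die shows the numbers 1-3 twice, as described above.
DIE_FACES = [1, 2, 3, 1, 2, 3]

# Hypothetical colour names; the text only says there is one die per camel.
CAMEL_COLOURS = ["blue", "green", "orange", "yellow", "white"]


def shake_pyramid(rng):
    """Return (colour, roll) pairs for all five dice in a random order.

    Assumption: the dice come out one at a time, in a random order.
    """
    order = CAMEL_COLOURS[:]
    rng.shuffle(order)
    return [(colour, rng.choice(DIE_FACES)) for colour in order]


if __name__ == "__main__":
    for colour, roll in shake_pyramid(random.Random(42)):
        print(f"{colour} camel moves {roll}")
```

Passing in a seeded `random.Random` keeps a simulated race reproducible, which helps when estimating outcome probabilities over many repeated shakes.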


I’ve never really been much of a hacker; I much prefer to think my projects through entirely and plan them out on pen and paper before starting to write any code. As such, I’ve never really had much interest in hackathons. It was with a bit of apprehension, then, that I participated in my first one over the weekend. The particular event was NASA Space Apps, where NASA provide lots of data and offer challenges related to modelling certain natural phenomena, providing data visualisation, or prototyping hardware tools that fit a particular niche.
