A common use of fitzRoy is building a simple ELO rating system. These models are popular among the tippers who are part of The Squiggle and are becoming common in other team sports. This vignette gives a minimal working example to get you started on building an ELO model from scratch, using fitzRoy to get the data and the elo package to do the modelling.

Prepare data

Before we create our model, we need to do some data preparation. The elo package we are using needs a way to identify each round, so we’ll combine season and Round.Number into a single string; together with the team name, this uniquely identifies each match. We also need a way to tell it when a new season is starting, so we’ll create a logical field that indicates whether a game is a team’s first game of that season.
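The preparation described above can be sketched on a small hand-made table. The column names `seas_rnd` and `first_game` and the toy data are assumptions for illustration, and flagging `Round.Number == 1` is a simplification that ignores byes and other reasons a team might start later.

```r
library(dplyr)

# Toy stand-in for the results table (hypothetical data, not real fitzRoy output)
results_toy <- data.frame(
  Season       = c(2018, 2018, 2019),
  Round.Number = c(1, 2, 1),
  Home.Team    = c("Richmond", "Carlton", "Carlton"),
  Away.Team    = c("Carlton", "Richmond", "Richmond"),
  stringsAsFactors = FALSE
)

results_toy <- results_toy %>%
  mutate(
    # Season and round glued together, e.g. "2018.1"
    seas_rnd   = paste0(Season, ".", Round.Number),
    # Simplifying assumption: round 1 is every team's first game of the season
    first_game = Round.Number == 1
  )

results_toy
```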

For the fixture data, we need to ensure the dates are in the same format as in results (note: this should probably be done internally in fitzRoy, see #58). For now, we can do it manually.

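library(dplyr)
library(lubridate)

# Keep only games after the last recorded result, and match the results column formats
fixture <- fixture %>%
  filter(Date > max(results$Date)) %>%
  mutate(Date = ymd(format(Date, "%Y-%m-%d"))) %>%
  rename(Round.Number = Round)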

Set ELO parameters

There is a range of parameters that we can tweak and include in an ELO model. Here we set some basic parameters; you can read a bit more on the PlusSixOne blog, which uses a similar method. For further reading, I strongly recommend checking out Matter of Stats or The Arc for great explainers on the types of parameters that could be included.
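To make the role of these parameters concrete, here is a minimal base R sketch of the standard Elo update. The names `hga`, `k_val` and `carry_over` and their values are assumptions chosen for illustration, not recommended settings, and the helper functions are hypothetical rather than part of the elo package.

```r
# Hypothetical parameter values, for illustration only
hga        <- 30   # home ground advantage, in rating points
k_val      <- 20   # how far ratings move after each game
carry_over <- 0.5  # proportion by which ratings are pulled toward the mean each new season

# Standard Elo expected score for the home team, with home advantage added
expected_home <- function(home_elo, away_elo, hga) {
  1 / (1 + 10^((away_elo - (home_elo + hga)) / 400))
}

# Standard Elo update: move the rating by k times the surprise
update_elo <- function(elo, actual, expected, k) {
  elo + k * (actual - expected)
}

# Regress a rating toward the mean between seasons
regress_elo <- function(elo, mean_elo = 1500, by = carry_over) {
  elo + by * (mean_elo - elo)
}

p_home   <- expected_home(1500, 1500, hga)       # a bit above 0.5 thanks to home advantage
new_home <- update_elo(1500, 1, p_home, k_val)   # rating after a maximal home win
new_seas <- regress_elo(1520)                    # halfway back toward 1500
```

A larger `k_val` makes ratings react faster to recent results, while a larger `carry_over` in this sketch wipes out more of the previous season's rating.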

Map margin function

The original ELO models in chess use values of 0 for a loss, 1 for a win and 0.5 for a draw. Since we are adapting these for AFL and we want to use the margin rather than a binary outcome, we need to map our margin to a score between 0 and 1. You can do this in many different and complex ways, but for now, I just normalise everything linearly over a margin range of -80 to 80. Anything outside this range is clipped to 0 or 1.

We wrap this mapping in a function and then use that function in our ELO model.
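One way to write such a mapping is the linear normalisation below. The function and argument names are assumptions for illustration; only the -80 to 80 range and the clipping to 0 and 1 come from the description above.

```r
# Map a margin (home points minus away points) to a score between 0 and 1.
# Function and argument names are hypothetical.
map_margin_to_outcome <- function(margin, marg_max = 80, marg_min = -80) {
  norm <- (margin - marg_min) / (marg_max - marg_min)
  # Clip anything outside the range to 0 or 1
  pmin(pmax(norm, 0), 1)
}

map_margin_to_outcome(c(-100, -40, 0, 40, 100))
#> [1] 0.00 0.25 0.50 0.75 1.00
```

A draw maps to 0.5, matching the chess convention, and a 100 point win counts the same as an 80 point win.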

Calculate ELO results

Now we are ready to create our ELO ratings! We can use the elo.run function from the elo package for this. I won’t explain everything about what is going on here (you can read all about it in the package vignette), but in general, we provide a formula that specifies what is included in our model, as well as some model parameters.
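As a rough sketch of what such a call can look like, the example below runs elo.run on a tiny made-up results table. The data, the parameter values and the names `map_margin`, `hga`, `k_val`, `carry_over`, `seas_rnd` and `first_game` are all assumptions for illustration; `adjust()`, `group()` and `regress()` are helpers from the elo package.

```r
library(elo)

# Hypothetical parameters and margin mapping
hga        <- 30
k_val      <- 20
carry_over <- 0.5
map_margin <- function(margin) pmin(pmax((margin + 80) / 160, 0), 1)

# Toy results: two teams, two rounds in each of two seasons (made-up scores)
results_toy <- data.frame(
  Season       = c(2018, 2018, 2019, 2019),
  Round.Number = c(1, 2, 1, 2),
  Home.Team    = c("A", "B", "A", "B"),
  Away.Team    = c("B", "A", "B", "A"),
  Home.Points  = c(90, 70, 100, 85),
  Away.Points  = c(80, 95, 60, 85),
  stringsAsFactors = FALSE
)
results_toy$seas_rnd   <- paste0(results_toy$Season, ".", results_toy$Round.Number)
results_toy$first_game <- results_toy$Round.Number == 1

elo_toy <- elo.run(
  map_margin(Home.Points - Away.Points) ~
    adjust(Home.Team, hga) +                  # home side gets the home ground advantage
    Away.Team +
    group(seas_rnd) +                         # matches in the same round belong together
    regress(first_game, 1500, carry_over),    # pull ratings toward 1500 at season starts
  k = k_val,
  data = results_toy
)
```

The left-hand side of the formula is the mapped margin, so the model is updated on a score between 0 and 1 rather than on a plain win or loss.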

Now that the model has run, we can view our results. The elo package provides various ways to do this.

Firstly, using as.data.frame we can view the predicted and actual result of each game. Also in this table is the change in ELO rating for the home and away side. See below for the last few games of 2018.

We can specifically focus on how each team’s rating changes over time using as.matrix. Viewing the end of 2018 again, we can see that teams that didn’t make the finals keep the same ELO from round to round, since they are no longer playing.

as.matrix(elo.data) %>% tail()
#>         Adelaide Brisbane Lions  Carlton Collingwood Essendon Fitzroy
#> [2797,] 1508.327       1512.678 1479.576    1518.468 1506.633    1500
#> [2798,] 1517.526       1519.499 1481.808    1519.544 1506.402    1500
#> [2799,] 1513.723       1519.784 1482.805    1514.528 1510.206    1500
#> [2800,] 1510.322       1523.482 1486.206    1509.509 1510.311    1500
#> [2801,] 1511.153       1524.036 1483.506    1515.349 1501.789    1500
#> [2802,] 1511.281       1530.720 1481.838    1517.052 1490.689    1500
#>         Footscray Fremantle  Geelong Gold Coast      GWS Hawthorn
#> [2797,]  1492.504  1491.863 1539.638   1450.453 1522.541 1492.927
#> [2798,]  1492.205  1488.880 1540.182   1441.254 1519.439 1495.910
#> [2799,]  1488.996  1488.312 1535.076   1440.257 1524.456 1501.016
#> [2800,]  1493.990  1483.318 1538.171   1440.152 1524.757 1497.318
#> [2801,]  1493.435  1488.282 1533.207   1434.311 1523.227 1495.561
#> [2802,]  1504.535  1488.302 1538.386   1427.627 1514.582 1504.206
#>         Melbourne North Melbourne Port Adelaide Richmond St Kilda   Sydney
#> [2797,]  1477.110        1506.836      1511.272 1502.023 1468.511 1497.013
#> [2798,]  1477.409        1507.066      1504.451 1505.125 1467.967 1494.781
#> [2799,]  1476.162        1506.782      1500.582 1508.995 1471.176 1495.349
#> [2800,]  1474.506        1501.945      1500.280 1514.014 1472.833 1492.254
#> [2801,]  1470.654        1503.702      1508.802 1517.865 1472.001 1493.784
#> [2802,]  1468.952        1498.523      1513.389 1519.534 1471.981 1489.197
#>         University West Coast
#> [2797,]       1500   1521.627
#> [2798,]       1500   1520.551
#> [2799,]       1500   1521.798
#> [2800,]       1500   1526.634
#> [2801,]       1500   1529.334
#> [2802,]       1500   1529.206

Lastly, we can check the final ELO ratings of each team at the end of our data using final.elos (here, up to the end of 2018).
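The three views can be tried side by side on a toy model. This is a minimal sketch with made-up games and a deliberately simple formula (no grouping or regression); the data and the `map_margin` helper are assumptions for illustration.

```r
library(elo)

# Hypothetical margin mapping and toy games (made-up margins)
map_margin <- function(margin) pmin(pmax((margin + 80) / 160, 0), 1)

games <- data.frame(
  Home.Team = c("A", "B", "A", "B"),
  Away.Team = c("B", "A", "B", "A"),
  margin    = c(10, -25, 40, 0),
  stringsAsFactors = FALSE
)

elo_toy <- elo.run(
  map_margin(margin) ~ adjust(Home.Team, 30) + Away.Team,
  k = 20,
  data = games
)

game_log <- as.data.frame(elo_toy)  # one row per game: predicted and actual result, rating changes
history  <- as.matrix(elo_toy)      # one column per team, ratings over time
final    <- final.elos(elo_toy)     # named vector with the latest rating of each team
```

Because every point gained by one side is lost by the other, the final ratings of the two toy teams still sum to 3000.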

We could keep tweaking our parameters until we are happy. Ideally we’d have a training and test set and use some kind of cost function, such as log likelihood or mean absolute margin error, to optimise these values. I’ll leave that as beyond the scope of this vignette though and assume we are happy with these parameters.
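For readers who want to go down that path, the two cost functions mentioned above might look like this in base R. The function names and the example numbers are assumptions for illustration; lower values are better for both.

```r
# Mean absolute error between predicted and actual margins, in points
mean_abs_margin_error <- function(predicted_margin, actual_margin) {
  mean(abs(predicted_margin - actual_margin))
}

# Negative mean log likelihood of win probabilities (log loss).
# `won` is 1 for a home win and 0 for a home loss.
log_loss <- function(prob, won, eps = 1e-15) {
  p <- pmin(pmax(prob, eps), 1 - eps)  # avoid log(0)
  -mean(won * log(p) + (1 - won) * log(1 - p))
}

# Made-up predictions for three games
mae <- mean_abs_margin_error(c(12, -5, 30), c(20, -15, 10))
ll  <- log_loss(c(0.7, 0.4, 0.55), c(1, 0, 1))
```

One would compute these on held-out games for each candidate set of parameters and keep the set with the lowest cost.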

Do predictions

Now that we’ve got our ELO model and are happy with our parameters, we can do some predictions! For this, we just need to pass our fixture to the predict function along with our ELO model. The elo package takes care of the rest.
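A minimal sketch of the prediction step, again on made-up data: the toy model, the `fixture_toy` table and the `Prob` column name are assumptions for illustration. The new data needs the same team columns that appear in the model formula.

```r
library(elo)

# Hypothetical margin mapping and toy history (made-up margins)
map_margin <- function(margin) pmin(pmax((margin + 80) / 160, 0), 1)

games <- data.frame(
  Home.Team = c("A", "B", "A", "B"),
  Away.Team = c("B", "A", "B", "A"),
  margin    = c(10, -25, 40, 0),
  stringsAsFactors = FALSE
)

elo_toy <- elo.run(
  map_margin(margin) ~ adjust(Home.Team, 30) + Away.Team,
  k = 20,
  data = games
)

# Upcoming games between teams the model has already seen
fixture_toy <- data.frame(
  Home.Team = c("A", "B"),
  Away.Team = c("B", "A"),
  stringsAsFactors = FALSE
)

# Predicted score for the home team in each upcoming game
fixture_toy$Prob <- predict(elo_toy, newdata = fixture_toy)
fixture_toy
```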

From here, you could turn these probabilities back into a margin through another mapping function. Again, I’ll leave that for the reader to decide.
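As one possible starting point, the simplest choice is to invert a linear margin mapping over the same -80 to 80 range. This is only a sketch under that assumption; the function name is hypothetical and other mappings may suit the data better.

```r
# Inverse of a linear margin mapping: probability 0..1 back to margin -80..80.
# Function and argument names are hypothetical.
map_prob_to_margin <- function(prob, marg_max = 80, marg_min = -80) {
  prob * (marg_max - marg_min) + marg_min
}

map_prob_to_margin(c(0, 0.5, 0.75, 1))
#> [1] -80   0  40  80
```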

Looking forward to seeing all the new models utilising the power of fitzRoy.