vignettes/elo-ratings-example.Rmd
A common use of fitzRoy is building a simple ELO rating system. These models are popular among the tippers who are part of The Squiggle and are becoming common in other team sports. This vignette is a minimal working example to get you started on building an ELO model from scratch, using fitzRoy to get the data and the elo package to do the modelling.
First we need to grab a few packages. If you don’t have any of these, you’ll need to install them.
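The package list below is inferred from the functions called later in this vignette (`filter` and `mutate` from dplyr, `ymd` from lubridate, `elo.run` from elo), so treat it as a sketch and not a definitive list. It reports anything that is not installed and attaches the rest.

```r
# Packages inferred from the calls used later in this vignette (an assumption)
pkgs <- c("fitzRoy", "dplyr", "elo", "lubridate")

# Work out which of them still need install.packages()
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]
if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
}

# Attach whatever is available
for (p in setdiff(pkgs, not_installed)) {
  library(p, character.only = TRUE)
}
```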
Our first job is to get the relevant data. For the most basic ELO model, we need the results of past matches, including the home and away teams and the score of each match. To make predictions, we also need upcoming matches. We can get both using fitzRoy.
# Get data
results <- fitzRoy::get_match_results()
fixture <- fitzRoy::get_fixture(2019)
results <- results %>% filter(Date < "2019-01-01")
tail(results)
#> # A tibble: 6 x 16
#> Game Date Round Home.Team Home.Goals Home.Behinds Home.Points Away.Team
#> <dbl> <date> <chr> <chr> <int> <int> <int> <chr>
#> 1 15402 2018-09-08 QF West Coa… 12 14 86 Collingw…
#> 2 15403 2018-09-14 SF Melbourne 16 8 104 Hawthorn
#> 3 15404 2018-09-15 SF Collingw… 9 15 69 GWS
#> 4 15405 2018-09-21 PF Collingw… 15 7 97 Richmond
#> 5 15406 2018-09-22 PF West Coa… 18 13 121 Melbourne
#> 6 15407 2018-09-29 GF West Coa… 11 13 79 Collingw…
#> # … with 8 more variables: Away.Goals <int>, Away.Behinds <int>,
#> # Away.Points <int>, Venue <chr>, Margin <int>, Season <dbl>,
#> # Round.Type <chr>, Round.Number <int>
fixture <- fixture %>% filter(Date > "2019-01-01")
head(fixture)
#> # A tibble: 6 x 7
#> Date Season Season.Game Round Home.Team Away.Team Venue
#> <dttm> <dbl> <int> <dbl> <chr> <chr> <chr>
#> 1 2019-03-21 19:25:00 2019 1 1 Carlton Richmond M.C.G.
#> 2 2019-03-22 19:50:00 2019 1 1 Collingwood Geelong M.C.G.
#> 3 2019-03-23 13:45:00 2019 1 1 Melbourne Port Adela… M.C.G.
#> 4 2019-03-23 16:05:00 2019 1 1 Adelaide Hawthorn Adelaid…
#> 5 2019-03-23 19:20:00 2019 1 1 Brisbane Li… West Coast Gabba
#> 6 2019-03-23 19:25:00 2019 1 1 Footscray Sydney Docklan…
Before we create our model, we need some data preparation. The elo package needs a way to tell which matches belong to the same round, so we combine Season and Round.Number into a string that, together with the team name, uniquely identifies each match. We also need a way to tell it when a new season is starting, so we create a logical field that indicates whether the game is a team's first game of that season.
results <- results %>%
  mutate(seas_rnd = paste0(Season, ".", Round.Number),
         First.Game = ifelse(Round.Number == 1, TRUE, FALSE))
For the fixture data, we need to ensure the dates are in the same format as results (note - this should probably be done internally in fitzRoy - see #58). For now, we can do it manually.
fixture <- fixture %>%
  filter(Date > max(results$Date)) %>%
  mutate(Date = ymd(format(Date, "%Y-%m-%d"))) %>%
  rename(Round.Number = `Round`)
head(fixture)
#> # A tibble: 6 x 7
#> Date Season Season.Game Round.Number Home.Team Away.Team Venue
#> <date> <dbl> <int> <dbl> <chr> <chr> <chr>
#> 1 2019-03-21 2019 1 1 Carlton Richmond M.C.G.
#> 2 2019-03-22 2019 1 1 Collingwood Geelong M.C.G.
#> 3 2019-03-23 2019 1 1 Melbourne Port Adelai… M.C.G.
#> 4 2019-03-23 2019 1 1 Adelaide Hawthorn Adelaide…
#> 5 2019-03-23 2019 1 1 Brisbane Li… West Coast Gabba
#> 6 2019-03-23 2019 1 1 Footscray Sydney Docklands
There is a range of parameters that we can tweak and include in an ELO model. Here we set some basic parameters. You can read a bit more on the PlusSixOne blog, which uses a similar method. For further reading, I strongly recommend checking out Matter of Stats or The Arc for great explainers on the types of parameters that could be included.
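The parameter values are not shown in the text here, so the block below is only a sketch. The names `HGA`, `carryOver` and `k_val` are the ones used by the `elo.run` call in this vignette. The numbers are assumptions: `HGA = 30` and `k_val = 20` are roughly consistent with the output printed in this vignette, and `carryOver = 0.5` is purely illustrative.

```r
# Set parameters (illustrative values, assumed and not confirmed by the text)
HGA       <- 30   # home ground advantage, in rating points
carryOver <- 0.5  # passed to regress(): how far ratings move toward 1500 each season
k_val     <- 20   # update weighting factor (the Elo "k")
```

Larger values of `k_val` make ratings react faster to recent results. Larger values of `HGA` favour the home side more heavily in every prediction.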
The original ELO models in chess use values of 0 for a loss, 1 for a win and 0.5 for a draw. Since we are adapting these for AFL and we want to use the margin rather than a binary outcome, we need to map our margin to a score between 0 and 1. You can do this in many varied and complex ways, but for now, I just normalise everything based on a margin of -80 to 80. Anything outside this range is clipped to 0 or 1.
We create that as a function and then use that function in our elo model.
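A minimal sketch of such a function, written in base R, is shown here. The name `map_margin_to_outcome` matches the call used in the model formula in this vignette. The argument names `marg.max` and `marg.min` are assumptions made for illustration.

```r
# Map a points margin onto [0, 1] by linear scaling between marg.min and marg.max
map_margin_to_outcome <- function(margin, marg.max = 80, marg.min = -80) {
  norm <- (margin - marg.min) / (marg.max - marg.min)
  # Clip anything beyond the chosen range to 0 or 1
  pmax(pmin(norm, 1), 0)
}

# A draw, a 16-point win, and blowouts in each direction
map_margin_to_outcome(c(0, 16, 120, -120))
```

Under these assumed limits a 16-point home win maps to 0.6 and a draw to 0.5, so the draw value agrees with the chess convention.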
Now we are ready to create our ELO ratings! We can use the elo.run function from the elo package for this. I won’t explain everything about what is going on here - you can read all about it at the package vignette - but in general, we provide a formula that describes what is included in our model, as well as some model parameters.
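To make the mechanics concrete, a single rating update can be sketched by hand. This is an illustration of the standard Elo update and not a copy of the elo package's internals. The ratings and outcome are read from the 2018 output tables later in this vignette, and `HGA = 30` and `k_val = 20` are assumed values.

```r
# Ratings going into the 2018 qualifying final (game 15402),
# read from the as.matrix output later in this vignette
elo_home <- 1533.481  # West Coast
elo_away <- 1535.391  # Collingwood

# Assumed parameter values (not stated in the text)
HGA   <- 30  # home ground advantage, in rating points
k_val <- 20  # update weighting factor

# Expected result for the home side under the standard Elo logistic curve
rating_diff <- (elo_home + HGA) - elo_away
p_home <- 1 / (1 + 10^(-rating_diff / 400))

# Mapped match outcome for the home side (wins.A in the results table)
outcome <- 0.6

# The home side gains this many points and the away side loses the same amount
update <- k_val * (outcome - p_home)

c(p_home = p_home, update = update)
```

With these inputs `p_home` comes out near 0.540 and `update` near 1.19, in line with the `p.A` and `update.A` columns reported for that game.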
# Run ELO
elo.data <- elo.run(
  map_margin_to_outcome(Home.Points - Away.Points) ~
    adjust(Home.Team, HGA) +
    Away.Team +
    group(seas_rnd) +
    regress(First.Game, 1500, carryOver),
  k = k_val,
  data = results
)
Now that the model has run, we can view our results. The elo package provides various ways to do this.
Firstly, using as.data.frame we can view the predicted and actual result of each game. Also in this table is the change in ELO rating for the home and away side. See below for the last few games of 2018.
as.data.frame(elo.data) %>% tail()
#> team.A team.B p.A wins.A update.A update.B
#> 15402 West Coast Collingwood 0.5403370 0.60000 1.19325978 -1.19325978
#> 15403 Melbourne Hawthorn 0.5781956 0.70625 2.56108839 -2.56108839
#> 15404 Collingwood GWS 0.5673819 0.56250 -0.09763852 0.09763852
#> 15405 Collingwood Richmond 0.5166371 0.74375 4.54225889 -4.54225889
#> 15406 West Coast Melbourne 0.5179440 0.91250 7.89112069 -7.89112069
#> 15407 West Coast Collingwood 0.5486647 0.53125 -0.34829414 0.34829414
#> elo.A elo.B
#> 15402 1534.675 1534.198
#> 15403 1552.201 1522.293
#> 15404 1534.100 1517.187
#> 15405 1538.643 1547.993
#> 15406 1542.566 1544.309
#> 15407 1542.218 1538.991
We can focus specifically on how each team’s rating changes over time using as.matrix. Viewing the end of 2018 again, teams that didn’t make the finals keep the same ELO rating from row to row, since they are no longer playing.
as.matrix(elo.data) %>% tail()
#> Adelaide Brisbane Lions Carlton Collingwood Essendon Fitzroy Footscray
#> [2776,] 1498.394 1486.347 1425.744 1535.573 1511.301 1500 1463.709
#> [2777,] 1507.172 1483.508 1416.965 1535.391 1514.869 1500 1466.641
#> [2778,] 1507.172 1483.508 1416.965 1534.198 1514.869 1500 1466.641
#> [2779,] 1507.172 1483.508 1416.965 1534.100 1514.869 1500 1466.641
#> [2780,] 1507.172 1483.508 1416.965 1538.643 1514.869 1500 1466.641
#> [2781,] 1507.172 1483.508 1416.965 1538.991 1514.869 1500 1466.641
#> Fremantle Geelong Gold Coast GWS Hawthorn Melbourne
#> [2776,] 1459.911 1533.937 1427.196 1515.733 1525.492 1543.083
#> [2777,] 1460.093 1540.193 1420.941 1511.744 1527.217 1547.072
#> [2778,] 1460.093 1537.625 1420.941 1517.090 1524.854 1549.640
#> [2779,] 1460.093 1537.625 1420.941 1517.187 1522.293 1552.201
#> [2780,] 1460.093 1537.625 1420.941 1517.187 1522.293 1544.309
#> [2781,] 1460.093 1537.625 1420.941 1517.187 1522.293 1544.309
#> North Melbourne Port Adelaide Richmond St Kilda Sydney University
#> [2776,] 1509.214 1509.768 1553.105 1454.495 1516.356 1500
#> [2777,] 1511.378 1506.200 1550.173 1452.330 1514.631 1500
#> [2778,] 1511.378 1506.200 1552.536 1452.330 1509.285 1500
#> [2779,] 1511.378 1506.200 1552.536 1452.330 1509.285 1500
#> [2780,] 1511.378 1506.200 1547.993 1452.330 1509.285 1500
#> [2781,] 1511.378 1506.200 1547.993 1452.330 1509.285 1500
#> West Coast
#> [2776,] 1530.643
#> [2777,] 1533.481
#> [2778,] 1534.675
#> [2779,] 1534.675
#> [2780,] 1542.566
#> [2781,] 1542.218
Lastly, we can check the final ELO ratings of each team at the end of our data (here, up to the end of 2018) using final.elos.
final.elos(elo.data)
#> Adelaide Brisbane Lions Carlton Collingwood Essendon
#> 1507.172 1483.508 1416.965 1538.991 1514.869
#> Fitzroy Footscray Fremantle Geelong Gold Coast
#> 1380.902 1466.641 1460.093 1537.625 1420.941
#> GWS Hawthorn Melbourne North Melbourne Port Adelaide
#> 1517.187 1522.293 1544.309 1511.378 1506.200
#> Richmond St Kilda Sydney University West Coast
#> 1547.993 1452.330 1509.285 1412.936 1542.218
We could keep tweaking our parameters until we are happy. Ideally we’d have a training and test set and optimise these values against some kind of cost function, such as a log likelihood, a mean absolute margin or something similar. I’ll leave that as beyond the scope of this vignette, though, and assume we are happy with these parameters.
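For readers who want a starting point, here is a rough sketch of what such a cost function could look like. The helper `score_predictions` is hypothetical and not part of fitzRoy or elo. The inputs are the `p.A` and `wins.A` columns from the six finals shown earlier, which is far too small a sample to tune anything on.

```r
# Hypothetical helper: compare predicted probabilities with mapped outcomes
score_predictions <- function(pred, actual) {
  eps <- 1e-15
  p <- pmin(pmax(pred, eps), 1 - eps)  # keep log() away from 0 and 1
  c(
    mae      = mean(abs(actual - pred)),
    log_loss = -mean(actual * log(p) + (1 - actual) * log(1 - p))
  )
}

# p.A and wins.A for the six 2018 finals listed earlier in this vignette
pred   <- c(0.5403370, 0.5781956, 0.5673819, 0.5166371, 0.5179440, 0.5486647)
actual <- c(0.60000, 0.70625, 0.56250, 0.74375, 0.91250, 0.53125)

score_predictions(pred, actual)
```

In practice you would compute these scores on held-out seasons and pick the parameter values that minimise them.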
Now that we’ve got our ELO model and are happy with our parameters, we can do some predictions! For this, we just need our fixture and the predict function, with our ELO model as an input. The elo package takes care of the rest.
fixture <- fixture %>%
mutate(Prob = predict(elo.data, newdata = fixture))
head(fixture)
#> # A tibble: 6 x 8
#> Date Season Season.Game Round.Number Home.Team Away.Team Venue Prob
#> <date> <dbl> <int> <dbl> <chr> <chr> <chr> <dbl>
#> 1 2019-03-21 2019 1 1 Carlton Richmond M.C.G. 0.359
#> 2 2019-03-22 2019 1 1 Collingwood Geelong M.C.G. 0.545
#> 3 2019-03-23 2019 1 1 Melbourne Port Adel… M.C.G. 0.597
#> 4 2019-03-23 2019 1 1 Adelaide Hawthorn Adela… 0.521
#> 5 2019-03-23 2019 1 1 Brisbane L… West Coast Gabba 0.459
#> 6 2019-03-23 2019 1 1 Footscray Sydney Dockl… 0.482
From here, you could turn these probabilities back into a margin through another mapping function. Again, I’ll leave that for the reader to decide.
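One simple option, shown purely as a sketch, is to invert the linear -80 to 80 scaling used for the outcomes. The function `map_outcome_to_margin` and its argument names are hypothetical. Other mappings may well fit the data better.

```r
# Hypothetical inverse of the linear margin scaling: probability -> points margin
map_outcome_to_margin <- function(prob, marg.max = 80, marg.min = -80) {
  prob * (marg.max - marg.min) + marg.min
}

# Home-side probabilities for the first three games in the fixture table above
map_outcome_to_margin(c(0.359, 0.545, 0.597))
```

A probability of 0.5 maps back to a margin of 0, and the 0.359 for Carlton against Richmond maps to roughly a 23-point loss for the home side.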
Looking forward to seeing all the new models utilising the power of fitzRoy.