The goal of fitzRoy is to make it easy to access data from the AFLM and AFLW competitions. It provides a simple and consistent API to access data such as match results, fixtures and player statistics from multiple data sources.

Fetching Data

fitzRoy is primarily used to access data from various sources through the fetch_ family of functions. For details on how the API works, see the Main Fetch Functions vignette.

Data Sources

fitzRoy draws on five main data sources. Where possible, we do not alter the data we receive, although in some cases we need to aggregate or calculate certain fields because of how the source site is structured.

You can choose your data source as an argument to any fetch_ function using the source = argument.
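
As a rough sketch of how the source argument might be used to compare providers, the hypothetical helper below (fetch_from_sources is not part of fitzRoy) calls one fetch_ function once per source and collects the results in a named list. It assumes the fetch_ function accepts season, round_number and source arguments, and that the source strings match those used elsewhere in this document, such as "AFL" and "footywire".

```r
# Hypothetical helper, not part of fitzRoy: fetch the same round from
# several sources and return the results as a named list.
# Assumption: fetch_fn takes season, round_number and source arguments.
fetch_from_sources <- function(season, round_number, sources,
                               fetch_fn = fitzRoy::fetch_results) {
  out <- lapply(sources, function(src) {
    fetch_fn(season, round_number = round_number, source = src)
  })
  names(out) <- sources
  out
}

# Example call (needs fitzRoy installed and an internet connection):
# by_source <- fetch_from_sources(2020, 1, c("AFL", "footywire"))
# names(by_source)
```

Keeping each source in its own list element makes it easier to inspect column names side by side before trying to combine anything.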

AFL website

We provide data from the [AFL website](https://www.afl.com.au/) as the default for any fetch_ function. This data comes from the official AFL data provider and covers both the Men's and Women's competitions. The oldest data is from 2012. It provides access to all data types, including results, fixtures, ladders, lineups and stats.

AFL Tables

AFL Tables has historically been the main source of data in fitzRoy. It is the most complete source of data about AFL that exists (to our knowledge at least!). It contains data from 1897 and is the only data source included in fitzRoy with such historical data. The types of data it contains are results, ladders and stats.

Footywire

Footywire has traditionally been the main source of player statistics in fitzRoy. It contains data dating back to 2012 and was generally used as a supplement to AFL Tables data. The types of data it returns are results, fixtures and statistics.

Squiggle

Squiggle is a well-known AFL prediction and analysis website run by Max Barry. In recent years, Squiggle has become the main place to aggregate various predictive models. Max has provided a well-documented API that fitzRoy uses to return data. Helper functions in the fetch_ family will return results, fixtures and ladders, while the fetch_squiggle_data function provides direct access to the API. Read the Squiggle API vignette for more details.

Fryzigg

Twitter user Fryzigg has provided access to some advanced player statistics. These are included in the fetch_player_stats function. Read the Fryzigg API vignette for more information.

Good practices

In most cases, it is best to use the same source throughout an analysis. This is not always possible: some sources only go back so far (the AFL website only has data back to 2011), and some data is not available from every source (AFL Tables doesn’t have decent fixture data). If you are mixing sources, take care to understand differences in naming conventions, team names and player names.
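
To illustrate the team-name issue, here is a minimal sketch of harmonising names before combining sources. The lookup is illustrative only: it covers a few names that appear in the example output in this document ("Geelong Cats" and "Sydney Swans" from the AFL website versus "Geelong" and "Sydney" from AFL Tables, and "Kangaroos" versus "North Melbourne"). The helper harmonise_team is hypothetical, and a real analysis would need a complete mapping.

```r
# Illustrative lookup only: AFL website style names -> AFL Tables style names.
team_lookup <- c(
  "Geelong Cats" = "Geelong",
  "Sydney Swans" = "Sydney",
  "Kangaroos"    = "North Melbourne"
)

# Hypothetical helper: replace names found in the lookup, keep the rest as-is.
harmonise_team <- function(team, lookup = team_lookup) {
  matched <- lookup[team]  # NA where the name is not in the lookup
  unname(ifelse(is.na(matched), team, matched))
}

harmonise_team(c("Geelong Cats", "Carlton", "Kangaroos"))
```

Names that are missing from the lookup pass through unchanged, so it is worth checking the unique team names in each source after harmonising.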

It is also a good idea to avoid regularly fetching whole datasets. Where possible, keep an offline version of your data and request only the smallest amount needed to get the new data you require. This is faster (less data transferred over your Internet connection and less data held in your computer's memory) and also helps to reduce traffic on the data providers' servers.
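
One way this could look in practice is sketched below. The helper update_cache is hypothetical and not part of fitzRoy. It assumes the data has a Season column (as the AFL Tables results shown in this document do), that the cache is a single RDS file, and that fetch_fn is any function that takes a season and returns a data frame.

```r
# Hypothetical helper: refresh one season in a local RDS cache.
# Assumptions: the data has a `Season` column and fetch_fn(season)
# returns a data frame with the same columns as the cached data.
update_cache <- function(cache_path, season, fetch_fn) {
  new_rows <- fetch_fn(season)
  if (file.exists(cache_path)) {
    cached <- readRDS(cache_path)
    # Drop cached rows for this season so re-running does not duplicate them.
    cached <- cached[cached$Season != season, , drop = FALSE]
    combined <- rbind(cached, new_rows)
  } else {
    combined <- new_rows
  }
  saveRDS(combined, cache_path)
  combined
}

# Example call (needs fitzRoy installed and an internet connection):
# results <- update_cache("results.rds", 2021, fitzRoy::fetch_results_afltables)
```

Only the requested season is fetched on each run, and everything else is read from disk.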

Examples

Fixture

Fixture data is available from multiple places. The most reliable and complete data usually comes from the AFL website. From that website you can specify either the Men's or Women's competition using the comp argument.

fixture <- fetch_fixture(2021, comp = "AFLW")
fixture %>%
  select(utcStartTime, round.name, 
         home.team.name, away.team.name, venue.name)
#> # A tibble: 63 x 5
#>    utcStartTime      round.name home.team.name  away.team.name  venue.name      
#>    <chr>             <chr>      <chr>           <chr>           <chr>           
#>  1 2021-01-28T08:15… Round 1    Carlton         Collingwood     Ikon Park       
#>  2 2021-01-29T08:10… Round 1    St Kilda        Western Bulldo… RSEA Park       
#>  3 2021-01-30T04:10… Round 1    Gold Coast Suns Melbourne       Metricon Stadium
#>  4 2021-01-30T06:10… Round 1    West Coast Eag… Adelaide Crows  Mineral Resourc…
#>  5 2021-01-31T01:10… Round 1    Geelong Cats    Kangaroos       GMHBA Stadium   
#>  6 2021-01-31T03:10… Round 1    Richmond        Brisbane Lions  Swinburne Centre
#>  7 2021-01-31T05:10… Round 1    Fremantle       GWS Giants      Fremantle Oval  
#>  8 2021-02-05T08:45… Round 2    Western Bulldo… Carlton         Victoria Univer…
#>  9 2021-02-06T04:10… Round 2    Collingwood     Geelong Cats    Victoria Park   
#> 10 2021-02-06T06:10… Round 2    Melbourne       Richmond        Casey Fields    
#> # … with 53 more rows

You can also return just a single round.

fetch_fixture(2021, round_number = 5, comp = "AFLM") %>%
  select(utcStartTime, round.name, 
         home.team.name, away.team.name, venue.name)
#> # A tibble: 9 x 5
#>   utcStartTime           round.name home.team.name   away.team.name venue.name  
#>   <chr>                  <chr>      <chr>            <chr>          <chr>       
#> 1 2021-04-15T09:20:00.0… Round 5    St Kilda         Richmond       Marvel Stad…
#> 2 2021-04-16T10:10:00.0… Round 5    West Coast Eagl… Collingwood    Optus Stadi…
#> 3 2021-04-17T03:45:00.0… Round 5    Western Bulldogs Gold Coast Su… Marvel Stad…
#> 4 2021-04-17T06:35:00.0… Round 5    Sydney Swans     GWS Giants     SCG         
#> 5 2021-04-17T09:25:00.0… Round 5    Carlton          Port Adelaide  MCG         
#> 6 2021-04-17T09:25:00.0… Round 5    Brisbane Lions   Essendon       Gabba       
#> 7 2021-04-18T03:10:00.0… Round 5    Adelaide Crows   Fremantle      Adelaide Ov…
#> 8 2021-04-18T05:20:00.0… Round 5    Hawthorn         Melbourne      MCG         
#> 9 2021-04-18T06:40:00.0… Round 5    Geelong Cats     North Melbour… GMHBA Stadi…
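
The utcStartTime column is returned as text in UTC. As a minimal sketch, it can be parsed and displayed in local time with base R. The two values below are copied from the visible start of the column above, and the choice of the "Australia/Melbourne" time zone is an assumption made for illustration.

```r
# Values copied from the visible part of utcStartTime above. The format
# string stops at the seconds; trailing characters in real values are ignored.
utc_start <- c("2021-04-15T09:20:00", "2021-04-17T03:45:00")

# Parse as UTC, then display in Melbourne local time (assumed time zone).
start_time <- as.POSIXct(utc_start, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
local_label <- format(start_time, "%Y-%m-%d %H:%M", tz = "Australia/Melbourne")
local_label
```

Matches played in other states would need their own time zone if local start times matter.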

You can get fixture data from other sources including Squiggle and Footywire. The default source for fetch_fixture() is the AFL.com.au website.

fixture_afl <- fetch_fixture(2020)
fixture_aflw <- fetch_fixture(2020, round_number = 1, comp = "AFLW")
fixture_squiggle <- fetch_fixture_squiggle(2020, round_number = 10)
fixture_footywire <- fetch_fixture_footywire(2018)

Lineup

You can get the lineup for a particular round. This is most useful after the teams have been announced but before the match has been played.

The only data source with lineup data is the AFL.com.au website.

fetch_lineup(2021, round_number = 1, comp = "AFLW") %>%
  select(round.name, status, teamName, 
         player.playerName.givenName,
         player.playerName.surname, teamStatus)
#> # A tibble: 294 x 6
#>    round.name status  teamName player.playerName.… player.playerName… teamStatus
#>    <chr>      <chr>   <chr>    <chr>               <chr>              <chr>     
#>  1 Round 1    CONCLU… Carlton  Brooke              Vernon             FINAL_TEAM
#>  2 Round 1    CONCLU… Carlton  Natalie             Plane              FINAL_TEAM
#>  3 Round 1    CONCLU… Carlton  Vaomua              Laloifi            FINAL_TEAM
#>  4 Round 1    CONCLU… Carlton  Charlotte           Wilson             FINAL_TEAM
#>  5 Round 1    CONCLU… Carlton  Kerryn              Harrington         FINAL_TEAM
#>  6 Round 1    CONCLU… Carlton  Lauren              Brazzale           FINAL_TEAM
#>  7 Round 1    CONCLU… Carlton  Elise               O'Dea              FINAL_TEAM
#>  8 Round 1    CONCLU… Carlton  Katie               Loynes             FINAL_TEAM
#>  9 Round 1    CONCLU… Carlton  Abbie               McKay              FINAL_TEAM
#> 10 Round 1    CONCLU… Carlton  Tayla               Harris             FINAL_TEAM
#> # … with 284 more rows

Results

You can access AFL match results data from various sources. The most complete is the AFL Tables data, which includes all matches from 1897 to the present.

results <- fetch_results_afltables(1897:2019)
results
#> # A tibble: 15,614 x 16
#>     Game Date       Round Home.Team   Home.Goals Home.Behinds Home.Points
#>    <dbl> <date>     <chr> <chr>            <int>        <int>       <int>
#>  1     1 1897-05-08 R1    Fitzroy              6           13          49
#>  2     2 1897-05-08 R1    Collingwood          5           11          41
#>  3     3 1897-05-08 R1    Geelong              3            6          24
#>  4     4 1897-05-08 R1    Sydney               3            9          27
#>  5     5 1897-05-15 R2    Sydney               6            4          40
#>  6     6 1897-05-15 R2    Essendon             4            6          30
#>  7     7 1897-05-15 R2    St Kilda             3            8          26
#>  8     8 1897-05-15 R2    Melbourne            9           10          64
#>  9     9 1897-05-22 R3    Collingwood          6            5          41
#> 10    10 1897-05-22 R3    Fitzroy              5            9          39
#> # … with 15,604 more rows, and 9 more variables: Away.Team <chr>,
#> #   Away.Goals <int>, Away.Behinds <int>, Away.Points <int>, Venue <chr>,
#> #   Margin <int>, Season <dbl>, Round.Type <chr>, Round.Number <int>

While it is possible to return all historical data, it is usually good practice to return only a small amount of data, such as a single season or round, and keep your own offline database of historical data.

results_new <- fetch_results_afltables(2021)
bind_rows(results, results_new)
#> # A tibble: 15,776 x 16
#>     Game Date       Round Home.Team   Home.Goals Home.Behinds Home.Points
#>    <dbl> <date>     <chr> <chr>            <int>        <int>       <int>
#>  1     1 1897-05-08 R1    Fitzroy              6           13          49
#>  2     2 1897-05-08 R1    Collingwood          5           11          41
#>  3     3 1897-05-08 R1    Geelong              3            6          24
#>  4     4 1897-05-08 R1    Sydney               3            9          27
#>  5     5 1897-05-15 R2    Sydney               6            4          40
#>  6     6 1897-05-15 R2    Essendon             4            6          30
#>  7     7 1897-05-15 R2    St Kilda             3            8          26
#>  8     8 1897-05-15 R2    Melbourne            9           10          64
#>  9     9 1897-05-22 R3    Collingwood          6            5          41
#> 10    10 1897-05-22 R3    Fitzroy              5            9          39
#> # … with 15,766 more rows, and 9 more variables: Away.Team <chr>,
#> #   Away.Goals <int>, Away.Behinds <int>, Away.Points <int>, Venue <chr>,
#> #   Margin <int>, Season <dbl>, Round.Type <chr>, Round.Number <int>

You can get results data from other sources including AFL, Squiggle and Footywire. The default source for fetch_results() is the AFL.com.au website.

results_afl <- fetch_results(2020, round_number = 11)
results_aflw <- fetch_results(2020, comp = "AFLW")
results_squiggle <- fetch_results_squiggle(2019, round_number = 1)
results_footywire <- fetch_results_footywire(1990)

You can get AFLW results by using the comp argument.

fetch_results(2020, comp = "AFLW") %>%
  select(match.name, venue.name, round.name,
         homeTeamScore.matchScore.totalScore,
         awayTeamScore.matchScore.totalScore)
#> # A tibble: 46 x 5
#>    match.name    venue.name     round.name homeTeamScore.mat… awayTeamScore.mat…
#>    <chr>         <chr>          <chr>                   <int>              <int>
#>  1 Richmond Vs … Ikon Park      Round 1                    14                 48
#>  2 GWS Giants V… Blacktown Int… Round 1                     9                  8
#>  3 Melbourne Vs… Casey Fields   Round 1                    22                 20
#>  4 Brisbane Lio… Hickey Park    Round 1                    34                 21
#>  5 Collingwood … Victoria Park  Round 1                    38                 11
#>  6 St Kilda Vs … RSEA Park      Round 1                    14                 39
#>  7 Fremantle Vs… Fremantle Oval Round 1                    44                 28
#>  8 Western Bull… Victoria Univ… Round 2                    12                 32
#>  9 Kangaroos Vs… University of… Round 2                    37                 19
#> 10 Gold Coast S… Metricon Stad… Round 2                    33                 22
#> # … with 36 more rows
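
As a small worked example, the two score columns can be turned into a margin and a result label. The data frame below is a stand-in built from the first three rows of scores shown above rather than a live fetch, and the margin and result columns are names chosen for illustration.

```r
# Stand-in data: scores copied from the first three rows of the output above.
scores <- data.frame(
  homeTeamScore.matchScore.totalScore = c(14L, 9L, 22L),
  awayTeamScore.matchScore.totalScore = c(48L, 8L, 20L)
)

# Margin from the home team's point of view.
scores$margin <- scores$homeTeamScore.matchScore.totalScore -
  scores$awayTeamScore.matchScore.totalScore

# Label each match; a margin of zero is a draw.
scores$result <- ifelse(scores$margin > 0, "home win",
                        ifelse(scores$margin < 0, "away win", "draw"))
scores[, c("margin", "result")]
```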

Ladder

The ladder for a particular round can be returned using fetch_ladder. It usually only makes sense to return one round at a time, although it is possible to return multiple rounds.

ladder <- fetch_ladder(2020, round_number = 7, comp = "AFLW") %>%
  select(season, round_name, position, 
         team.name, pointsFor, pointsAgainst, form)
ladder
#> # A tibble: 14 x 7
#>    season round_name  position team.name         pointsFor pointsAgainst form  
#>     <dbl> <chr>          <int> <chr>                 <int>         <int> <chr> 
#>  1   2020 Semi Finals        1 Kangaroos               309           136 LWWWWW
#>  2   2020 Semi Finals        2 GWS Giants              175           142 WLWLWW
#>  3   2020 Semi Finals        3 Brisbane Lions          198           185 WWDWLL
#>  4   2020 Semi Finals        4 Gold Coast Suns         154           152 LWDLLW
#>  5   2020 Semi Finals        5 Geelong Cats            211           261 LLLWWL
#>  6   2020 Semi Finals        6 Adelaide Crows          180           224 LWWLLL
#>  7   2020 Semi Finals        7 Richmond                115           322 LLLLLL
#>  8   2020 Semi Finals        1 Fremantle               277           179 WWWWWW
#>  9   2020 Semi Finals        2 Carlton                 249           164 WLWWWW
#> 10   2020 Semi Finals        3 Melbourne               204           124 WWLWWL
#> 11   2020 Semi Finals        4 Collingwood             229           149 WWLLWW
#> 12   2020 Semi Finals        5 St Kilda                154           170 LLWLLW
#> 13   2020 Semi Finals        6 Western Bulldogs        179           246 WLLLLL
#> 14   2020 Semi Finals        7 West Coast Eagles        85           265 LLLWLL
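
As a worked example of using these columns, the AFL ladder percentage is points for divided by points against, multiplied by 100. The sketch below uses three rows copied from the output above as stand-in data. The percentage column name is chosen for illustration, and the full ladder may already contain an equivalent field.

```r
# Stand-in data: three rows copied from the ladder output above.
ladder_small <- data.frame(
  team.name     = c("Kangaroos", "Fremantle", "Carlton"),
  pointsFor     = c(309L, 277L, 249L),
  pointsAgainst = c(136L, 179L, 164L)
)

# Percentage = points for / points against * 100, rounded to one decimal.
ladder_small$percentage <- round(
  100 * ladder_small$pointsFor / ladder_small$pointsAgainst, 1
)
ladder_small
```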

There are many variables included in the AFL.com.au ladder.

ladder <- fetch_ladder(2020, round_number = 7, comp = "AFLW")
ncol(ladder)
#> [1] 86

You can get ladder data from other sources including Squiggle and AFL Tables. The default source for fetch_ladder() is the AFL.com.au website.

ladder_afl <- fetch_ladder(2020, round_number = 11)
ladder_aflw <- fetch_ladder(2020, comp = "AFLW")
ladder_squiggle <- fetch_ladder_squiggle(2019, round_number = 1)
ladder_afltables <- fetch_ladder_afltables(1990)

Stats

We can return player statistics for a set of matches. The exact stats that are included vary quite a bit between data sources.

The default is again the AFL.com.au website, which is fairly comprehensive.

fetch_player_stats(2020, comp = "AFLW")
#> # A tibble: 1,932 x 67
#>    providerId  utcStartTime  status compSeason.shor… round.name round.roundNumb…
#>    <chr>       <chr>         <chr>  <chr>            <chr>                 <int>
#>  1 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  2 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  3 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  4 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  5 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  6 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  7 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  8 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#>  9 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#> 10 CD_M202026… 2020-02-07T0… CONCL… 2020 AFL Womens  Round 1                   1
#> # … with 1,922 more rows, and 61 more variables: venue.name <chr>,
#> #   home.team.club.name <chr>, away.team.club.name <chr>,
#> #   player.jumperNumber <int>, player.photoURL <chr>,
#> #   player.player.position <chr>, player.player.player.playerId <chr>,
#> #   player.player.player.captain <lgl>,
#> #   player.player.player.playerJumperNumber <int>,
#> #   player.player.player.givenName <chr>, player.player.player.surname <chr>,
#> #   teamId <chr>, gamesPlayed <lgl>, timeOnGroundPercentage <dbl>, goals <dbl>,
#> #   behinds <dbl>, superGoals <lgl>, kicks <dbl>, handballs <dbl>,
#> #   disposals <dbl>, marks <dbl>, bounces <dbl>, tackles <dbl>,
#> #   contestedPossessions <dbl>, uncontestedPossessions <dbl>,
#> #   totalPossessions <dbl>, inside50s <dbl>, marksInside50 <dbl>,
#> #   contestedMarks <dbl>, hitouts <dbl>, onePercenters <dbl>,
#> #   disposalEfficiency <dbl>, clangers <dbl>, freesFor <dbl>,
#> #   freesAgainst <dbl>, dreamTeamPoints <dbl>, rebound50s <dbl>,
#> #   goalAssists <dbl>, goalAccuracy <dbl>, ratingPoints <lgl>, ranking <lgl>,
#> #   lastUpdated <chr>, turnovers <dbl>, intercepts <dbl>,
#> #   tacklesInside50 <dbl>, shotsAtGoal <dbl>, goalEfficiency <lgl>,
#> #   shotEfficiency <lgl>, interchangeCounts <lgl>, scoreInvolvements <dbl>,
#> #   metresGained <dbl>, clearances.centreClearances <dbl>,
#> #   clearances.stoppageClearances <dbl>, clearances.totalClearances <dbl>,
#> #   player.playerId <chr>, player.captain <lgl>,
#> #   player.playerJumperNumber <int>, player.givenName <chr>,
#> #   player.surname <chr>, teamStatus <chr>, team.name <chr>
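
Player-level rows like these are often summarised by team. A minimal sketch using base R is shown below. The data frame is a made-up stand-in that only borrows the team.name and disposals column names from the output above, and the values are invented for illustration.

```r
# Made-up stand-in data using two column names from the output above.
stats <- data.frame(
  team.name = c("Carlton", "Carlton", "Richmond"),
  disposals = c(20, 15, 25)
)

# Sum disposals across players within each team.
team_disposals <- aggregate(disposals ~ team.name, data = stats, FUN = sum)
team_disposals
```

With real data covering several matches, adding a match identifier such as providerId to the right-hand side of the formula would give one total per team per match.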

We also have detailed player stats courtesy of Fryzigg.

fetch_player_stats(2019, source = "fryzigg")
#> # A tibble: 9,108 x 81
#>    venue_name match_id match_home_team match_away_team match_date
#>    <chr>         <int> <chr>           <chr>           <chr>     
#>  1 MCG           15408 Carlton         Richmond        2019-03-21
#>  2 MCG           15408 Carlton         Richmond        2019-03-21
#>  3 MCG           15408 Carlton         Richmond        2019-03-21
#>  4 MCG           15408 Carlton         Richmond        2019-03-21
#>  5 MCG           15408 Carlton         Richmond        2019-03-21
#>  6 MCG           15408 Carlton         Richmond        2019-03-21
#>  7 MCG           15408 Carlton         Richmond        2019-03-21
#>  8 MCG           15408 Carlton         Richmond        2019-03-21
#>  9 MCG           15408 Carlton         Richmond        2019-03-21
#> 10 MCG           15408 Carlton         Richmond        2019-03-21
#> # … with 9,098 more rows, and 76 more variables: match_local_time <chr>,
#> #   match_attendance <int>, match_round <chr>, match_home_team_goals <int>,
#> #   match_home_team_behinds <int>, match_home_team_score <int>,
#> #   match_away_team_goals <int>, match_away_team_behinds <int>,
#> #   match_away_team_score <int>, match_margin <int>, match_winner <chr>,
#> #   match_weather_temp_c <int>, match_weather_type <chr>, player_id <int>,
#> #   player_first_name <chr>, player_last_name <chr>, player_height_cm <int>,
#> #   player_weight_kg <int>, player_is_retired <lgl>, player_team <chr>,
#> #   guernsey_number <int>, kicks <int>, marks <int>, handballs <int>,
#> #   disposals <int>, effective_disposals <int>,
#> #   disposal_efficiency_percentage <int>, goals <int>, behinds <int>,
#> #   hitouts <int>, tackles <int>, rebounds <int>, inside_fifties <int>,
#> #   clearances <int>, clangers <int>, free_kicks_for <int>,
#> #   free_kicks_against <int>, brownlow_votes <int>,
#> #   contested_possessions <int>, uncontested_possessions <int>,
#> #   contested_marks <int>, marks_inside_fifty <int>, one_percenters <int>,
#> #   bounces <int>, goal_assists <int>, time_on_ground_percentage <int>,
#> #   afl_fantasy_score <int>, supercoach_score <int>, centre_clearances <int>,
#> #   stoppage_clearances <int>, score_involvements <int>, metres_gained <int>,
#> #   turnovers <int>, intercepts <int>, tackles_inside_fifty <int>,
#> #   contest_def_losses <int>, contest_def_one_on_ones <int>,
#> #   contest_off_one_on_ones <int>, contest_off_wins <int>,
#> #   def_half_pressure_acts <int>, effective_kicks <int>,
#> #   f50_ground_ball_gets <int>, ground_ball_gets <int>,
#> #   hitouts_to_advantage <int>, hitout_win_percentage <dbl>,
#> #   intercept_marks <int>, marks_on_lead <int>, pressure_acts <int>,
#> #   rating_points <dbl>, ruck_contests <int>, score_launches <int>,
#> #   shots_at_goal <int>, spoils <int>, subbed <chr>, player_position <chr>,
#> #   date <date>

Other providers include AFL Tables and Footywire.

stats_afl <- fetch_player_stats(2020, round_number = 11)
stats_aflw <- fetch_player_stats(2020, source = "AFL", comp = "AFLW")
stats_footywire <- fetch_player_stats(2019, round_number = 1, source = "footywire")
stats_afltables <- fetch_player_stats_afltables(1990)

APIs

You can see how to return data from two providers using their APIs in the respective vignettes.