nflreadr's Introduction

nflreadr

nflreadr is a minimal package for downloading data from nflverse repositories. It includes caching, optional progress updates, and data dictionaries.

Please note that nflverse data repositories have been reorganized and pushed towards the nflverse-data repo, and v1.2.0+ is the minimum version that supports this change. We encourage all users to upgrade to this version immediately.

For Python access to nflverse data, please check out nfl-data-py - maintained by Cooper Adams.

Installation

Install the stable version from CRAN with:

install.packages("nflreadr")

Install the development version from GitHub with:

install.packages("nflreadr", repos = c("https://nflverse.r-universe.dev", getOption("repos")))

# or use remotes/devtools
# install.packages("remotes")
remotes::install_github("nflverse/nflreadr")

Usage

The main functions of nflreadr are prefixed with load_.

library(nflreadr)

load_pbp(2021)
#> ── nflverse play by play data ──────────────────────────────────────────────────
#> ℹ Data updated: 2022-09-27 04:35:02 PDT
#> # A tibble: 50,712 × 372
#>    play_id game_id     old_game_id home_team away_team season_type  week posteam
#>      <dbl> <chr>       <chr>       <chr>     <chr>     <chr>       <int> <chr>  
#>  1       1 2021_01_AR… 2021091207  TEN       ARI       REG             1 <NA>   
#>  2      40 2021_01_AR… 2021091207  TEN       ARI       REG             1 TEN    
#>  3      55 2021_01_AR… 2021091207  TEN       ARI       REG             1 TEN    
#>  4      76 2021_01_AR… 2021091207  TEN       ARI       REG             1 TEN    
#>  5     100 2021_01_AR… 2021091207  TEN       ARI       REG             1 TEN    
#>  6     122 2021_01_AR… 2021091207  TEN       ARI       REG             1 TEN    
#>  7     152 2021_01_AR… 2021091207  TEN       ARI       REG             1 ARI    
#>  8     181 2021_01_AR… 2021091207  TEN       ARI       REG             1 ARI    
#>  9     218 2021_01_AR… 2021091207  TEN       ARI       REG             1 ARI    
#> 10     253 2021_01_AR… 2021091207  TEN       ARI       REG             1 ARI    
#> # ℹ 50,702 more rows
#> # ℹ 364 more variables: posteam_type <chr>, defteam <chr>, side_of_field <chr>,
#> #   yardline_100 <dbl>, game_date <chr>, quarter_seconds_remaining <dbl>,
#> #   half_seconds_remaining <dbl>, game_seconds_remaining <dbl>,
#> #   game_half <chr>, quarter_end <dbl>, …

load_player_stats(2021)
#> ── nflverse player stats: offense ──────────────────────────────────────────────
#> ℹ Data updated: 2023-02-28 01:26:47 PST
#> # A tibble: 5,698 × 52
#>    player_id  player_name player_display_name position position_group
#>    <chr>      <chr>       <chr>               <chr>    <chr>         
#>  1 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  2 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  3 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  4 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  5 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  6 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  7 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  8 00-0019596 T.Brady     Tom Brady           QB       QB            
#>  9 00-0019596 T.Brady     Tom Brady           QB       QB            
#> 10 00-0019596 T.Brady     Tom Brady           QB       QB            
#> # ℹ 5,688 more rows
#> # ℹ 47 more variables: headshot_url <chr>, recent_team <chr>, season <int>,
#> #   week <int>, season_type <chr>, completions <int>, attempts <int>,
#> #   passing_yards <dbl>, passing_tds <int>, interceptions <dbl>, …

Data Sources

Data accessed by this package is stored on GitHub and can typically be found in one of the following repositories:

For a full list of functions, please see the reference page.

This data is maintained by the nflverse project team and is primarily automated via GitHub Actions. You can check the status and schedules page here: https://github.com/nflverse/nflverse-data

Configuration

The following options help configure default nflreadr behaviours.

options(nflreadr.verbose) 
# TRUE/FALSE to silence messages such as cache warnings
options(nflreadr.cache) 
# one of "memory", "filesystem", or "off"
options(nflreadr.prefer) 
# one of "qs", "rds", "parquet", or "csv"
options(nflreadr.download_path)
# defaults to current working directory - change to specify where `nflverse_download()` places data.
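
As a minimal sketch, these options can be set with base R's `options()`, for example at the top of a script. The values below are illustrative picks from the lists above, not recommendations.

```r
# Set nflreadr options for the current session (values are illustrative).
options(
  nflreadr.verbose = FALSE,        # silence messages such as cache warnings
  nflreadr.cache   = "filesystem", # one of "memory", "filesystem", or "off"
  nflreadr.prefer  = "rds"         # one of "qs", "rds", "parquet", or "csv"
)

# Read a value back; the `default` here is only a fallback for this call
# and is not a claim about the package's own default.
getOption("nflreadr.cache", default = "memory")
```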

You can also configure nflreadr to display progress messages with the progressr package, e.g.

progressr::with_progress(load_rosters(seasons = 2010:2020))
 |========            |  40%

Getting help

The best places to get help on this package are:

Contributing

Many hands make light work! Here are some ways you can contribute to this project:

Terms of Use

The R code for this package is released as open source under the MIT License. NFL data accessed by this package belong to their respective owners, and are governed by their terms of use.

nflreadr's Issues

nflreadr::clean_homeaway() does not flip `spread_line` as it does other columns

nflreadr::load_schedules(2018) |>
    dplyr::filter(game_id == "2018_01_ATL_PHI") |>
    nflreadr::clean_homeaway() |>
    dplyr::select(game_id, team, opponent, spread_line, team_spread_odds, opponent_spread_odds)
#> # A tibble: 2 × 6
#>   game_id         team  opponent spread_line team_spread_odds opponent_spread_…¹
#>   <chr>           <chr> <chr>          <dbl>            <int>              <int>
#> 1 2018_01_ATL_PHI PHI   ATL                1              101               -111
#> 2 2018_01_ATL_PHI ATL   PHI                1             -111                101
#> # … with abbreviated variable name ¹​opponent_spread_odds

Feels like spread_line should be split into something like team_spread and opponent_spread?
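
One way to picture the suggestion, as a sketch only: assuming `spread_line` is expressed from the home team's perspective, a team-level spread could be derived by flipping its sign on the away row. The toy data frame and the `is_home`, `team_spread`, and `opponent_spread` columns are hypothetical and are not part of the current `clean_homeaway()` output.

```r
# Toy stand-in for the two rows shown above (is_home is a hypothetical helper column).
games <- data.frame(
  game_id     = c("2018_01_ATL_PHI", "2018_01_ATL_PHI"),
  team        = c("PHI", "ATL"),
  opponent    = c("ATL", "PHI"),
  is_home     = c(TRUE, FALSE),
  spread_line = c(1, 1)
)

# Assumption: spread_line is from the home team's point of view,
# so the away team's spread is its negative.
games$team_spread     <- ifelse(games$is_home, games$spread_line, -games$spread_line)
games$opponent_spread <- -games$team_spread

games[, c("team", "opponent", "team_spread", "opponent_spread")]
```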

`load_pfr_passing()` should load season level adv. passing stats

This function just loads the combined weekly passing stats instead of the season-level advanced passing stats described in the function documentation.

nflreadr/R/load_nflverse.R

Lines 380 to 384 in 210ed6f

load_pfr_passing <- function(seasons = TRUE){
  load_pfr_advstats(seasons = seasons, stat_type = "pass")
}

It should load data from here https://www.pro-football-reference.com/years/2021/passing_advanced.htm

The data lives here: https://github.com/nflverse/pfr_scrapR/blob/master/data/pfr_advanced_passing.rds

Data documentation

It would be great if we had consistent data documentation available within the package. At the moment, the following data functions are missing in-package dictionaries:

missing fields as per comments below:

  • schedules (#94)
  • snap counts (#94)

add a get_current_week() function

perhaps @guga31bb's here?

get_current_week <- function() {

  # get season
  s <- dplyr::if_else(
    lubridate::month(lubridate::today("GMT")) >= 9,
    lubridate::year(lubridate::today("GMT")) ,
    lubridate::year(lubridate::today("GMT")) - 1
  )

  # Find the first Monday of September
  day1 <- lubridate::as_date(paste(s, "09-01", sep="-"))
  week1 <- as.POSIXlt(seq(day1, length.out=7, by="day"))
  monday1 <- week1[week1$wday == 1]

  # NFL season starts 4 days later
  first_game <- (monday1 + lubridate::days(3)) %>% lubridate::as_date()

  # current week number of NFL season is 1 + how many weeks have elapsed since first game
  # (ie first game is week 1)
  current_week <- 1 +
    lubridate::days(lubridate::today("America/New_York") - first_game) %>%
    .$day %>%
    magrittr::divide_by_int(7)

  if (current_week <= 1) current_week <- 1

  return(current_week)

}

How to re-export nflreadr functions in other packages

I have played around with possibilities to re-export nflreadr functions as well as their documentation.

Currently I am seeing two possible ways.

The first variant is very robust to changes in function arguments in nflreadr because it just directly exports the function. However, the documentation inside the package that re-exports this function will look like the below picture

#' Bla bla
#'
#' See [nflreadr::load_nextgen_stats()] for details.
#'
#' @name load_nextgen_stats
#' @importFrom nflreadr load_nextgen_stats
#'
#' @export
NULL

grafik

The second variant will feature much better documentation inside the package that re-exports and it is possible to use another name. However, it requires an update if the function arguments in nflreadr change.

#' @inherit nflreadr::load_nextgen_stats
#' @export
# Need to add own examples especially when the function name is different
#' @examples
#' ngs_loader(2020, "passing")
ngs_loader <- function(seasons = NULL,
                       stat_type = c("passing", "receiving", "rushing"),
                       file_type = getOption("nflreadr.prefer", default = "qs")){

  nflreadr::load_nextgen_stats(seasons = seasons,
                               stat_type = stat_type,
                               file_type = file_type)
}

grafik

I tend to use the second variant with the much better function documentation tbh. Since major changes in function params are unlikely it should be ok to require updates in the other packages. We should probably add a vignette that explains this

Possibly changing pfr_id?

I posted a tweet today and someone pointed out that I wasn't showing any NFL snaps for Trey Pipkins of the Chargers. The graph is created only using load_draft_picks() and load_snap_counts() which I assumed were both picking up the PFR ID from the page where the data was scraped. However, it looks like the draft data shows a different ID for Pipkins than any of the other PFR data sources, including the present-day 2019 draft page!

My guess is that Pipkins's ID changed from PipkTr01 to PipkTr00 between the time that the draft data and snap data were scraped. The PFR player page for PipkTr01 is still live, but has no info other than name and position.

Similarly, someone said the same of Broncos rookie OL Quinn Meinerz in something I posted earlier in the year that used pfr_id from load_rosters(). I assumed this was just an issue of some players not being mapped in the crosswalk yet. I looked into that case just now, and Meinerz appears to have had as many as three PFR IDs already. MeinQu02 is the current ID being used for him, but MeinQu01 looks like it is referencing him as well. As you can see in my code below, the draft table picked up MeinQu00, which has more info than MeinQu01 and is unquestionably a reference to Meinerz.

Apologies if I'm rehashing a known issue, but I was just really surprised to see all of this. The obvious short term fix is to re-scrape all of the draft tables as they contain the newest set of IDs. I'm not sure what the solution is if some players just get new IDs created for them from time to time. Let me know if I can be of any assistance!

library(tidyverse)

## Trey Pipkins
nflreadr::load_draft_picks(2019) %>% 
  filter(pfr_name == 'Trey Pipkins')
#> # A tibble: 1 x 10
#>   season team  round  pick pfr_id   pfr_name   player_id side  category position
#>    <int> <chr> <int> <int> <chr>    <chr>      <chr>     <chr> <chr>    <chr>   
#> 1   2019 LAC       3    91 PipkTr01 Trey Pipk~ ""        O     OL       T

nflreadr::load_snap_counts(2019) %>% 
  filter(player == 'Trey Pipkins') %>% 
  select(game_id, pfr_player_id, player, offense_snaps)
#> # A tibble: 13 x 4
#>    game_id         pfr_player_id player       offense_snaps
#>    <chr>           <chr>         <chr>                <dbl>
#>  1 2019_01_IND_LAC PipkTr00      Trey Pipkins             0
#>  2 2019_02_LAC_DET PipkTr00      Trey Pipkins             0
#>  3 2019_03_HOU_LAC PipkTr00      Trey Pipkins             0
#>  4 2019_04_LAC_MIA PipkTr00      Trey Pipkins             6
#>  5 2019_05_DEN_LAC PipkTr00      Trey Pipkins             0
#>  6 2019_06_PIT_LAC PipkTr00      Trey Pipkins             0
#>  7 2019_07_LAC_TEN PipkTr00      Trey Pipkins             1
#>  8 2019_08_LAC_CHI PipkTr00      Trey Pipkins             0
#>  9 2019_09_GB_LAC  PipkTr00      Trey Pipkins             2
#> 10 2019_10_LAC_OAK PipkTr00      Trey Pipkins            68
#> 11 2019_11_KC_LAC  PipkTr00      Trey Pipkins            79
#> 12 2019_16_OAK_LAC PipkTr00      Trey Pipkins            32
#> 13 2019_17_LAC_KC  PipkTr00      Trey Pipkins            63

nflreadr::load_rosters(2019) %>% 
  filter(full_name == 'Trey Pipkins') %>% 
  select(season, team, full_name, pfr_id)
#> # A tibble: 1 x 4
#>   season team  full_name    pfr_id  
#>    <dbl> <chr> <chr>        <chr>   
#> 1   2019 LAC   Trey Pipkins PipkTr00

## Quinn Meinerz
nflreadr::load_draft_picks(2021) %>% 
  filter(pfr_name == 'Quinn Meinerz')
#> # A tibble: 1 x 10
#>   season team  round  pick pfr_id   pfr_name   player_id side  category position
#>    <int> <chr> <int> <int> <chr>    <chr>      <chr>     <chr> <chr>    <chr>   
#> 1   2021 DEN       3    98 MeinQu00 Quinn Mei~ ""        O     OL       C

nflreadr::load_snap_counts(2021) %>% 
  filter(player == 'Quinn Meinerz') %>% 
  select(game_id, pfr_player_id, player, offense_snaps)
#> # A tibble: 2 x 4
#>   game_id         pfr_player_id player        offense_snaps
#>   <chr>           <chr>         <chr>                 <dbl>
#> 1 2021_03_NYJ_DEN MeinQu02      Quinn Meinerz            25
#> 2 2021_04_BAL_DEN MeinQu02      Quinn Meinerz            61

nflreadr::load_rosters(2021) %>% 
  filter(full_name == 'Quinn Meinerz') %>% 
  select(season, team, full_name, pfr_id)
#> # A tibble: 1 x 4
#>   season team  full_name     pfr_id
#>    <dbl> <chr> <chr>         <chr> 
#> 1   2021 DEN   Quinn Meinerz <NA>

Investigate data type consistency across files

Where should this responsibility live?

  • at the nflreadr level? (nflreadr coerces types)
  • at the file maintainer level?

Seems to be best at the file maintainer level so that type is set once and not when the user runs it...but that means we need to come up with an nflverse-level type dictionary

Relatedly: column naming (player? player_name? full_player_name? name? merge_name) etc
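
As a rough sketch of what an nflverse-level type dictionary could look like if coercion were applied in one place: the dictionary contents, the `coerce_types()` helper, and the toy data below are all hypothetical and only illustrate the idea.

```r
# Hypothetical type dictionary: column name -> target type.
type_dictionary <- c(season = "integer", week = "integer", player_id = "character")

# Map each supported type name to a base R coercion function.
coercers <- list(
  integer   = as.integer,
  double    = as.numeric,
  character = as.character
)

# Coerce only the columns that appear in both the data and the dictionary.
coerce_types <- function(df, dictionary) {
  for (col in intersect(names(dictionary), names(df))) {
    df[[col]] <- coercers[[dictionary[[col]]]](df[[col]])
  }
  df
}

# Toy input with inconsistent types, as might come from different files.
raw   <- data.frame(season = c("2020", "2021"), week = c(1, 2), player_id = c(1L, 2L))
clean <- coerce_types(raw, type_dictionary)
vapply(clean, class, character(1))
```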

Failed to download headshot_url

Describe the bug

Error in download_url(path) : 
  Failed to download https://a.espncdn.com/combiner/i?img=/i/headshots/nfl/players/full/4036651.png (HTTP 415)

This headshot_url is from Kindle Vildor (DB) from Chicago Bears

nflverse-data URL updates

  • load_pbp
  • load_player_stats
  • load_combine
  • load_nextgen_stats
  • load_snap_counts
  • load_pfr_advstats week
  • load_pfr_advstats season
  • load_rosters
  • load_injuries
  • load_depth_charts

pbp data dictionary improvement

description of return_ (yards, touchdowns, player, team) etc should include a note that return = interception, fumble, kick block, kickoff return, punt return etc.

Duplicated entries in load_rosters() 2021

Describe the bug
When loading 2021 rosters, a variation of headshot_url makes multiple rows per player/season combination

Reprex

library(magrittr)
nflreadr::load_rosters(2021) %>% 
  dplyr::filter(full_name == 'Joe Mixon') %>% 
  dplyr::pull(headshot_url)
#> [1] "https://static.www.nfl.com/image/private/f_auto,q_auto/league/a577vg2x4mz7eqo0ejsw"
#> [2] "https://static.www.nfl.com/image/private/f_auto,q_auto/league/pucb3gqsidpxzivdp9hd"

Created on 2021-10-01 by the reprex package (v2.0.1)

Expected behavior
This in turn creates duplicates when merging so I was curious if it was intentional
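
Until the source data is deduplicated, one possible workaround (a sketch on placeholder data, not the package's behaviour) is to keep the first row per season and `gsis_id` before merging:

```r
# Placeholder roster rows: the first player appears twice with different headshot URLs.
roster <- data.frame(
  season       = c(2021, 2021, 2021),
  gsis_id      = c("id-A", "id-A", "id-B"),
  full_name    = c("Player A", "Player A", "Player B"),
  headshot_url = c("url-1", "url-2", "url-3")
)

# Keep the first row for each season/gsis_id combination.
one_row_per_player <- roster[!duplicated(roster[c("season", "gsis_id")]), ]
nrow(one_row_per_player)
```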

Session information

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Missing Data

Hi,

Thanks for the package; a lot of work went into this and I appreciate it. But, I consistently find missing data within each of the data assets. For instance:

nflreadr::load_nextgen_stats(stat_type = "receiving",seasons = 2021) %>% filter(player_display_name == "Allen Robinson")

will only show 5 entries for regular season games.....he's played in 9 games to-date.

nflreadr::load_ff_rankings(type ="week") won't have all the players. In fact, the entire ATL defense is missing. While other teams have their full rosters.

The rosters are not complete either, nor is the injury list valid. I've checked it against other injury lists. I'd like to use it because it shows defensive players' injuries where my other sources don't, but the quality isn't there.

These are just small examples. But I've found hundreds of examples.

How is the data quality checked? Is there anything we can do to remedy the inconsistencies?

load_rosters creates tibbles with different columns for different seasons, preventing multiple years from being retrieved at once.

If I attempt to load multiple seasons of player rosters using load_rosters I get the following error:

> nfl_rosters <- load_rosters(seasons=c(1999:2021))

Error in data.table::rbindlist(out, use.names = TRUE) :
Item 4 has 30 columns, inconsistent with item 1 which has 24 columns. To fill missing columns use fill=TRUE.

This is because different seasons have different columns present in the output from load_rosters. For example, some earlier seasons are missing the column sleeper_id and other id columns, while later seasons do have these columns available. I believe this can be resolved by adding a fill = TRUE in the rbind function that is within the load_from_url function that load_rosters calls.
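
The effect of `fill = TRUE` can be seen on toy data. This sketch uses `data.table::rbindlist()` directly on two made-up tables and does not reproduce the package's internal `load_from_url` code.

```r
library(data.table)

# Two toy "seasons" with different columns (sleeper_id is missing from the first).
early <- data.table(season = 1999L, full_name = "Player A")
late  <- data.table(season = 2021L, full_name = "Player B", sleeper_id = "123")

# fill = TRUE pads columns missing from an item with NA instead of erroring.
combined <- rbindlist(list(early, late), use.names = TRUE, fill = TRUE)
combined
```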

Add option or arguments to silence nflreadr warnings

I am using nflreadr::clean_team_abbrs() in different places and the warning for non matched abbreviations is a bit annoying.

It's probably worth it to add an argument to that function or maybe a global option to silence all nflreadr warnings?
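
A sketch of what an option-gated warning might look like, assuming a global option named `nflreadr.verbose`. The `warn_if_verbose()` helper is hypothetical and not an nflreadr function; the second half shows base R's `suppressWarnings()` as a workaround for a single call.

```r
# Hypothetical helper: only warn when the nflreadr.verbose option is not FALSE.
warn_if_verbose <- function(msg) {
  if (isTRUE(getOption("nflreadr.verbose", default = TRUE))) {
    warning(msg, call. = FALSE)
  }
  invisible(NULL)
}

# Base R can already silence warnings around a single expression.
quiet_result <- suppressWarnings({
  warn_if_verbose("abbreviation not matched")
  "done"
})
```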

Release nflreadr 1.1.1

Prepare for release:

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Unable to load_pbp() with qs::qdeserialize()

Describe the bug
A clear and concise description of what the bug is.

Reprex

nflreadr::load_pbp(2020)
#> Warning: Failed to parse file with qs::qdeserialize() from <https://github.com/
#> nflverse/nflfastR-data/raw/master/data/play_by_play_2020.qs>
#> # A tibble: 0 x 0

Created on 2021-08-09 by the reprex package (v2.0.0)

Expected behavior
Issue trying to load in pbp_data. This was my first attempt at using nflreadr.

Session information

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19042)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] ps_1.6.0          digest_0.6.27     withr_2.4.2       magrittr_2.0.1   
#>  [5] reprex_2.0.0      evaluate_0.14     highr_0.9         stringi_1.6.2    
#>  [9] rlang_0.4.11      cli_2.5.0         rstudioapi_0.13   fs_1.5.0         
#> [13] rmarkdown_2.8     tools_4.1.0       stringr_1.4.0     glue_1.4.2       
#> [17] xfun_0.23         yaml_2.2.1        compiler_4.1.0    htmltools_0.5.1.1
#> [21] knitr_1.33

Created on 2021-08-09 by the reprex package (v2.0.0)

First time submitting an issue, so apologies if I am missing anything but happy to include whatever else is needed.

Explicitly naming argument `season` breaks `load_schedules`

nflreadr::load_schedules(2020)
#> # A tibble: 269 × 41
#>    game_id  season game_type  week gameday weekday gametime away_team away_score
#>    <chr>     <int> <chr>     <int> <chr>   <chr>   <chr>    <chr>          <int>
#>  1 2020_01…   2020 REG           1 2020-0… Thursd… 20:20    HOU               20
#>  2 2020_01…   2020 REG           1 2020-0… Sunday  13:00    SEA               38
#>  3 2020_01…   2020 REG           1 2020-0… Sunday  13:00    CLE                6
#>  4 2020_01…   2020 REG           1 2020-0… Sunday  13:00    NYJ               17
#>  5 2020_01…   2020 REG           1 2020-0… Sunday  13:00    LV                34
#>  6 2020_01…   2020 REG           1 2020-0… Sunday  13:00    CHI               27
#>  7 2020_01…   2020 REG           1 2020-0… Sunday  13:00    IND               20
#>  8 2020_01…   2020 REG           1 2020-0… Sunday  13:00    GB                43
#>  9 2020_01…   2020 REG           1 2020-0… Sunday  13:00    MIA               11
#> 10 2020_01…   2020 REG           1 2020-0… Sunday  13:00    PHI               17
#> # … with 259 more rows, and 32 more variables: home_team <chr>,
#> #   home_score <int>, location <chr>, result <int>, total <int>,
#> #   overtime <int>, old_game_id <chr>, espn <chr>, away_rest <int>,
#> #   home_rest <int>, away_moneyline <int>, home_moneyline <int>,
#> #   spread_line <dbl>, away_spread_odds <int>, home_spread_odds <int>,
#> #   total_line <dbl>, under_odds <int>, over_odds <int>, div_game <int>,
#> #   roof <chr>, surface <chr>, temp <int>, wind <int>, away_qb_id <chr>, …

nflreadr::load_schedules(seasons == 2020)
#> Error in isTRUE(seasons): object 'seasons' not found

Created on 2021-08-25 by the reprex package (v2.0.1)

Release nflreadr 1.1.3

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

old_game_id not matching with NextGenStats gameId

I started scraping some of the NextGenStats Game Center data following Week 1 and my code to refresh for Weeks 2-3 does not appear to join correctly because the old_game_id for nflreadr doesn't match what I'm getting from NGS. I'm not sure if I should share the code I'm using to access the NextGenStats endpoint on here (because I need to use a cookie and I don't know if I should be sharing that kind of thing?), but I can privately if that helps.

Here is what I found in nflreadr versus this NGS endpoint https://nextgenstats.nfl.com/api/league/schedule?season=2021 on this page

library(tidyverse)
library(nflreadr)

load_schedules(2021) %>% 
  filter(old_game_id == 2021092602) %>% 
  select(game_id, old_game_id)
#> # A tibble: 1 x 2
#>   game_id        old_game_id
#>   <chr>          <chr>      
#> 1 2021_03_LAC_KC 2021092602

VS
image

And here are the differences I have for Week 3's games:
image

Let me know if I can provide additional code or context. Thanks!

Release nflreadr 1.1.2

Prepare for release:

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

load_snap_counts has 2013 hardcoded as the earliest season, despite data for 2012 existing in the repository.

In load_snap_counts, the year 2013 is hard coded as the earliest season for which snap count data is available, despite snap count data existing for the year 2012 in https://github.com/nflverse/nflverse-data/releases.

Here is where the year 2013 is hard coded:
if(isTRUE(seasons)) seasons <- 2013:most_recent_season()
stopifnot(is.numeric(seasons), seasons >= 2013, seasons <= most_recent_season())

I believe changing 2013 to 2012 will resolve the issue and allow 2012 data to be retrieved using load_snap_counts
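
A sketch of the same check with the earliest season passed in rather than hard coded. The `check_seasons()` helper and its arguments are hypothetical, and `max_season` is fixed here only so the example runs without the package.

```r
# Hypothetical helper: validate seasons against a configurable earliest season.
check_seasons <- function(seasons, min_season, max_season) {
  if (isTRUE(seasons)) seasons <- min_season:max_season
  stopifnot(is.numeric(seasons), seasons >= min_season, seasons <= max_season)
  seasons
}

# max_season is fixed for illustration; the quoted code uses most_recent_season().
check_seasons(2012, min_season = 2012, max_season = 2021)
```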

Release nflreadr 1.2.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

feature: create aliases for common seasons?

e.g.

fn <- \(season) {
  # pass through anything that is not a character alias (e.g. numeric seasons)
  if (!is.character(season)) return(season)

  switch(season,
    "all" = TRUE,
    "pbp" = 1999:nflreadr::most_recent_season(), # pbp era
    "air" = 2006:nflreadr::most_recent_season(), # air yards era
    "ngs" = 2016:nflreadr::most_recent_season() # ngs era
  )
}

?

Release nflreadr 1.1.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

other features/assorted to-do

  • clear cache function
  • validate options(nflreadr.prefer) .onAttach
  • improve documentation for raw_from_url (i.e. explain intended usage of piping to read_parquet etc)
  • warning inside qs_from_url's error message if fail, perhaps also http status code?
  • add more boilerplate to readme
  • dynastyprocess/data downloads
  • add codecov
  • data dictionary + vignette for rosters
  • add progressr to DESC
  • data dictionary + vignette for schedule
  • hex logo
  • team colours/logos import function
  • readme- fix URL
  • Document how to use progressr with the raw from_url functions
  • data dictionary + vignette for nextgen stats
  • load_injuries #14
  • load_depth_charts #15
  • load_trades #18
  • load_espn_qbr() #19
  • load_snap_counts c/o Ben #17
  • load_pfr_adv_passing #17
  • load_ff_rankings

Release nflreadr 1.0.0

First release:

Prepare for release:

  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('major')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Update install instructions in README
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

load_teams has a misspelling in team_division

The column label for nflreadr::load_teams() for team_division is misspelled as team_divison.

nflreadr::load_teams() |>
  dplyr::select(team_divison) |> 
  names()
#> [1] "team_divison"

Created on 2021-09-03 by the reprex package (v2.0.0)

packageVersion("nflreadr")
[1] ‘1.0.0.5’

unknown nfl players in load_participation data

Describe the bug
Unknown, or (unexpected?) NFL players in pass snap data e.g. Al Riles, DeVonte Dedmon, Eldridge Massington (sweet name)

Code:
library(nflreadr)
library(dplyr)
library(tidyr)

participation <- load_participation(2021)

# one row per offensive player on each play
df2 <- participation %>%
  separate_rows(offense_players, sep = ";")

rosters <- load_rosters(2021) %>%
  select(position, full_name, gsis_id, team) %>%
  filter(position == "WR")

pbp <- load_pbp(2021)

# pass plays (excluding sacks) joined to participation, then to WR rosters
df3 <- pbp %>%
  left_join(df2, by = c('old_game_id' = 'old_game_id', 'play_id' = 'play_id')) %>%
  filter(play_type == "pass", sack == 0, week <= 18) %>%
  select(posteam, desc, air_yards, yards_after_catch, epa, receiver_player_id, receiver_player_name, offense_players) %>%
  left_join(rosters, by = c('offense_players' = 'gsis_id'))

routes <- df3 %>%
  group_by(full_name, team) %>%
  summarise(pass_snaps = n()) %>%
  drop_na(full_name) %>%
  ungroup()

Session information

sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8 LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C LC_TIME=English_Australia.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nflplotR_1.0.1.9003 nflreadr_1.2.0.17 nfl4th_1.0.1.9000 nflseedR_1.1.0 nflfastR_4.3.0.9020 nflverse_1.0.2
[7] ggpmisc_0.4.6 ggpp_0.4.4 Pstat_1.2 hrbrthemes_0.8.0 ggpubr_0.4.0 forcats_0.5.1
[13] purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.7 ggplot2_3.3.6 tidyverse_1.3.1
[19] dplyr_1.0.9 rlang_1.0.2 stringr_1.4.0 ffscrapr_1.4.7

loaded via a namespace (and not attached):
[1] nlme_3.1-157 fs_1.5.2 fastrmodels_1.0.2 lubridate_1.8.0 httr_1.4.3 tools_4.2.0 backports_1.4.1
[8] utf8_1.2.2 R6_2.5.1 DBI_1.1.2 mgcv_1.8-40 colorspace_2.0-3 withr_2.5.0 tidyselect_1.1.2
[15] curl_4.3.2 compiler_4.2.0 extrafontdb_1.0 progressr_0.10.0 cli_3.3.0 rvest_1.0.2 quantreg_5.93
[22] SparseM_1.81 xml2_1.3.3 labeling_0.4.2 scales_1.2.0 systemfonts_1.0.4 digest_0.6.29 rmarkdown_2.14
[29] pkgconfig_2.0.3 htmltools_0.5.2 parallelly_1.31.1 extrafont_0.18 dbplyr_2.1.1 fastmap_1.1.0 readxl_1.4.0
[36] rstudioapi_0.13 generics_0.1.2 farver_2.1.0 jsonlite_1.8.0 car_3.0-13 magrittr_2.0.3 polynom_1.4-1
[43] Matrix_1.4-1 Rcpp_1.0.8.3 munsell_0.5.0 fansi_1.0.3 proto_1.0.0 abind_1.4-5 gdtools_0.2.4
[50] furrr_0.3.0 lifecycle_1.0.1 stringi_1.7.6 snakecase_0.11.0 carData_3.0-5 MASS_7.3-56 grid_4.2.0
[57] parallel_4.2.0 listenv_0.8.0 crayon_1.5.1 lattice_0.20-45 haven_2.5.0 splines_4.2.0 hms_1.1.1
[64] knitr_1.39 pillar_1.7.0 ratelimitr_0.4.1 ggsignif_0.6.3 xgboost_1.6.0.1 codetools_0.2-18 reprex_2.0.1
[71] glue_1.6.2 evaluate_0.15 data.table_1.14.2 modelr_0.1.8 vctrs_0.4.1 tzdb_0.3.0 MatrixModels_0.5-0
[78] Rttf2pt1_1.3.10 cellranger_1.1.0 gtable_0.3.0 gsubfn_0.7 future_1.25.0 assertthat_0.2.1 cachem_1.0.6
[85] xfun_0.30 janitor_2.1.0 broom_0.8.0 rstatix_0.7.0 survival_3.3-1 memoise_2.0.1 globals_0.14.0
[92] ellipsis_0.3.2

image

Issue with load_rosters

Rosters still appear to reflect the rosters from before the draft and NFL free agency. For example, Chase Edmonds and Christian Kirk still appear to be active on ARI. Looking at the load_rosters function definition for nflreadr, it looks like the API was last run on 3/10/2022?

Screenshot should be attached

rosters_2022 <- filter(load_rosters(),position %in% c("QB","RB","WR","TE"))
view(rosters_2022)

Screen Shot 2022-06-05 at 11 36 57 AM

Name Cleaning not working as intended

It appears that the nflreadr::clean_player_names does not work as intended when there is whitespace after a suffix. Should the whitespace be trimmed prior to separating?

# Issue Example
nflreadr::clean_player_names(c("A.J.    Green", "Odell Beckham Jr.   ", "  Le'Veon Bell Sr."))
#> [1] "AJ Green"         "Odell Beckham Jr" "LeVeon Bell"

# As Intended
nflreadr::clean_player_names(c("A.J. Green", "Odell Beckham Jr.", "Le'Veon Bell Sr."))
#> [1] "AJ Green"      "Odell Beckham" "LeVeon Bell"
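
A base R sketch of the "trim first, then strip the suffix" order suggested above. The `clean_names_sketch()` function and its suffix list are hypothetical and are not the actual `clean_player_names()` implementation.

```r
# Hypothetical sketch: normalise whitespace before removing suffixes.
clean_names_sketch <- function(x) {
  x <- gsub("\\s+", " ", trimws(x))        # trim ends, collapse repeated spaces
  x <- gsub("[.']", "", x)                 # drop periods and apostrophes
  sub(" (Jr|Sr|II|III|IV|V)$", "", x)      # strip a trailing suffix, if any
}

clean_names_sketch(c("A.J.    Green", "Odell Beckham Jr.   ", "  Le'Veon Bell Sr."))
#> [1] "AJ Green"      "Odell Beckham" "LeVeon Bell"
```

Because the whitespace is normalised first, the trailing spaces after "Jr." no longer prevent the suffix from matching at the end of the string.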

Release nflreadr 1.3.0

Wrap up features for 1.3.0

  • finalize final file formats for each nflverse/nflverse-data release (should everything get a csv + rds + parquet + qs equivalent?)
    • pfr stats, combine, draft picks, snap counts done
    • rosters, injuries, depth charts
    • pbp, player_stats, player_stats_kicking
    • contracts
    • nextgen_stats
  • test fix nflreadr
  • load_draft_picks
  • load_officials
  • load_players
  • load_participation #119
  • run dictionary checks?
  • add otc id to rosters df?

Prepare for release:

Submit to CRAN:

  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Add `nflreadr::report()`

Introduces two sub-dependencies: cli and glue which are lightweight enough. Useful for ffverse pkgs which may not have nflfastR installed
