reconhub / aweek

Convert dates to arbitrary week definitions :calendar:

Home Page: https://www.repidemicsconsortium.org/aweek

License: Other

R 100.00%
r r-package timeseries dates date-formatting week epidemiology

aweek's Introduction

reconhub

This package installs and loads all stable RECON packages, similar to the tidyverse package.

Installing the package

To install the current stable, CRAN version of the package, type:

install.packages("reconhub")

To benefit from the latest features and bug fixes, install the development version of the package from GitHub using:

devtools::install_github("reconhub/reconhub")

Note that this requires the devtools package to be installed.

What does it do?

# attaches all stable recon packages
library(reconhub)
## Attaching package epicontacts

## Attaching package outbreaks

## Attaching package incidence

Also, you can install all development versions of RECON packages:

reconhub::install_dev_versions()

Getting help online

Bug reports and feature requests should be posted on GitHub using the issue system. All other questions should be posted on the RECON forum:
http://www.repidemicsconsortium.org/forum/

Contributions are welcome via pull requests.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

aweek's People

Contributors

stephlocke, zkamvar

aweek's Issues

Test failing every seven days

It was a bad idea to use Sys.Date() for a test 😞

Version: 0.1.0
Check: tests
Result: ERROR
Running ‘testthat.R’
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(aweek)
>
> test_check("aweek")
── 1. Failure: aweek can be converted to POSIXlt (@test_conversion.R#45) ──────
Expectation did not fail

 ══ testthat results ═══════════════════════════════════════════════════════════
 OK: 31 SKIPPED: 1 FAILED: 1
 1. Failure: aweek can be converted to POSIXlt (@test_conversion.R#45)

 Error: testthat unit tests failed
 Execution halted

Flavors: r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc, r-patched-solaris-x86

Remove ambiguous behavior (breaking changes)

Things that should be changed:

  1. behavior of factors
  2. combining aweek objects

After thinking about it (with a bit of help from Rich FitzJohn), I think adding the ability to combine aweek objects with different start days was a mistake. The behavior of simply taking the first element and converting it was a mistake because it will not be obvious to the user what is happening. Combining aweek objects should do the following:

  1. only combine aweek objects with the same week_start attribute
    i. if they don't have the same week_start attribute, an error is thrown, advising the use of change_week_start()
  2. if combining a factor, it is converted to a character, no matter what. The user can use factor_aweek() (or something like that) to change it.
  3. truncated aweek objects are no longer truncated.

Other than that, character and date objects are trivial
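The combining rules listed above can be sketched in base R. This is a minimal illustration, not the package's implementation: the `toy_week` class, `new_toy_week()` and `combine_toy_weeks()` are hypothetical names invented here, and only the `week_start` attribute and the `change_week_start()` hint come from the issue text.

```r
# Toy stand-in for a week class: a character vector plus a week_start attribute.
# (Hypothetical; this is not the aweek class itself.)
new_toy_week <- function(x, week_start = 1L) {
  structure(as.character(x), week_start = as.integer(week_start), class = "toy_week")
}

combine_toy_weeks <- function(...) {
  parts <- list(...)
  # Collect week_start from every input that carries one.
  starts <- vapply(parts, function(p) {
    ws <- attr(p, "week_start")
    if (is.null(ws)) NA_integer_ else as.integer(ws)
  }, integer(1))
  known <- starts[!is.na(starts)]
  if (length(unique(known)) > 1L) {
    stop("week_start attributes differ; align them first ",
         "(the issue suggests change_week_start())")
  }
  # Factors and plain strings are converted to character, no matter what.
  values <- unlist(lapply(parts, as.character))
  new_toy_week(values, if (length(known)) known[1] else 1L)
}

a <- new_toy_week("2019-W25", week_start = 1L)
b <- new_toy_week("2019-W26", week_start = 1L)
combined <- combine_toy_weeks(a, b, factor("2019-W27"))

# Mixing week starts is refused instead of being silently converted.
mismatch <- tryCatch(
  combine_toy_weeks(a, new_toy_week("2019-W26", week_start = 7L)),
  error = function(e) e
)
```

Raising an error on a mismatch keeps the conversion explicit, which is the point the issue makes about the old behavior not being obvious to the user.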

Unexpected behaviour when filtering by week

I've noticed some unusual/unexpected behaviour when trying to filter by multiple weeks.

library(tidyverse)
library(aweek)
df <- 
  tibble(
    x = rnorm(61),
    date = Sys.Date() + (-30:30)
  ) %>% 
  mutate(
    epiweek = date2week(date, floor_day = TRUE)
  )

df
#> # A tibble: 61 x 3
#>          x date       epiweek 
#>      <dbl> <date>     <aweek> 
#>  1  0.932  2019-06-22 2019-W25
#>  2 -0.108  2019-06-23 2019-W25
#>  3  0.443  2019-06-24 2019-W26
#>  4 -0.0255 2019-06-25 2019-W26
#>  5  0.586  2019-06-26 2019-W26
#>  6  0.268  2019-06-27 2019-W26
#>  7 -0.783  2019-06-28 2019-W26
#>  8  0.682  2019-06-29 2019-W26
#>  9  0.790  2019-06-30 2019-W26
#> 10  0.700  2019-07-01 2019-W27
#> # … with 51 more rows

df %>% filter(epiweek %in% c("2019-W25", "2019-W26"))
#> # A tibble: 9 x 3
#>         x date       epiweek 
#>     <dbl> <date>     <aweek> 
#> 1  0.932  2019-06-22 2019-W25
#> 2 -0.108  2019-06-23 2019-W25
#> 3  0.443  2019-06-24 2019-W26
#> 4 -0.0255 2019-06-25 2019-W26
#> 5  0.586  2019-06-26 2019-W26
#> 6  0.268  2019-06-27 2019-W26
#> 7 -0.783  2019-06-28 2019-W26
#> 8  0.682  2019-06-29 2019-W26
#> 9  0.790  2019-06-30 2019-W26

first_week <- date2week("2019-06-22", floor_day = TRUE)
last_week <- date2week("2019-06-24", floor_day = TRUE)

df %>% filter(epiweek %in% first_week)
#> # A tibble: 2 x 3
#>        x date       epiweek 
#>    <dbl> <date>     <aweek> 
#> 1  0.932 2019-06-22 2019-W25
#> 2 -0.108 2019-06-23 2019-W25
df %>% filter(epiweek %in% last_week)
#> # A tibble: 7 x 3
#>         x date       epiweek 
#>     <dbl> <date>     <aweek> 
#> 1  0.443  2019-06-24 2019-W26
#> 2 -0.0255 2019-06-25 2019-W26
#> 3  0.586  2019-06-26 2019-W26
#> 4  0.268  2019-06-27 2019-W26
#> 5 -0.783  2019-06-28 2019-W26
#> 6  0.682  2019-06-29 2019-W26
#> 7  0.790  2019-06-30 2019-W26
df %>% filter(epiweek %in% c(first_week, last_week))
#> # A tibble: 0 x 3
#> # … with 3 variables: x <dbl>, date <date>, epiweek <aweek>

Created on 2019-07-22 by the reprex package (v0.3.0)

As you can see, each filter works on its own, but not when the two weeks are combined with c(). Being able to do this would be really helpful (essential, even) for working with several weeks at once.

Allow for arithmetic manipulation of `aweek` objects

It would be great if one could do arithmetic with aweek objects, e.g. adding and subtracting to get the preceding or subsequent weeks, akin to what is possible with dates. Something like

aweek::date2week(Sys.Date()) + 1
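Until such a method exists, the same effect can be approximated in base R by shifting the underlying date in steps of seven days and relabelling it. This is only a sketch: `shift_weeks()` and `iso_week_label()` are hypothetical helpers, the labels are ISO weeks (Monday start) only, and the `%G` and `%V` format codes are output-only and may not be available on every platform.

```r
# Hypothetical helper: move a date by a whole number of weeks.
shift_weeks <- function(d, n) as.Date(d) + 7L * as.integer(n)

# ISO 8601 year-week label (weeks start on Monday).
iso_week_label <- function(d) format(as.Date(d), "%G-W%V")

start <- as.Date("2019-06-24")  # shown as 2019-W26 in the filtering report

next_week <- iso_week_label(shift_weeks(start, 1))   # "2019-W27"
prev_week <- iso_week_label(shift_weeks(start, -1))  # "2019-W25"
```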

Add methods to be consistent with `Date` class behavior

methods(class = "Date")
#>  [1] -             [             [[            [<-           +            
#>  [6] as.character  as.data.frame as.list       as.POSIXct    as.POSIXlt   
#> [11] Axis          c             coerce        cut           diff         
#> [16] format        hist          initialize    is.numeric    julian       
#> [21] length<-      Math          mean          months        Ops          
#> [26] pretty        print         quarters      rep           round        
#> [31] seq           show          slotsFromS3   split         str          
#> [36] summary       Summary       trunc         weekdays      weighted.mean
#> [41] xtfrm        
#> see '?methods' for accessing help and source code

Created on 2019-04-10 by the reprex package (v0.2.1)

`factor_aweek()` has fencepost issue

This largely has to do with how we want to handle seq.Date

library("aweek")
factor_aweek(as.aweek(c("2019-W08-7", "2019-W11-1")))
#> <aweek start: Monday>
#> [1] 2019-W08 <NA>    
#> Levels: 2019-W08 2019-W09 2019-W10

Created on 2019-06-10 by the reprex package (v0.3.0)
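One way this kind of fencepost error can arise is when a weekly sequence starts from a date late in its week: stepping seven days at a time then stops short of the week that holds the end date. The base R sketch below reproduces the three levels shown above and shows that flooring both endpoints to the week start first yields four. It assumes Monday-start ISO weeks and is not taken from the package source; `floor_to_monday()` is a hypothetical helper.

```r
iso_week_label  <- function(d) format(d, "%G-W%V")
# Hypothetical helper: step back to the Monday of the week containing d.
floor_to_monday <- function(d) d - (as.integer(format(d, "%u")) - 1L)

first <- as.Date("2019-02-24")  # 2019-W08-7, a Sunday
last  <- as.Date("2019-03-11")  # 2019-W11-1, a Monday

# Stepping from the raw start date gives 02-24, 03-03, 03-10 and misses W11.
naive_levels <- iso_week_label(seq(first, last, by = "week"))

# Stepping between week starts gives 02-18, 02-25, 03-04, 03-11.
fixed_levels <- iso_week_label(
  seq(floor_to_monday(first), floor_to_monday(last), by = "week")
)
```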

Add data.frame.aweek method

Currently, it's only possible to create aweek data frames by inserting the column. Creating the data.frame.aweek() method would allow users to create data frames on the fly with aweek columns:

library('aweek')
d <- as.Date("2011-02-08") + 1:10
w <- date2week(d)
data.frame(w, d)
#> Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors): cannot coerce class '"aweek"' to a data.frame
df <- data.frame(d)
df$w <- w
df
#>             d          w
#> 1  2011-02-09 2011-W06-3
#> 2  2011-02-10 2011-W06-4
#> 3  2011-02-11 2011-W06-5
#> 4  2011-02-12 2011-W06-6
#> 5  2011-02-13 2011-W06-7
#> 6  2011-02-14 2011-W07-1
#> 7  2011-02-15 2011-W07-2
#> 8  2011-02-16 2011-W07-3
#> 9  2011-02-17 2011-W07-4
#> 10 2011-02-18 2011-W07-5
# Tibbles work
tibble::tibble(w = w, d = d)
#> # A tibble: 10 x 2
#>    w           d         
#>    <S3: aweek> <date>    
#>  1 2011-W06-3  2011-02-09
#>  2 2011-W06-4  2011-02-10
#>  3 2011-W06-5  2011-02-11
#>  4 2011-W06-6  2011-02-12
#>  5 2011-W06-7  2011-02-13
#>  6 2011-W07-1  2011-02-14
#>  7 2011-W07-2  2011-02-15
#>  8 2011-W07-3  2011-02-16
#>  9 2011-W07-4  2011-02-17
#> 10 2011-W07-5  2011-02-18

Created on 2019-03-11 by the reprex package (v0.2.1)
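For a classed vector, base R's usual idiom is to reuse `as.data.frame.vector` as the method, which is how `Date` columns are handled. The sketch below demonstrates this on a hypothetical `toy_week` class rather than on aweek itself, so the method name and class are assumptions made for illustration.

```r
# Toy classed character vector standing in for a week object (hypothetical).
w <- structure(c("2011-W06-3", "2011-W06-4"), week_start = 1L, class = "toy_week")
d <- as.Date("2011-02-08") + 1:2

# Without a method, data.frame() cannot coerce the class.
before <- tryCatch(data.frame(w = w, d = d), error = function(e) e)

# Reuse the generic vector method so the object becomes a single column.
as.data.frame.toy_week <- as.data.frame.vector

df <- data.frame(w = w, d = d)
```

The column keeps its class and attributes, so any print or subset methods defined for the class still apply inside the data frame.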

Disallow factors without floor_day

I think it's a good idea to restrict the behavior here so that someone doesn't end up with 6,000+ factor levels if they accidentally mis-specify a decade.

Allow for varying first week of the year

This package conveniently handles week to date and date to week conversions with varying definitions of the first day in each week.

In epidemiology we are also confronted with varying definitions of week 1 in a year. The ISO norm says: "week 01 is the week with the Gregorian year's first Thursday in it", the week before being week 52 or 53 of the previous year.
https://en.wikipedia.org/wiki/ISO_week_date#First_week

There are, however, situations in which week 1 is defined as the week containing 1 January, which is not always the same as above. Other definitions are also possible, e.g. the week containing the first Monday, Tuesday, ... Sunday of a year.

It would be great to add an option to choose which of these definitions should be used, with a default to the definition used by ISO.
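To make the difference concrete, here is a base R sketch of the "week 1 contains 1 January" rule next to the ISO label. `week_floor()` and `jan1_week_label()` are hypothetical helpers written for this illustration, and `week_start` is assumed to be coded 1 = Monday through 7 = Sunday.

```r
# Start of the week containing d; week_start uses 1 = Monday ... 7 = Sunday.
week_floor <- function(d, week_start = 1L) {
  wd <- as.integer(format(d, "%u"))
  d - ((wd - week_start) %% 7L)
}

# Hypothetical labeller: week 1 is the week that contains 1 January.
jan1_week_label <- function(d, week_start = 1L) {
  d <- as.Date(d)
  y <- as.integer(format(d, "%Y"))
  first_week_start <- function(year) {
    week_floor(as.Date(sprintf("%d-01-01", year)), week_start)
  }
  # Late-December dates may already fall in week 1 of the following year.
  y <- y + as.integer(d >= first_week_start(y + 1L))
  wk <- as.integer(d - first_week_start(y)) %/% 7L + 1L
  sprintf("%d-W%02d", y, wk)
}

iso_label  <- format(as.Date("2021-01-01"), "%G-W%V")  # "2020-W53"
jan1_label <- jan1_week_label("2021-01-01")            # "2021-W01"
```

Under this rule the last days of December can belong to week 1 of the next year, which is why the year is adjusted before the week number is computed.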

Integrating with the tsibble package

Hey - know you are swamped, but just thought I would flag this here as an fyi.

Might be worth integrating aweek into the tsibble package. Have posted an issue on the tsibble page too.

It's a pretty functional package (with a yearweek function) - have been using it for time series analysis.

This would also address @jpolonsky issues (1 and 2) - because with tsibble you can do both of those in regular dplyr syntax.

Long shot but would also make the incidence package a lot simpler (I think)

Anyway - food for thought nothing actionable!

have a good one

no warning (or error) with impossible dates

When passing a character string as a date to date2week, it seems that only the first two digits of the presumed day field are read (understandably) and anything after them is ignored. At a minimum, a warning should be thrown to indicate that the conversion assumes the characters are in Y M D format. If more than two digits are picked up in the month or day position, perhaps an error or warning should also be thrown?

R> date2week("1910/12/11",week_start="saturday")

[1] "1910-W50-2"

R> date2week("1910/12/111",week_start="saturday")

[1] "1910-W50-2"

R> date2week("1910/12/1111",week_start="saturday")

[1] "1910-W50-2"
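A stricter front end is easy to sketch in base R: check the shape of the string before handing it to `as.Date()`, which by design ignores trailing characters. `strict_ymd()` is a hypothetical helper, not part of aweek, and the pattern only accepts year-month-day with `-` or `/` separators.

```r
# Hypothetical validator: reject strings that are not strictly Y-M-D.
strict_ymd <- function(x) {
  ok <- is.na(x) | grepl("^[0-9]{4}[-/][0-9]{1,2}[-/][0-9]{1,2}$", x)
  if (!all(ok)) {
    stop("not in Y-M-D format: ", paste(x[!ok], collapse = ", "))
  }
  as.Date(x)
}

good    <- strict_ymd("1910/12/11")
lenient <- as.Date("1910/12/111")  # trailing "1" is silently dropped
bad     <- tryCatch(strict_ymd("1910/12/111"), error = function(e) e)
```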

Error in dates with NA?

hey - @epiamsterdam trying to run cholera template on data from Mozambique.
Runs into this error, which seems to be because of trying to make factor levels??

> linelist_cleaned$epiweek <- aweek::date2week(linelist_cleaned$date_of_onset, 
                                             week_start = "Monday", 
                                              floor_day = TRUE, 
                                              factor = TRUE)
Error in charToDate(x) : 
  character string is not in a standard unambiguous format

It is a date and seems normal, but has NAs:

> class(linelist_cleaned$date_of_onset)  
[1] "Date"
> head(linelist_cleaned$date_of_onset, 20)
 [1] "2019-04-08" "2019-04-08" "2019-04-08" "2019-04-09" "2019-04-03" "2019-04-03" "2019-04-04" "2019-04-04"
 [9] "2019-04-04" "2019-04-04" NA           NA           NA           "2019-04-09" "2019-04-09" "2019-04-08"
[17] NA           NA           NA           NA 

The NAs seem to be recognised normally though:

> table(is.na(linelist_cleaned$date_of_onset))
FALSE  TRUE 
 2903  1408 

If you run date2week, without factor = TRUE, it works, but produces funky looking NAs

> aweek::date2week(linelist_cleaned$date_of_onset, 
                                              week_start = "Monday", 
                                              floor_day = TRUE )
<aweek start: Monday>
   [1] "2019-W15"    "2019-W15"    "2019-W15"    "2019-W15"    "2019-W14"    "2019-W14"    "2019-W14"   
   [8] "2019-W14"    "2019-W14"    "2019-W14"    "  NA-WNA-NA" "  NA-WNA-NA" "  NA-WNA-NA" "2019-W15"   
  [15] "2019-W15"    "2019-W15"    "  NA-WNA-NA" "  NA-WNA-NA" "  NA-WNA-NA" "  NA-WNA-NA" "  NA-WNA-NA"

Not sure where to go from here.... session info below just in case

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2.2     ggspatial_1.0.3    sf_0.7-2           epitools_0.5-10    aweek_0.2.0       
 [6] incidence_1.7.0    sitrep_0.1.0       summarytools_0.8.7 ggplot2_3.1.0      dplyr_0.7.8       
[11] knitr_1.20        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         lubridate_1.7.4    lattice_0.20-35    tidyr_0.8.3        class_7.3-14      
 [6] assertthat_0.2.0   digest_0.6.18      utf8_1.1.4         R6_2.4.0           cellranger_1.1.0  
[11] plyr_1.8.4         survey_3.33-2      e1071_1.6-8        srvyr_0.3.3        pillar_1.3.1      
[16] rlang_0.3.1        lazyeval_0.2.1     curl_3.2           readxl_1.1.0       rstudioapi_0.8    
[21] data.table_1.11.2  Matrix_1.2-14      splines_3.5.0      stringr_1.4.0      foreign_0.8-70    
[26] pander_0.6.1       ISOweek_0.6-2      RCurl_1.95-4.10    munsell_0.5.0      compiler_3.5.0    
[31] pkgconfig_2.0.2    htmltools_0.3.6    tidyselect_0.2.5   tibble_2.0.1       rio_0.5.10        
[36] codetools_0.2-15   matrixStats_0.54.0 fansi_0.4.0        crayon_1.3.4       withr_2.1.2       
[41] bitops_1.0-6       grid_3.5.0         spData_0.2.8.3     lwgeom_0.1-5       gtable_0.2.0      
[46] DBI_1.0.0          magrittr_1.5       units_0.6-2        scales_1.0.0       cli_1.0.1         
[51] stringi_1.3.1      pryr_0.1.4         rapportools_1.0    cowplot_0.9.2      openxlsx_4.0.17   
[56] tools_3.5.0        forcats_0.3.0      epitrix_0.2.1      glue_1.3.0         purrr_0.3.2       
[61] survival_2.42-3    yaml_2.2.0         colorspace_1.4-0   classInt_0.2-3     bindr_0.1.1       
[66] haven_1.1.1

Add global option for week_start option

Currently, it's possible for the user to do the following:

library(aweek)
(d <- as.Date("2019-05-03") + seq(0, 54))
#>  [1] "2019-05-03" "2019-05-04" "2019-05-05" "2019-05-06" "2019-05-07"
#>  [6] "2019-05-08" "2019-05-09" "2019-05-10" "2019-05-11" "2019-05-12"
#> [11] "2019-05-13" "2019-05-14" "2019-05-15" "2019-05-16" "2019-05-17"
#> [16] "2019-05-18" "2019-05-19" "2019-05-20" "2019-05-21" "2019-05-22"
#> [21] "2019-05-23" "2019-05-24" "2019-05-25" "2019-05-26" "2019-05-27"
#> [26] "2019-05-28" "2019-05-29" "2019-05-30" "2019-05-31" "2019-06-01"
#> [31] "2019-06-02" "2019-06-03" "2019-06-04" "2019-06-05" "2019-06-06"
#> [36] "2019-06-07" "2019-06-08" "2019-06-09" "2019-06-10" "2019-06-11"
#> [41] "2019-06-12" "2019-06-13" "2019-06-14" "2019-06-15" "2019-06-16"
#> [46] "2019-06-17" "2019-06-18" "2019-06-19" "2019-06-20" "2019-06-21"
#> [51] "2019-06-22" "2019-06-23" "2019-06-24" "2019-06-25" "2019-06-26"
(w <- date2week(d, week_start = "Sunday", floor_day = TRUE))
#> <aweek start: Sunday>
#>  [1] "2019-W18" "2019-W18" "2019-W19" "2019-W19" "2019-W19" "2019-W19"
#>  [7] "2019-W19" "2019-W19" "2019-W19" "2019-W20" "2019-W20" "2019-W20"
#> [13] "2019-W20" "2019-W20" "2019-W20" "2019-W20" "2019-W21" "2019-W21"
#> [19] "2019-W21" "2019-W21" "2019-W21" "2019-W21" "2019-W21" "2019-W22"
#> [25] "2019-W22" "2019-W22" "2019-W22" "2019-W22" "2019-W22" "2019-W22"
#> [31] "2019-W23" "2019-W23" "2019-W23" "2019-W23" "2019-W23" "2019-W23"
#> [37] "2019-W23" "2019-W24" "2019-W24" "2019-W24" "2019-W24" "2019-W24"
#> [43] "2019-W24" "2019-W24" "2019-W25" "2019-W25" "2019-W25" "2019-W25"
#> [49] "2019-W25" "2019-W25" "2019-W25" "2019-W26" "2019-W26" "2019-W26"
#> [55] "2019-W26"
report_week <- "2019-W20-7"
# incorrect: default week starts on monday
length(w[d <= week2date(report_week)])
#> [1] 17
# correct, but requires a lot of fiddling
length(w[d <= week2date(report_week, week_start = "Sunday")])
#> [1] 16

Created on 2019-05-03 by the reprex package (v0.2.1)

This would be prevented by setting the default of week_start to be week_start = getOption("aweek.week_start"). This way, the user can set options(aweek.week_start = "Sunday") at the top of their script to force all week operations to default to the correct week.
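The proposed pattern can be sketched as follows. The option name `aweek.week_start` is the one suggested above; `week_floor()` is a hypothetical stand-in for any function that takes a `week_start` argument, and the fallback to "Monday" when the option is unset is an assumption of this sketch.

```r
# Hypothetical function whose week_start default is read from a global option.
week_floor <- function(d, week_start = getOption("aweek.week_start", "Monday")) {
  days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")
  ws <- match(week_start, days)
  if (is.na(ws)) stop("unknown week_start: ", week_start)
  d <- as.Date(d)
  wd <- as.integer(format(d, "%u"))  # 1 = Monday ... 7 = Sunday
  d - ((wd - ws) %% 7L)
}

d <- as.Date("2019-05-19")  # a Sunday

default_floor <- week_floor(d)               # Monday start unless the option is set
old <- options(aweek.week_start = "Sunday")  # set once at the top of a script
sunday_floor <- week_floor(d)
options(old)                                 # restore the previous setting
```

Because the default is evaluated at call time, changing the option affects every later call without touching individual function calls.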

Fix incorrect test

Problem

A single test is currently failing on CRAN and will continue to fail for the rest of the year until I fix it:

Version: 1.0.1
Check: tests
Result: ERROR
     Running 'spelling.R' [0s/0s]
     Running 'testthat.R' [2s/3s]
    Running the tests in 'tests/testthat.R' failed.
    Complete output:
     > library(testthat)
     > library(aweek)
     >
     > test_check("aweek")
     == Failed tests ================================================================
     -- Failure (test-make_aweek.R:22:3): get_aweek() will always default to the first weekday of the year --
     `d1` not identical to `d2`.
     Mean relative difference: 0.0003757179
    
     [ FAIL 1 | WARN 0 | SKIP 0 | PASS 193 ]
     Error: Test failures
     Execution halted
Flavor: r-devel-linux-x86_64-debian-clang

Code Causing the Problem

The following code incorrectly assumes that 1 January will land in the first week of the year, which is not the case for 2021 since 2020 had 53 weeks

aweek::date2week("2021-01-01")
#> <aweek start: Monday>
#> [1] "2020-W53-5"

Created on 2021-01-02 by the reprex package (v0.3.0)

test_that("get_aweek() will always default to the first weekday of the year", {
d1 <- as.Date(get_aweek())
d2 <- as.Date(date2week(sprintf("%s-01-01", format(Sys.Date(), "%Y")), floor_day = TRUE))
expect_identical(d1, d2)
})

Solution

  • Fix the test so that it's actually testing for the first week of the year instead of trying to be clever about it
  • add a test that tests for 20 years in the past and the future.
  • submit to CRAN
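A date-independent version of that check can be sketched in base R. By the ISO 8601 definition, 4 January always falls in week 1, whereas 1 January may belong to the last week of the previous year. The year is pinned here so the result does not depend on `Sys.Date()`; the variable names are illustrative only.

```r
this_year <- 2021L  # pinned so the check never changes with the calendar
years <- (this_year - 20L):(this_year + 20L)

# ISO week number of 1 January and 4 January for each year.
jan1_week <- format(as.Date(sprintf("%d-01-01", years)), "%V")
jan4_week <- format(as.Date(sprintf("%d-01-04", years)), "%V")

# 4 January is in week 1 for every year in the range.
stopifnot(all(jan4_week == "01"))

# Years in which 1 January is not in ISO week 1 (2021 is among them).
years_jan1_not_week1 <- years[jan1_week != "01"]
```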

Failing failure of a test in R-devel

From CRAN:

Dear maintainer,

Pls see
<https://CRAN.R-project.org/web/checks/check_results_aweek.html>: more
ERRORs for r-devel will follow.

In r-devel, as.POSIXlt now coerces NULL to an empty POSIXlt object, so

R> date2week(NULL)
Error in date2week(NULL) : 
 NULL could not be converted to a date. as.POSIXlt() returned this error:
do not know how to convert 'x' to class “POSIXlt”

no longer fails as before.

Not sure why you provide a test for this in your package. 

In any case: please fix as necessary (drop the test?).

Please correct before 2020-05-12 to safely retain your package on CRAN.

Best
-k

Thanks for the editorial.

Related work

I just found that the lubridate package also provides functions related to isoweek and epiweek.

  • lubridate::isoweek(), lubridate::epiweek(), lubridate::isoyear(), lubridate::epiyear(), and the argument of week_start in functions such as wday()
