eddelbuettel / anytime

Anything to POSIXct or Date Converter

Home Page: https://eddelbuettel.github.io/anytime

License: GNU General Public License v2.0

R 66.98% C++ 27.42% Shell 0.71% TeX 4.89%
r posixct date datetime boost conversions cran c-plus-plus-11 rcpp cpp11

anytime's People

Contributors

bobjansen, christophsax, eddelbuettel, nachti, russellpierce, stephenbfroehlich

anytime's Issues

incorrect time returned from yyyymmddhhmmss

I often use a timestamp created from Sys.time() as a human-readable key (either as a char or int to show when scripts were run). Getting this back into POSIXct with anytime would be a nice addition, however, it doesn't quite get the time stamp correct:

t <- gsub("-|:| ", "", Sys.time())
> t
[1] "20160913122601"
> anytime(t, tz = "Australia/Melbourne")
[1] "2016-09-13 22:01:00 AEST"  ## the time appears incorrect

Is this something you think should be added?
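
Until such a format is supported, a compact timestamp like this can be parsed with an explicit format string in base R. This is a minimal sketch, not part of anytime; it uses UTC rather than the Australia/Melbourne zone from the report so the result is reproducible, and the variable names are illustrative.

```r
# Compact "yyyymmddhhmmss" key, as produced by gsub() on Sys.time()
stamp <- "20160913122601"

# Parse with an explicit strptime-style format; no separators are needed
parsed <- as.POSIXct(stamp, format = "%Y%m%d%H%M%S", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
```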

New format

R> anytime:::testFormat("%a %d %b %Y %H:%M:%S %q", "Sat, 22 Oct 2016 09:06:43 -0400")
[1] "2016-10-22 09:06:43 CDT"
R>

The string Sat, 22 Oct 2016 09:06:43 -0400 is a common email header time format. If I recall correctly, this is the RFC822 format (with the 2-digit year expanded to 4 digits). It would be good to have it.

Per the Boost Date_Time documentation, though, the timezone offset is not parsed on input.

utcdate and anydate don't seem to play well with factors (OSX)

I am running R on OSX within emacs or from terminal and run into problems like below:

as.factor(20160101 + 0:2)
### [1] 20160101 20160102 20160103
### Levels: 20160101 20160102 20160103
anytime(as.factor(20160101 + 0:2))
### [1] "2015-12-31 23:00:00 GMT" "2016-01-01 23:00:00 GMT"
### [3] "2016-01-02 23:00:00 GMT"
utctime(as.factor(20160101 + 0:2))
### [1] "2016-01-01 GMT" "2016-01-02 GMT" "2016-01-03 GMT"
utcdate(as.factor(20160101 + 0:2))
### [1] "1400-01-01" "1400-02-01" "1400-03-01"
anydate(as.factor(20160101 + 0:2))
[1] "1400-01-01" "1400-02-01" "1400-03-01"
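
One thing worth keeping in mind with factors is that the stored values are integer codes, not the labels. Whether that explains the output above is an assumption, but the base R sketch below shows the difference and converts the labels instead of the codes.

```r
f <- as.factor(20160101 + 0:2)

# The underlying integer codes are 1, 2, 3, not the date-like labels
codes  <- as.integer(f)
labels <- as.character(f)

# Converting the labels (not the codes) gives the intended dates
dates <- as.Date(labels, format = "%Y%m%d")
```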

R session crashes with nonsensical values on Windows

This crashes the R session on Windows:

anytime::anytime(c("2.343423423", "3.435435345"))

It works fine with useR = TRUE:

anytime::anytime(c("2.343423423", "3.435435345"), useR = TRUE)
# [1] NA NA

and also on other platforms.

Using R 3.5 on Windows 7, with latest anytime.

Condition handling for NA values

Hello,

Are there any plans to change condition handling for NA values so that their presence in a vector simply throws a warning versus an error? For instance...

anydate(c("20010101", "02/02/1902", NA))

... gives me the error:

Error in eval(substitute(expr), envir, enclos) : Inadmissable input: NA

I have gotten around this issue of parsing date formats while accounting for the presence of NA values by using lubridate::parse_date_time(), but think that adding such functionality to your package, combined with its parsing flexibility (and the keystrokes saved by not having to specify the orders parameter in parse_date_time()) would make it an extremely attractive option for daily use.

Thank you,
Mike
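
A wrapper along these lines could serve as a stopgap. It is a sketch only: parse_skipping_na and iso_parser are hypothetical names, and the stand-in parser uses base R so the example runs without anytime; anydate could be passed in its place.

```r
# Hypothetical helper: apply `parser` only to the non-NA elements of `x`
parse_skipping_na <- function(x, parser) {
  out <- rep(as.Date(NA), length(x))
  ok <- !is.na(x)
  if (any(ok)) out[ok] <- parser(x[ok])
  # Warn instead of failing when NA values are present
  if (any(!ok)) warning(sum(!ok), " NA value(s) left as NA")
  out
}

# Stand-in parser for the sketch
iso_parser <- function(s) as.Date(s, format = "%Y-%m-%d")

res <- parse_skipping_na(c("2001-01-01", "1902-02-02", NA), iso_parser)
```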

Milliseconds are not parsed for certain formats

When I run (from a fresh install from CRAN and when running on my version of master)

options(digits.secs = 6)
dput(anytime(c("20160901 101112.345678", "01Sep2016 101112.345678")))
structure(c(1472688672.34568, 1472688672), 
          class = c("POSIXct","POSIXt"), tzone = "Australia/NSW")

the milliseconds of the second input seem not to be parsed, which doesn't seem to happen in tests/allFormats.Rout.save.

anydate transforms to previous day

anydate transforms date to previous day, while anytime correctly transforms the dates:

> anydate(20150101)
[1] "2014-12-31"
> anydate("2015/01/01")
[1] "2014-12-31"
> anytime(20150101)
[1] "2015-01-01 CET"
> anytime("2015/01/01")
[1] "2015-01-01 CET"

Parsing single digit months

I wanted to answer this question on Stack Overflow with "anytime".

These are the dates they have; the format is "single digit month, day, full year":

2/10/2016  
4/4/2016  
5/8/2016  
10/1/2016

However, anydate() only works on the first entry, not the rest. In the PDF at CRAN, we find:

Issues
The Boost Date_Time library cannot parse single digit months or days. So while ‘2016/09/02’
works (as expected), ‘2016/9/2’ will not. Other non-standard formats may also fail.
The is a known issue (discussed at length in issue tick 5) where Australian times are off by an hour.
This seems to affect only Windows, not Linux.

So, apparently this is a known bug/issue. Are there any workarounds?

Or should we just use as.POSIXct(df$final_date, format = "%d/%m/%Y")?

anydate("2/10/2016")
[1] "2016-02-10"
anydate("4/4/2016")
[1] NA
anydate("5/8/2016")
[1] NA
anydate("10/1/2016")
[1] NA

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Spanish_Peru.1252  LC_CTYPE=Spanish_Peru.1252   
[3] LC_MONETARY=Spanish_Peru.1252 LC_NUMERIC=C                 
[5] LC_TIME=Spanish_Peru.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] anytime_0.0.4 dplyr_0.5.0  

loaded via a namespace (and not attached):
[1] lazyeval_0.2.0  magrittr_1.5    R6_2.1.3       
[4] assertthat_0.1  rsconnect_0.4.3 DBI_0.5        
[7] tools_3.3.1     tibble_1.2      Rcpp_0.12.7
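
One possible workaround is to pad single-digit components with a leading zero before parsing. The sketch below uses base R only and assumes month/day/year order, as described in the report; the padded strings could equally be handed to anydate.

```r
x <- c("2/10/2016", "4/4/2016", "5/8/2016", "10/1/2016")

# Pad every stand-alone single digit with a leading zero
padded <- gsub("\\b(\\d)\\b", "0\\1", x, perl = TRUE)

# Month/day/year order is assumed here
dates <- as.Date(padded, format = "%m/%d/%Y")
```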

anydate sometimes yields NAs

Hi Dirk, for some input values I am getting NAs. Is this a bug?

This might be related to bug #33, which I reported before but which was fixed. SOv report

library(anytime)  # V 0.3.0
a <- c("3/22/2013 0:00", "3/21/2012 0:00", "2/19/2014 0:00", "12/5/2013 0:00", "5/8/2013 0:00", "10/15/2010 0:00")
anydate(a)
# [1] "2013-03-22" "2012-03-21" "2014-02-19" NA           NA          
# [6] "2010-10-15"

sessionInfo()

R version 3.3.2 (2016-10-31)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] anytime_0.3.0

loaded via a namespace (and not attached):
[1] tools_3.3.2 Rcpp_0.12.11 RApiDatetime_0.0.3

Allow format strings to be added

This would make it possible to go beyond the currently fixed set and handle odd formats like

  • %y for two-digit year, or even
  • %I %p for 12 hour time and am/pm,
  • any other format we don't currently have.

Thanks & Get back index into format array that matched each date/time?

Dirk, this is fantastic. I all too frequently get files with inconsistent date/time formats in a column (thanks mostly to Excel!). One feature request: when doing data quality audits, it would be useful to show the frequency of each format in the input stream, which would be trivial if there were an option to return the index of the matching format rather than the date/time. -Jim
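
The idea of a user-supplied format list that also reports which format matched can be sketched in base R. This is not how anytime works internally; match_format is a hypothetical helper, and the formats and inputs are made up for illustration.

```r
# Hypothetical helper: index of the first format that parses each element
match_format <- function(x, formats) {
  idx <- rep(NA_integer_, length(x))
  for (i in seq_along(formats)) {
    parsed <- as.POSIXct(x, format = formats[i], tz = "UTC")
    hit <- is.na(idx) & !is.na(parsed)   # keep the earliest match only
    idx[hit] <- i
  }
  idx
}

formats <- c("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")
x <- c("2016-09-01", "09/02/2016", "03.09.2016", "garbage")

idx <- match_format(x, formats)

# Frequency of each matched format, for a data quality audit
table(formats[idx], useNA = "ifany")
```

Trying formats in a fixed order means the first match wins, so more specific formats should come before more permissive ones.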

POSIXct to Date conversion could use timezone

anydate() seems to be missing the timezone argument to as.Date() internally ...

library(anytime)

tzone <- "America/New_York"
x <- as.POSIXct("2017-01-01 19:00",tzone)
anydate(x,tzone)
# "2017-01-02"

utcdate() should also be changed

Thanks for the awesome package!
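
The effect can be reproduced in base R alone, where as.Date() on a POSIXct value accepts a tz argument. A short sketch with the values from the report:

```r
tzone <- "America/New_York"
x <- as.POSIXct("2017-01-01 19:00", tz = tzone)

# Converting in UTC: 19:00 in New York is already the next day in UTC
as_utc <- as.Date(x, tz = "UTC")

# Converting in the timestamp's own zone keeps the local calendar day
as_local <- as.Date(x, tz = tzone)
```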

Occasionally producing inconsistent results

Occasionally anydate() produces inconsistent results with integers representing dates in yyyyMMdd format, but it is not easy to reproduce consistently.

> anytime::anydate(20160101L)
[1] "1400-01-01"

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 24 (Server Edition)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.3.3        Rcpp_0.12.11       RApiDatetime_0.0.3 anytime_0.3.0     

another date format - day missing preceding zeros

I'd like the package to handle this, if possible

library(anytime)
library(stringr)
library(dplyr)
 df <- data.frame(date=c("Apr 3, 2016","Apr 13, 2016"),stringsAsFactors = FALSE)

As expected

df_new <-df %>% 
  mutate(newdate=anydate(date))
glimpse(df_new)

Observations: 2
Variables: 2
$ date    <chr> "Apr 3, 2016", "Apr 13, 2016"
$ newdate <date> NA, 2016-04-13

I started on a hack (which would insert a 0 if the ", " was in a particular position), but at the first stage newdate was returned as a double:

df_new <-df %>% 
  mutate(newdate=ifelse(str_sub(date,6,6)==",",anydate(Sys.Date()),anydate(date)))
glimpse(df_new)

Observations: 2
Variables: 2
$ date    <chr> "Apr 3, 2016", "Apr 13, 2016"
$ newdate <dbl> 17288, 16904
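
The double column comes from ifelse(), which drops the Date class and returns the underlying day counts. The base R sketch below shows that, restores the class, and pads the single-digit day with a regular expression; the pattern is an assumption about the "Mon d, yyyy" layout shown above.

```r
d <- as.Date(c("2016-04-03", "2016-04-13"))

# ifelse() drops the Date class and returns the underlying day counts
stripped <- ifelse(c(TRUE, FALSE), d, d)

# Restore the class from the day counts (days since 1970-01-01)
restored <- as.Date(stripped, origin = "1970-01-01")

# Pad a single-digit day, e.g. "Apr 3, 2016" -> "Apr 03, 2016"
padded <- sub("^([A-Za-z]{3}) (\\d),", "\\1 0\\2,",
              c("Apr 3, 2016", "Apr 13, 2016"), perl = TRUE)
```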

Counter intuitive result when using anydate, is it a config problem I have ?

I have been trying to use anytime these days and I see results that seem counterintuitive to me.
I have the following:

anytime:::getTZ()
[1] "Europe/London"
anytime('2016-05-12 15:00:00')
[1] "2016-05-12 14:00:00 BST" # which I would have expected to be "2016-05-12 15:00:00 BST"
anytime('2016-12-12 15:00:00')
[1] "2016-12-12 14:00:00 GMT" # which I would have expected to be "2016-12-12 15:00:00 GMT"

Reading the vignette, I understand from the signature
function (x, tz = getTZ(), asUTC = FALSE) that tz is NOT the timezone anytime uses to parse the data (parsing defaults to local time, or UTC if asUTC is TRUE) but the timezone used for display. This makes sense: when you read data from a string, you probably assume that the data is in your timezone.

When I use anydate I then get

anydate('2016-12-12')         
[1] "2016-12-11"
anydate(as.Date('2016-12-12')) # as as.Date('2016-12-12') is already a date 
[1] "2016-12-12"

Is this the expected behaviour ?
I am not sure what I am missing here.

Boost / R conversion differences

In a comment to #36, @statquant shows some useful R code with datetime conversion between R and Boost.

It shows some residual differences for a fraction of the inputs, and we need to drill down to where they stem from. In commits d5e3417 and a4fd956 we added a little helper script which converts numeric time to a string and back using Boost. For core years (from 1902 onwards) this works without discrepancy. We should expand from here to get to the bottom of the other differences.

Another format?

A common format is "Thu Jan 17 09:29:10 EST 2013" but the %Z conversion only works on output.
Hence:

R> library(anytime)
R> anytime("Thu Jan 17 09:29:10 EST 2013")
[1] NA
R> anytime:::testFormat("%a %b %d %H:%M:%S %z %Y", "Thu Jan 17 09:29:10 EST 2013")
[1] NA
R> anytime:::testFormat("%a %b %d %H:%M:%S %Z %Y", "Thu Jan 17 09:29:10 EST 2013")
[1] NA

One can substitute the timezone out:

R> anytime("Thu Jan 17 09:29:10  2013")
[1] "2013-01-17 09:29:10 CST"
R> 

but that is hackish.

A better trick seems to be to just block the three letters with some text, here xxx:

R> anytime:::testFormat("%a %b %d %H:%M:%S xxx %Y", "Thu Jan 17 09:29:10 EST 2013")
[1] "2013-01-17 09:29:10 CST"
R> 

That is probably more useful and worth adding.

Parsing ISO 8601 compatible standard timestamp format (yyyy-mm-ddThh:mm:ss+-ZONE)

Is there any possibility to support ISO 8601 compatible standard timestamp format (yyyy-mm-ddThh:mm:ss+-ZONE) with timezone offset?

library(anytime)
Sys.setenv(TZ=anytime:::getTZ())      ## helper function to try to get TZ
anytime("2016-12-13T17:09:48+01:00")

> anytime("2016-12-13T17:09:48+01:00")
[1] "2016-12-12 23:00:00 UTC"

Thanks,
Dusan
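
For comparison, base R can honour such an offset on input via %z, provided the colon inside the offset is removed first. This is a sketch of a workaround, not of anything anytime does:

```r
x <- "2016-12-13T17:09:48+01:00"

# Base R's %z expects "+0100", so drop the colon inside the offset
no_colon <- sub("([+-]\\d{2}):(\\d{2})$", "\\1\\2", x, perl = TRUE)

# Parse, honouring the offset, and express the instant in UTC
parsed <- as.POSIXct(no_colon, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
```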

Wrong recognition with %m/%d/%y

I want to convert strings formatted like 01/15/17 (January 15th 2017). But when I add the %m/%d/%y format, it recognises random strings as dates.

library(anytime)
string <- "121013_3_1"
anytime(string) #NA

addFormats(c("%m/%d/%y"))
anytime(string) #1999-12-01 UTC

Interestingly, it seems like many characters are actually ignored.

vector <- c("121013_3_1", "1210111", "12z01z_3_1", "121013_3$1")
all(anytime(vector) == "1999-12-01 UTC") #TRUE

Is there another more specific format I could use?
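
One way to guard against such partial matches is to check the overall shape of the string before parsing. The sketch below uses base R; parse_mdy2 is a hypothetical helper, and the same pre-check could be applied before calling anytime.

```r
# Hypothetical helper: parse only strings that are exactly mm/dd/yy
parse_mdy2 <- function(x) {
  out <- rep(as.Date(NA), length(x))
  ok <- !is.na(x) & grepl("^\\d{2}/\\d{2}/\\d{2}$", x, perl = TRUE)
  out[ok] <- as.Date(x[ok], format = "%m/%d/%y")
  out
}

res <- parse_mdy2(c("01/15/17", "121013_3_1", "1210111"))
```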

Europe/London TZ Goes back an Hour

I haven't been able to independently reproduce this issue; however, as I'm sure you are aware, anytime is failing on R-devel for some platforms. I didn't see an open issue here, so I thought I'd open one so that I could track it. It could very well be a fault in R-devel as opposed to anytime per se. However, given that I haven't been able to reproduce the issue, I'm in a tough place in terms of evaluating whether that is so. What are your thoughts/impressions?

Unexpected shift of day when forcing timezone to GMT

The default timezone for my R session is 'Europe/Berlin':

> anytime:::getTZ()
[1] "Europe/Berlin"

However, we strive to store all our data using GMT, so in order to convert the date 20161010 to POSIXct I ran

> anytime(20161010, tz = "GMT")
[1] "2016-10-09 22:00:00 GMT"

which, to my surprise, returned 10pm on the previous day.

Similarly, anydate returns the previous day:

anydate(20161010, tz = "GMT")
[1] "2016-10-09"

Is this intentional?

This is what is stored under the hood:

> dput(anytime(20161010, tz = "GMT"))
structure(1476050400, class = c("POSIXct", "POSIXt"), tzone = "GMT")
> dput(anytime(20161010))
structure(1476050400, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
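
The dput() output shows that both calls store the same instant; only the zone used for printing differs. A base R sketch of that reading, using the value from the report:

```r
# The value stored in both results from the report
instant <- structure(1476050400, class = c("POSIXct", "POSIXt"), tzone = "GMT")

fmt <- "%Y-%m-%d %H:%M:%S"
in_gmt    <- format(instant, format = fmt, tz = "GMT")
in_berlin <- format(instant, format = fmt, tz = "Europe/Berlin")

# Parsing the date as GMT midnight instead gives a different instant
gmt_midnight <- as.numeric(as.POSIXct("2016-10-10", tz = "GMT"))
offset_secs <- gmt_midnight - as.numeric(instant)
```

So the input appears to have been parsed as local (Berlin) midnight and then displayed in GMT, which is two hours earlier on that date.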

Implement a robust testing strategy

Results of calls to anytime depend on local settings which makes it a bit more tricky to write good tests, they can fail in different environments. Ideally we make sure that

  1. Tests run correctly independent of the environment;
  2. Still correctly simulate a wide array of locales inside 1 R session (no stop session, change setting, start session)

anytime is overwriting inputs

This is such a useful package, thanks!

But when I use any of the functions on numeric input, it overwrites the input variable for some reason...

It works fine with integer or character input.

Here's an example:

  library(anytime)

  # this works fine
  x <- "1949-01-01"
  b <- anytime(x)
  x  # i.e., x is not modified
#> [1] "1949-01-01"

  x <- -662688000
  b <- anytime(x)
  x  # x is now equal to b !!!
#> [1] "1949-01-01 01:00:00 CET"

  # If I am feeding an integer, everything is fine again...
  x <- -662688000L
  b <- anytime(x)
  x  
#> [1] -662688000

Crash in RStudio (not in R) when calling testFormat

I'm on:

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=nl_NL.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=nl_NL.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.3

If I run

anytime:::testFormat("%m/%d/%Y", "22/3/2016")

RStudio (Version 0.99.903) crashes immediately, but R run from the command line does not.

My first hypothesis is that memory corruption happens in anytime in a way that only causes problems when the rsession is started like RStudio does and will try to confirm.

anytime function changing the input variable

Hi,
The function anytime(x) will change the object x when applied, in the following example (anytime 0.2.2 with R 3.3.3). Is this the normal behavior of the package?

library(anytime)

# as.numeric(as.POSIXct("2017-01-01 12:34:56", tz = "Asia/Shanghai"))
unix_time1 = 1483245296 
unix_time2 = unix_time1
unix_time3 = unix_time1+1-1

time1 = anytime(unix_time1)
print(time1)
print(unix_time1)
print(unix_time2)
print(unix_time3)

My output is (respectively):
"2017-01-01 12:34:56 CST"
"2017-01-01 12:34:56 CST" #I expected 1483245296
"2017-01-01 12:34:56 CST" #I expected 1483245296
1483245296

Apparently anytime() not handling July in "%d-%b-%Y" format

Here is an interesting issue I had with anytime().
Consider this:

library(anytime)
dtchar = c( "30-Jun-2014 23:30:00", "30-Jun-2014 23:45:00", "01-Jul-2014 00:00:00", "01-Jul-2014 00:15:00", "01-Jul-2014 00:30:00")

Here is what anytime() gave me:

anytime(dtchar)
 [1] "2014-06-30 18:30:00 EDT" "2014-06-30 18:45:00 EDT"
 [3] NA                        NA                       
 [5] NA

Not sure if this is a bug, but it seems to me that in this format ("%d-%b-%Y %H:%M:%S") anytime() is returning NA for the month of July.
Testing with all twelve months shows this:

dtchar2 = c("01-Jan-2014 00:30:00", "01-Feb-2014 00:30:00","01-Mar-2014 00:30:00",
       "01-Apr-2014 00:30:00","01-May-2014 00:30:00", "30-Jun-2014 23:30:00",
  "30-Jul-2014 23:45:00", "01-Aug-2014 00:00:00", "01-Sep-2014 00:15:00",
 "01-Oct-2014 00:30:00", "01-Nov-2014 00:30:00", 
 "01-Dec-2014 00:30:00")

anytime(dtchar2)
[1] "2014-01-01 00:30:00 EST" "2014-02-01 00:30:00 EST"
 [3] "2014-03-01 00:30:00 EST" "2014-04-01 00:30:00 EDT"
 [5] "2014-05-01 00:30:00 EDT" "2014-06-30 23:30:00 EDT"
 [7] NA                        "2014-08-01 00:00:00 EDT"
 [9] "2014-09-01 00:15:00 EDT" "2014-10-01 00:30:00 EDT"
[11] "2014-11-01 00:30:00 EDT" "2014-12-01 00:30:00 EST"

while the base as.POSIXct() handles it correctly

as.POSIXct(dtchar2, format = "%d-%b-%Y %H:%M:%S")
[1] "2014-01-01 00:30:00 EST" "2014-02-01 00:30:00 EST"
 [3] "2014-03-01 00:30:00 EST" "2014-04-01 00:30:00 EDT"
 [5] "2014-05-01 00:30:00 EDT" "2014-06-30 23:30:00 EDT"
 [7] "2014-07-30 23:45:00 EDT" "2014-08-01 00:00:00 EDT"
 [9] "2014-09-01 00:15:00 EDT" "2014-10-01 00:30:00 EDT"
[11] "2014-11-01 00:30:00 EDT" "2014-12-01 00:30:00 EST"

Issue with %d.%m.%Y format

Hi, I am having an issue with the "%d.%m.%Y" format.

dates<-c("20.04.2018", "19.05.2018")
anydate(dates)

Output

[1] NA NA

I am using a Windows 10 64-bit machine.

 packageVersion("anytime")
[1] ‘0.3.0’

R version 3.4.3

Thanks!

asUTC=TRUE isn't working as expected

I could be misunderstanding the purpose of the parameter, but I was hoping that passing asUTC = TRUE would result in the first response having a Z at the end indicating that it is in UTC time.

rfc3339(anytime(today()-1, asUTC = TRUE))
[1] "2017-07-15T20:00:00-0400"
rfc3339(anytime(today()-1, asUTC = FALSE))
[1] "2017-07-15T20:00:00-0400"

is it possible to force the assumption of a constant format

I get a variety of different date formats, but every date column is internally consistent. But I may have to convert several dates on 30 million rows. So a data set with 30 million start and end dates requires 60 million conversions, which is time consuming.

I was thinking that, in this use case, once anytime figures out the date format of the first non-missing date, it doesn't need to do any more checking. From there, it can parse all the rest of the dates using the same format.

My feature request is a function parameter to "assume" consistent date formatting, strictly for speed purposes.

Feel free to close if this doesn't seem consistent with the purpose of the package. And thanks for putting out anytime and all your other packages.
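
The requested behaviour can be sketched in base R: detect the format from the first non-missing value, then parse the whole vector with that one format. parse_with_first_format is a hypothetical helper and the candidate formats are illustrative; this says nothing about how anytime selects formats internally.

```r
# Hypothetical helper: pick the format from the first non-NA value,
# then parse the whole vector with that single format
parse_with_first_format <- function(x, formats, tz = "UTC") {
  first <- x[!is.na(x)][1]
  for (f in formats) {
    if (!is.na(as.POSIXct(first, format = f, tz = tz))) {
      return(as.POSIXct(x, format = f, tz = tz))
    }
  }
  # No candidate format matched: return all NA
  .POSIXct(rep(NA_real_, length(x)), tz = tz)
}

x <- c(NA, "09/01/2016", "10/15/2016")
res <- parse_with_first_format(x, c("%Y-%m-%d", "%m/%d/%Y"))
```

The trade-off is that a column which is not internally consistent silently yields NA for the rows that use a different format.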

new format, possibly

Not a bug. Old USGS river flow files, widely consulted, use the following datetime format:
19881001001500 PDT. Anytime seems unhappy with this format.

ubsan warning at CRAN

See https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-UBSAN/anytime/tests/allFormats.Rout:

> anytime(c("2016-09-01 10:11:12", "2016-09-01 10:11:12.345678"))
/data/gannet/ripley/R/test-clang/BH/include/boost/lexical_cast/detail/converter_lexical_streams.hpp:235:43: runtime error: downcast of address 0x7fffce329628 which does not point to an object of type 'buffer_t' (aka 'basic_unlockedbuf<std::basic_streambuf<char, char_traits<char> >, char>')
0x7fffce329628: note: object is of type 'std::__1::basic_stringbuf<char, std::__1::char_traits<char>, std::__1::allocator<char> >'
 9b 7f 00 00  50 15 42 38 9b 7f 00 00  80 01 46 46 9b 7f 00 00  00 00 00 00 00 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'std::__1::basic_stringbuf<char, std::__1::char_traits<char>, std::__1::allocator<char> >'
SUMMARY: AddressSanitizer: undefined-behavior /data/gannet/ripley/R/test-clang/BH/include/boost/lexical_cast/detail/converter_lexical_streams.hpp:235:43 in 
[1] "2016-09-01 09:11:12.000000 BST" "2016-09-01 09:11:12.345678 BST"

Dismissed at first as a Boost issue; is actually related to the SEXP conversion.

Support for Excel day count as input

Excel stores dates internally as the number of days since 1899-12-30, i.e. today is 42664. Importing Excel files doesn't always work perfectly and it can happen that these dates are not recognized as dates by the import functionality. In that scenario, I'd like to convert these numbers using anytime.

Is support for this date format wanted? Could Excel type dates overlap with other formats?
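
In base R the conversion is a matter of supplying the origin. The sketch below also shows why overlap is a real concern: the same number is a perfectly valid count of seconds since 1970-01-01.

```r
excel_serial <- 42664

# Days since 1899-12-30, the origin given above
as_excel_date <- as.Date(excel_serial, origin = "1899-12-30")

# The same number read as seconds since the Unix epoch
as_unix_time <- as.POSIXct(excel_serial, origin = "1970-01-01", tz = "UTC")
```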

anytime not handling Dutch month abbreviations

Today I ran into an issue with parsing Dutch date strings.

A reproducible example:

# setting the locale to Dutch
Sys.setlocale("LC_TIME", "nl_NL.UTF-8")

Using anydate seems to fall back to UTC:

# Dutch month abbreviation doesn't work
> anydate("2013-mrt-14")
[1] NA

# English one does
> anydate("2013-mar-14")
[1] "2013-03-14"

Specifying the timezone with the tz parameter doesn't work either:

> anydate("2013-mrt-14", tz = 'Europe/Amsterdam')
[1] NA

Any idea what might be wrong?

Parsing time in a TZ other than localtime or UTC?

Have you ever considered expanding asUTC's functionality to allow users to parse time in any timezone they specify?

I regularly work with data that is neither in my local timezone nor in UTC. For example, I might receive data with timestamps in strings that should eventually be formatted as POSIXct with an Eastern timezone. I am physically located in California, and my computer's default timezone is Pacific.

In my standard workflow I would do something like:

> tz <- "America/New_York"
> Sys.setenv(TZ = tz)
> imported_data <- "2016-08-01 01:00:00"
> imported_data <- as.POSIXct(imported_data, format = "%Y-%m-%d %H:%M:%S",  tz = tz)
> imported_data
[1] "2016-08-01 01:00:00 EDT"

I'd love to switch to using anytime (so much faster! so much more intuitive!), but I can't get dates to parse in anything other than UTC (with asUTC = T) or my computer's actual localtime (ie PST, regardless of what I set in my R environment). The output is then adjusted to whatever timezone I've specified after the parsing happens, so the above example returns as "2016-08-01 04:00:00" instead of "2016-08-01 01:00:00". Creating the string after changing the R environment timezone also doesn't fix the issue:

> library(anytime)
> imported_data <- "2016-08-01 01:00:00"
> anytime(imported_data)
[1] "2016-08-01 01:00:00 PDT"
> tz <- "America/New_York"
> anytime:::setTZ(tz)
> anytime(imported_data)
[1] "2016-08-01 04:00:00 EDT"
> imported_data2 <- "2016-08-01 01:00:00"
> anytime(imported_data2)
[1] "2016-08-01 04:00:00 EDT"

In case it helps, I'm running Windows 10. A colleague running Ubuntu 1604 had the same thing happen:

On Ubuntu, R version 3.3.2:
> library(anytime)
Warning message:
In fun(libname, pkgname) : No TZ information found. Falling back to UTC.
> imported_data <- "2016-08-01 01:00:00"
> anytime(imported_data)
[1] "2016-08-01 08:00:00 UTC"
> tz <- "America/New_York"
> anytime:::setTZ(tz)
> anytime(imported_data)
[1] "2016-08-01 04:00:00 EDT"
> imported_data2 <- "2016-08-01 01:00:00"
> anytime(imported_data2)
[1] "2016-08-01 04:00:00 EDT" 

As far as I can tell from reading through some of the closed issues, this is the same problem (wanting to actually PARSE in UTC, as opposed to just convert an already-parsed date to UTC) that led to the creation of the asUTC feature. Would it be possible to get a more flexible implementation that would allow parsing in any specified timezone?

Convert timestamp in milliseconds?

Thanks for a great package. I recently encountered a timestamp in milliseconds (it is strange, I know). Would it be worthwhile to have a function for converting these into POSIXct or Date objects?
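
In the meantime, dividing by 1000 before conversion works in base R. A minimal sketch with an illustrative value:

```r
# Illustrative value: milliseconds since 1970-01-01 00:00:00 UTC
ms <- 1476050400000

# POSIXct counts seconds, so scale down before converting
parsed <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
```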

Milliseconds rounding issues

Dirk,

I was glancing through the anytime functionalities and ended up discovering there must be an issue with rounding.

> options(digits.secs=6)
> format(anytime("2016-02-25 17:34:00.376", tz="America/New_York"), "%Y-%m-%d %H:%M:%OS %Z")
[1] "2016-02-25 11:34:00.375999 EST"
> format(anytime("2016-02-25 17:34:00.377", tz="America/New_York"), "%Y-%m-%d %H:%M:%OS %Z")
[1] "2016-02-25 11:34:00.377000 EST"
> format(anytime("2016-02-25 17:34:00.375", tz="America/New_York"), "%Y-%m-%d %H:%M:%OS %Z")
[1] "2016-02-25 11:34:00.375000 EST"
> options(digits.secs=3)
> format(anytime("2016-02-25 17:34:00.376", tz="America/New_York"), "%Y-%m-%d %H:%M:%OS %Z")
[1] "2016-02-25 11:34:00.375 EST"

[Feature request] Support for Japanese formatted dates

The date format yyyy年mm月dd日 is often used in Japan, which is similar to yyyyYmmMddD format in English.

Anytime handles the English case:

> library(anytime)
> anytime("2016Y12M31D")
[1] "2016-12-31 EST"

It'd be nice if it handled the Japanese case too:

> anytime("2016年12月31日")
[1] NA

anytime(numeric) kills R session

Running below kills R session (RStudio Version 1.0.136.)

library(anytime) #anytime_0.2.1

anytime(41275)

Didn't test it on the dev version; could be a duplicate of issue #56.

sessionInfo

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] anytime_0.2.1

loaded via a namespace (and not attached):
[1] tools_3.3.2  Rcpp_0.12.10

ISO 8601 dates: TZ is ignored

Hi,

I just ran into this issue and was wondering why anytime-0.2.0 truncated the TZ information.
For example, anytime("2011-09-08T18:49:05-07:00") returns [1] "2011-09-08 18:49:05 CEST", which is just wrong.

Wouldn't it be better to

  • support the TZ (compare parsedate for ISO 8601)?
  • at least give a warning in that case?

Thx!

Dates before the epoch cause R to crash

In my data set I had a date of birth that was before January 1, 1970. It crashed Rstudio and R. It took me a long time to find out what was going on.

I don't really know what an epoch is in software but I know that anytime can't handle <1970-01-01

library (anytime)  
anydate("10/20/1970")
anydate("01/01/1970")
anydate("12/01/1969")

The first two work as expected. The last line crashes R
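
For context, the epoch is the reference point from which R counts: dates are stored as days since 1970-01-01 and date-times as seconds since 1970-01-01 00:00:00 UTC, so anything earlier is stored as a negative number. A short base R illustration:

```r
# Dates are stored as days since 1970-01-01
days_epoch  <- as.numeric(as.Date("1970-01-01"))
days_before <- as.numeric(as.Date("1969-12-01"))

# Date-times are stored as seconds since 1970-01-01 00:00:00 UTC
secs_before <- as.numeric(as.POSIXct("1969-12-31 23:59:59", tz = "UTC"))

c(days_epoch, days_before, secs_before)
```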

Problems if NA is first in vector

A vector with date and NA (in that order) works as expected:

> anytime::anydate(c(Sys.Date(), NA))
[1] "2016-10-06" NA          

But the reversed order does not:

> anytime::anydate(c(NA, Sys.Date()))
Error in as.POSIXlt.numeric(anytime(x = x, tz = tz)) : 
  'origin' must be supplied

Would it be possible to recognise dates even if preceded by NAs?
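
Part of the difference is visible before anytime is involved: c() dispatches on its first argument, so a leading plain NA produces a numeric vector and the Date class is lost. A base R sketch, with a typed NA as a possible workaround:

```r
d <- as.Date("2016-10-06")

# With the Date first, c() keeps the Date class
class_date_first <- class(c(d, NA))

# With a plain NA first, the result is a bare numeric vector
class_na_first <- class(c(NA, d))

# A typed NA in first position keeps the class
fixed <- c(as.Date(NA), d)
```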
