Comments (7)
Upfront, I need to clarify that my benchmark is not that fair yet, because so far I just return a character vector, not a POSIXct formatted object. calcUnique doesn't make a difference in my current benchmark since I used a vector of unique dates.
I will try to remember to report back here once I have done some more rigorous testing with chronos and a better interface.
from anytime.
No need to report back here then if you also use unique values.
(Aside: That's godawfully formatted code. But that's just me and 25+ years of ESS use.)
I have the feeling that has come up before. Did you check old issues?
Could you also please measure the overhead of computing unique values at those sizes for vectors that are in fact unique, without replicates?
Maybe add a third column using this value:
calcUnique: A logical value with a default value of ‘FALSE’ that tells
          the function to perform the ‘anytime()’ or ‘anydate()’
          calculation only once for each unique value in the ‘x’
          vector. It results in no difference in inputs or outputs, but
          can result in a significant speed increases for long vectors
          where each timestamp appears more than once. However, it will
          result in a slight slow down for input vectors where each
          timestamp appears only once.
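The idea described in that help text can be sketched in base R. This is only an illustration of the "parse each unique value once, then expand" technique, not anytime's actual implementation: `parse_unique` and `parse_fun` are hypothetical names, and `as.POSIXct()` stands in for the real parser.

```r
# Sketch (assumption: base R only; as.POSIXct() stands in for an
# expensive parser such as anytime()).
parse_unique <- function(x, parse_fun = function(s) as.POSIXct(s, tz = "UTC")) {
  ux <- unique(x)            # distinct input values
  parsed <- parse_fun(ux)    # parse each distinct value exactly once
  parsed[match(x, ux)]       # expand back to the original order and length
}

# Two distinct timestamps, each repeated three times
x <- rep(c("2024-01-01 12:00:00", "2024-06-15 08:30:00"), times = 3)
res <- parse_unique(x)
```

With many replicates the parser runs on far fewer values. With no replicates, the `unique()` and `match()` calls are pure overhead, which matches the slowdown the help text warns about.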
Only saw #109 and calcUnique now... Sorry for the duplicate (and already fixed) issue.
Results with calcUnique = TRUE for future visitors:
@schochastics Please see above -- @etiennebacher did some digging and touches upon an issue that may matter for your benchmarks too. I have the default for unique set to 'off' because where I came from (my former field of high-ish frequency finance) our timestamps do tend to be unique. By now that field is of course more occupied with nanosecond resolution, so POSIXct is of limited usefulness; that was different when I wrote anytime. And for dates it is definitely an issue, as it is so much easier for values to clash.
@etiennebacher We could think about some data.table-alike heuristics here. Maybe if N > someValue, say 10k, we sample 100 and see if we have replication. Or maybe sample blockwise, ten blocks of ten? This would require some thinking, but you do document that the gain could be substantial. Worth doing as a heuristic?
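A minimal sketch of the first variant (sample 100 values once N exceeds roughly 10k), in base R. The function name `likely_has_duplicates` and both thresholds are hypothetical choices for illustration; nothing like this exists in anytime as far as this thread shows.

```r
# Hypothetical heuristic: guess whether a long vector contains replicates
# by looking at a small random sample. Name and thresholds are illustrative.
likely_has_duplicates <- function(x, min_n = 10000L, sample_size = 100L) {
  n <- length(x)
  if (n <= min_n) return(FALSE)               # short vectors: skip the check
  idx <- sample.int(n, min(n, sample_size))   # positions drawn without replacement
  anyDuplicated(x[idx]) > 0L                  # TRUE if the sample repeats a value
}

set.seed(42)
heavy  <- rep(c("2024-01-01", "2024-01-02"), 10000L)  # two values, many repeats
unique_only <- as.character(seq_len(20000L))          # no repeats at all
flag_heavy  <- likely_has_duplicates(heavy)
flag_unique <- likely_has_duplicates(unique_only)
```

A sample this small only detects heavy replication: a vector with a handful of repeated values among many unique ones will usually pass as unique, so the check can miss a gain but never reports duplicates that are not there.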
"Worth doing as a heuristic?"
Could be, but I'm not an active user of anytime as I rarely have a use case for it, so I don't think my opinion matters much here. I was simply intrigued by the benchmarks of @schochastics and explored a bit to see if there was some low-hanging fruit.