overtime

Installation

devtools::install_github("clarkjoe/overtime", dependencies = TRUE)

Getting Started

The intent of overtime is to help machine learning developers generate a large number of summary statistics very quickly. Many summary statistics are computed by default, and more will be added in future releases. The default summary statistics are:

  1. sum
  2. mean
  3. median
  4. sd
  5. max
  6. min
  7. sd / mean (coefficient of variation)
  8. OO2
  9. OO3
  10. Largest positive sequence
  11. Largest negative sequence
  12. Largest zero sequence
  13. Largest increasing sequence
  14. Largest decreasing sequence
  15. Largest increasing positive sequence
  16. Largest decreasing positive sequence
  17. Largest increasing negative sequence
  18. Largest decreasing negative sequence
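
As a rough illustration of what a few of these statistics mean, the sketch below computes them in base R for a single numeric vector. It does not use overtime's internals. `longest_run` is a hypothetical helper, and reading "largest ... sequence" as "longest run of consecutive values meeting the condition" is an assumption, so the package's own definitions may differ. The input values are made up for the example.

```r
# Hypothetical helper, not part of overtime:
# length of the longest run of consecutive TRUE values.
longest_run <- function(flags) {
  runs <- rle(flags)
  true_lengths <- runs$lengths[runs$values]
  if (length(true_lengths) == 0) 0L else max(true_lengths)
}

# Made-up counts for one group over three consecutive days.
x <- c(1, 5, 7)

basic_stats <- c(
  sum    = sum(x),
  mean   = mean(x),
  median = median(x),
  sd     = sd(x),
  max    = max(x),
  min    = min(x),
  cv     = sd(x) / mean(x)  # coefficient of variation
)

sequence_stats <- c(
  longest_positive   = longest_run(x > 0),
  longest_zero       = longest_run(x == 0),
  # Counted in steps between neighbouring values, not in values.
  longest_increasing = longest_run(diff(x) > 0)
)

print(round(basic_stats, 2))
print(sequence_stats)
```

Run-length encoding (`rle`) keeps the helper short and lets the same function serve every "sequence" statistic by changing only the logical condition passed in.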

The package itself is small, but it applies to a wide range of use cases. Below is a simple example:

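library(overtime)
library(tidyverse)
library(magrittr)

# Load the raw data
data <- readRDS('../data/rawData.rds')

# Compute the summary statistics; each group's results are nested in a tibble column
nestedData <- data %>%
  overtime_by("day") %>%
  overtime_get()

# Expand the nested results into one column per statistic
unnestedData <- nestedData %>%
  overtime_unnest()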
BOOM! It's that simple.


What it looks like

nestedData

AccountNumber  D_Cognostics
A              tibble [1 x 18]
B              tibble [1 x 18]
C              tibble [1 x 18]

unnestedData

AccountNumber  D_Count  D_Count  D_Count  D_Mean  D_Median  D_SD  D_Max  D_Min  ...
A              13       13       13       4.33    5         3.06  7      1      ...
B              10       10       10       3.33    4         2.08  5      1      ...
C              0        0        0        0       0         0     0      0      ...

Compatible data format

Currently, only a specific data format works with overtime. Here is an example data format:

AccountNumber  Date        Count
A              2014-11-01  1
A              2014-11-02  4
B              2014-11-01  0
B              2014-11-02  12
C              2014-11-02  8
C              2014-11-01  47

There must be:

  • A grouping variable
  • Continuous dates
    • Every group must have the same number of dates
    • There can be no date jumps (e.g. 2014-11-01 | 2014-11-03)
  • Numeric counts
    • Each cell must be a non-negative integer (e.g. no NA or -1)
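
These requirements can be checked before handing data to overtime. The following is a minimal base R sketch, not a function provided by the package: `check_overtime_input` and its argument names are hypothetical, and the column names are taken from the example table above. Zero is treated as a valid count here because the example data contains a 0, which is an assumption about what the package accepts.

```r
# Hypothetical validator, not part of overtime.
# Returns a character vector of problems; empty means the checks passed.
check_overtime_input <- function(df, group = "AccountNumber",
                                 date = "Date", count = "Count") {
  problems <- character(0)

  # Counts: numeric, no NA, no negatives, whole numbers only.
  counts <- df[[count]]
  if (anyNA(counts) || !is.numeric(counts) ||
      any(counts < 0 | counts != floor(counts))) {
    problems <- c(problems, "counts must be non-missing, non-negative integers")
  }

  dates_by_group <- split(as.Date(df[[date]]), df[[group]])

  # Every group must cover the same number of distinct dates.
  n_dates <- vapply(dates_by_group, function(d) length(unique(d)), integer(1))
  if (length(unique(n_dates)) > 1) {
    problems <- c(problems, "groups have different numbers of dates")
  }

  # Within each group, consecutive dates must be exactly one day apart.
  has_gap <- vapply(dates_by_group, function(d) {
    any(as.numeric(diff(sort(unique(d)))) > 1)
  }, logical(1))
  if (any(has_gap)) {
    problems <- c(problems, "at least one group has a date jump")
  }

  problems
}

# Made-up data that satisfies all the rules.
good <- data.frame(
  AccountNumber = c("A", "A", "B", "B"),
  Date = c("2014-11-01", "2014-11-02", "2014-11-01", "2014-11-02"),
  Count = c(1, 4, 0, 12)
)

# Made-up data with a date jump for A and a negative count.
bad <- data.frame(
  AccountNumber = c("A", "A", "B", "B"),
  Date = c("2014-11-01", "2014-11-03", "2014-11-01", "2014-11-02"),
  Count = c(1, -1, 0, 12)
)

print(check_overtime_input(good))
print(check_overtime_input(bad))
```

Returning a list of problems rather than stopping at the first one makes it easier to clean a data set in a single pass.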

Reference

An article that discusses the benefits of the data generated by overtime.

Notes

This package is in development. Functionality and documentation are being improved and polished.
