Finding Local Groupings of Time Series, ECML-PKDD 2022

Introduction

This is the anonymous repository for Finding Local Groupings of Time Series, ECML-PKDD 2022.

Dependencies

This application requires Python >= 3.7 and the latest versions (as of April 2022) of the following packages:

numba numpy pyts

Creating local groupings

Z-Grouping is a four-step framework for detecting local groupings of time series, by applying temporal abstraction and semigeometric tile detection.

The algorithm can be run by importing one method from this repository.

from zgrouping.grouping import createGroupings

The algorithm receives the following parameters:

  • matrices: event label matrix (see Step 1). This matrix can be created by applying utils.createChannels.
  • alpha: a purity threshold (see Steps 1, 2).
  • debug: print and debug option.
  • accept: turn on the grouping quality validation function (Step 4).

The following parameters are optional and take effect only when the accept function is turned on (accept = True):

  • c: a target global grouping (i.e., class label)
  • metas: global grouping distribution (in general, y values of the dataset)
  • eta: quality control score.

The algorithm returns the following values:

  • R: a set of local groupings (Step 2)
  • G: a set of associations (Step 3)
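
The relationship between accept and the optional parameters can be sketched as a small helper. This is only an illustration: build_grouping_kwargs is a hypothetical name that does not exist in the repository, and it assumes that createGroupings accepts the parameter names listed above as keyword arguments, which the text here does not confirm.

```python
def build_grouping_kwargs(matrices, alpha, accept=False, debug=False,
                          c=None, metas=None, eta=None):
    """Collect arguments for a createGroupings-style call (hypothetical helper).

    Assumption: the parameter names from this README are valid keyword names.
    """
    kwargs = {"matrices": matrices, "alpha": alpha,
              "debug": debug, "accept": accept}
    validation = {"c": c, "metas": metas, "eta": eta}
    if accept:
        # Forward only the validation parameters that were actually supplied.
        kwargs.update({k: v for k, v in validation.items() if v is not None})
    else:
        # c, metas and eta are only meaningful when accept=True.
        given = [k for k, v in validation.items() if v is not None]
        if given:
            raise ValueError(f"{given} only apply when accept=True")
    return kwargs


# Hypothetical usage; the alpha value and the (R, G) return order are assumptions:
# from zgrouping.grouping import createGroupings
# R, G = createGroupings(**build_grouping_kwargs(matrices, alpha=0.8))
```

Rejecting c, metas, or eta when accept is off is a design choice of this sketch, meant to surface a likely configuration mistake early.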

Synthetic dataset generator

The synthetic dataset generator produces a dataset in which local groupings are shared among global groupings, containing the local patterns we would like to retrieve.

The generator can be run by importing one method from this repository.

from zgrouping.syntheticGenerator import createSyntheticData

A detailed explanation of the nature of the dataset is available in the supplementary material. The generator receives the following parameters:

  • c: number of global groupings
  • tc: the number of member instances for each global grouping
  • tl: length of each time series
  • no_outliers: number of outliers
  • outlier_size: outlier size
  • amp: amplitudes
  • lineranges: length of straight lines
  • lineheights: height of straight lines

The function returns the following values:

  • samples: a collection of the generated samples (X), with tc*c rows and tl columns.
  • metas: global grouping information for each sample (y - i.e., class labels).

The default values are the same as those used in the experiments of the paper.

The synthetic dataset we used is also available in this repository (in the datasets directory).
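
To make the output shape concrete, here is a toy stand-in written with the standard library only. It is not the repository's generator: toy_synthetic_data is a hypothetical function, it ignores outliers and straight-line segments, and the sine-plus-noise construction is an assumption chosen purely for illustration. It only mirrors the documented relationship between c, tc, tl, samples, and metas.

```python
import math
import random


def toy_synthetic_data(c=3, tc=4, tl=50, amp=1.0, seed=0):
    """Toy data: c global groupings, tc members each, series of length tl."""
    rng = random.Random(seed)
    samples, metas = [], []
    for label in range(c):
        for _ in range(tc):
            # Each global grouping gets its own frequency; noise makes members differ.
            series = [
                amp * math.sin(2 * math.pi * (label + 1) * t / tl) + rng.gauss(0, 0.1)
                for t in range(tl)
            ]
            samples.append(series)
            metas.append(label)  # class label of this sample
    return samples, metas


samples, metas = toy_synthetic_data(c=3, tc=4, tl=50)
print(len(samples), len(samples[0]), metas[:5])  # 12 50 [0, 0, 0, 0, 1]
```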

Utilities

The utils module provides useful functions for the algorithm.

  • utils.znorm: apply time series z-normalization.
  • utils.createChannels: create multiple channels for the multi level tiling (Steps 1-2).
  • utils.SAXify: apply symbolic aggregate approximation to any numpy array.
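
For readers unfamiliar with the two preprocessing steps named above, the following is a minimal sketch of z-normalization followed by symbolic aggregate approximation (SAX), using only the standard library (Python 3.8+ for NormalDist and fmean). The functions znorm and sax here are illustrative and are not the signatures of utils.znorm or utils.SAXify; the repository's own functions may differ in binning and segmentation details.

```python
from statistics import NormalDist, fmean, pstdev


def znorm(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # constant series: avoid division by zero
    return [(x - mu) / sigma for x in series]


def sax(series, n_segments, alphabet="abcd"):
    """Z-normalize, average over segments (PAA), then map means to symbols."""
    if not 0 < n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    z = znorm(series)
    n = len(z)
    # Piecewise aggregate approximation: mean of each roughly equal-width window.
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    paa = [fmean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]
    # Breakpoints split the standard normal distribution into equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    # The symbol index is the number of breakpoints lying below the segment mean.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))  # abcd
```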

Example

An example showing how to use our algorithm and the synthetic generator is available in the Jupyter notebook example.ipynb. This example goes through the whole process of generating local groupings and associations from the NumPy dataset produced by the synthetic dataset generator.

Experiment scripts

The experiment code used in the paper is available in this repository. Run experimentScript.py from the console to reproduce the results. The parameters used in the paper are already applied.

Results

We provide the following materials for reproducing the experiments conducted in the paper.

  • The main results of the experiments are available in the main text of the paper.
  • The full results of the experiments are available as supplementary material, which can be found in the submission form.
  • The script to generate the full results is provided in this repository (experimentScript.py).
    • The parameters are set to the ones used in the experiments.

Datasets

The public COVID-19 dataset was collected from the Johns Hopkins University repository: https://github.com/CSSEGISandData/COVID-19

The S&P 500 stock dataset was collected from this repository: https://www.kaggle.com/camnugent/sandp500

For reproducibility, this repository includes both the original datasets (stock_original.csv, stock_labels_original.csv, covid19original.csv) and the processed ones (covid.pickle, stock.pickle).
