sktime-tutorial-pydata-global-2022's Introduction

Welcome to the sktime tutorial at PyData Global 2022

This tutorial is about sktime, a unified framework for machine learning with time series. sktime is a widely used scikit-learn compatible library for learning with time series, featuring various time series algorithms and modular tools.

sktime is easily extensible by anyone, and interoperable with the pydata/numfocus stack.

This sktime tutorial explains basic and advanced sktime pipeline constructs, and the time series transformer, which is the main component in all types of pipelines.

Also recommended:

🎥 general sktime intro tutorial from PyData Global 2021
📺 youtube video of sktime intro at PyData Global 2021

🚀 How to get started

In the tutorial, we will move through notebooks section by section.

You have several options for running the tutorial notebooks:

  • Run the notebooks in the cloud on Binder. For this you don't have to install anything!
  • Run the notebooks on your machine. Clone this repository, get conda, install the required packages (sktime, seaborn, jupyter) in an environment, and open the notebooks with that environment. For detailed instructions, see below. For troubleshooting, see sktime's more detailed installation instructions.
  • Alternatively, use python venv, and/or an editable install of this repo as a package. Instructions below.

Please let us know on the pydata slack if you have any issues during the conference, or join the sktime slack to ask for help anytime.

💡 Description

In time series analysis, multiple, sometimes repetitive, algorithmic steps are often applied to the data. Organising these steps in a clear way is important to enable flexible deployment on multiple data sets and to reproduce results easily. Pipelines offer a solution to this challenge by providing a structure for building flexible sequences of time series algorithms. The modular building blocks of pipelines are "transformers" or "transformations" (in the scikit-learn sense), as well as estimators specific to learning tasks, such as forecasters or time series classifiers. A challenge in learning with time series is the large number of different types of transformations, such as:

  • transformers of a time series to time series, e.g., differencing and detrending
  • transformers of a time series to a row of primitive features/values in a data frame, e.g., time series summary
  • transformers of a time series to a panel of time series, e.g., bootstrap, sliding window
  • transformers that apply to hierarchical time series, e.g., reconciliation or hierarchical aggregation
  • transformers of a pair of time series to a real number, e.g., time series distances or kernels
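
The distinction between these transformer types comes down to the shape of their inputs and outputs. The following is a minimal, library-free sketch of four of the types listed above. It is not sktime's API: the function names and the `Series` alias are hypothetical and chosen for illustration only, and hierarchical transformers are left out for brevity.

```python
from math import sqrt
from statistics import mean
from typing import Dict, List

# Hypothetical alias: a univariate time series as a plain list of floats.
Series = List[float]


def difference(y: Series) -> Series:
    """Series-to-series: first differences (one value shorter than the input)."""
    return [b - a for a, b in zip(y, y[1:])]


def summarize(y: Series) -> Dict[str, float]:
    """Series-to-primitives: one row of summary features."""
    return {"mean": mean(y), "min": min(y), "max": max(y)}


def sliding_windows(y: Series, width: int) -> List[Series]:
    """Series-to-panel: a collection of overlapping sub-series."""
    return [y[i:i + width] for i in range(len(y) - width + 1)]


def euclidean_distance(y1: Series, y2: Series) -> float:
    """Pair of series to a real number; assumes both have equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y1, y2)))


y = [1.0, 3.0, 6.0, 10.0]
print(difference(y))          # [2.0, 3.0, 4.0]
print(summarize(y))           # {'mean': 5.0, 'min': 1.0, 'max': 10.0}
print(sliding_windows(y, 3))  # [[1.0, 3.0, 6.0], [3.0, 6.0, 10.0]]
print(euclidean_distance(y, [1.0, 3.0, 6.0, 7.0]))  # 3.0
```

Each function takes a series but returns a different kind of object, which is why a pipeline needs to know the type of a transformer before it can decide where that transformer may be used.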

sktime provides a framework to distinguish the above, and to use transformers of the various types as components in different types of pipelines, such as:

  • forecasting pipelines, with transformers applied to endogenous, exogenous, or output data,
  • time series classification pipelines, with transformers applied to inputs,
  • compositor pipelines for time series distances or parameter estimators,
  • specialized reduction steps consuming different types of transformers and machine learning estimators,
  • and many more.
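
As an illustration of the first pipeline type, here is a minimal sketch of a forecasting pipeline that applies a transformer to the endogenous series, fits a forecaster on the transformed data, and inverts the transformation on the forecasts. All three classes are hypothetical toy stand-ins written in plain Python; they only mimic the fit/predict pattern and are not sktime classes.

```python
from typing import List


class ToyDifferencer:
    """Hypothetical invertible series-to-series transformer (first differences)."""

    def fit(self, y: List[float]) -> "ToyDifferencer":
        self.last_ = y[-1]  # remembered so that forecasts can be un-differenced
        return self

    def transform(self, y: List[float]) -> List[float]:
        return [b - a for a, b in zip(y, y[1:])]

    def inverse_transform(self, y_diff: List[float]) -> List[float]:
        out, level = [], self.last_
        for d in y_diff:
            level += d
            out.append(level)
        return out


class ToyMeanForecaster:
    """Hypothetical forecaster that predicts the training mean at every step."""

    def fit(self, y: List[float]) -> "ToyMeanForecaster":
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, horizon: int) -> List[float]:
        return [self.mean_] * horizon


class ToyForecastingPipeline:
    """Transform the series, forecast it, then invert the transformation."""

    def __init__(self, transformer, forecaster):
        self.transformer = transformer
        self.forecaster = forecaster

    def fit(self, y: List[float]) -> "ToyForecastingPipeline":
        y_transformed = self.transformer.fit(y).transform(y)
        self.forecaster.fit(y_transformed)
        return self

    def predict(self, horizon: int) -> List[float]:
        return self.transformer.inverse_transform(self.forecaster.predict(horizon))


y = [10.0, 12.0, 14.0, 16.0]
pipe = ToyForecastingPipeline(ToyDifferencer(), ToyMeanForecaster()).fit(y)
print(pipe.predict(3))  # [18.0, 20.0, 22.0]
```

The pipeline itself behaves like a forecaster, so it can be swapped in wherever a forecaster is expected. This composability is the property the tutorial notebooks explore with sktime's own pipeline objects.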

The design challenge is to formalize transformers in a way that a given type of transformer can be used in multiple types of pipeline, and to create pipelines that can use multiple types of transformers. sktime solves this challenge through the "scientific type" formalism, which applies object-orientation-based typing to the transformers and their inputs/outputs. The presentation will also briefly touch on advanced pipelining concepts such as graph pipelines, and on roadmap items inviting contributions.
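
To give a rough feel for typing components by their inputs and outputs, the sketch below attaches an input type and an output type to each component and checks that adjacent steps are compatible before anything is run. This is a simplified illustration under stated assumptions, not sktime's actual implementation: `ToyComponent`, `check_chain`, and the type labels are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ToyComponent:
    """Hypothetical component described only by its input and output types."""

    name: str
    input_type: str   # e.g. "Series", "Panel", "Primitives" (illustrative labels)
    output_type: str


def check_chain(steps: List[ToyComponent]) -> Tuple[str, str]:
    """Return the (input, output) type of a chain, or raise if steps do not fit."""
    for first, second in zip(steps, steps[1:]):
        if first.output_type != second.input_type:
            raise TypeError(
                f"{first.name} outputs {first.output_type}, "
                f"but {second.name} expects {second.input_type}"
            )
    return steps[0].input_type, steps[-1].output_type


detrend = ToyComponent("detrend", "Series", "Series")
summary = ToyComponent("summary", "Series", "Primitives")

print(check_chain([detrend, summary]))  # ('Series', 'Primitives')

try:
    check_chain([summary, detrend])
except TypeError as err:
    print("rejected:", err)
```

Because the check depends only on declared types, the same transformer can be reused in any pipeline whose neighbouring steps match, and incompatible combinations are rejected at construction time rather than failing midway through a fit.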

🎥 Other Video Tutorials:

👋 How to contribute

If you're interested in contributing to sktime, you can find out more about how to get involved here.

Any contributions are welcome, not just code!

Installation instructions for local use

To run the notebooks locally, you will need:

  • a local repository clone
  • a python environment with required packages installed

Cloning the repository

To clone the repository locally:

git clone https://github.com/sktime/sktime-tutorial-pydata-global-2022.git

Using conda env

  1. Create a python virtual environment: conda create -y -n pydata_sktime python=3.9
  2. Install required packages: conda install -y -n pydata_sktime pip sktime seaborn jupyter pmdarima
  3. Activate your environment: conda activate pydata_sktime
  4. If using jupyter: make the environment available in jupyter: python -m ipykernel install --user --name=pydata_sktime

Using python venv

  1. Create a python virtual environment: python -m venv .venv
  2. Activate your environment: source .venv/bin/activate
  3. Install the requirements: pip install sktime seaborn jupyter pmdarima
  4. If using jupyter: make the environment available in jupyter: python -m ipykernel install --user --name=pydata_sktime
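
After either setup, a quick check can confirm that the packages are visible to the active environment. The snippet below is a small sketch using only the standard library. It assumes the import names match the package names used in the install commands above, plus `ipykernel`, which the jupyter step relies on; `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec
from typing import List

# Assumed import names, taken from the install commands above.
REQUIRED = ["sktime", "seaborn", "pmdarima", "ipykernel"]


def missing_packages(names: List[str]) -> List[str]:
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("missing packages:", missing or "none")
```

Run it with the python interpreter of the activated environment. If any package is reported as missing, repeat the install step for that environment.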

Contributors

aiwalter, benheid, fkiraly, miraep8
