InceptionTime SageMaker Algorithm

Home Page: https://aws.amazon.com/marketplace/pp/prodview-omz7rumnllmla
The Time Series Classification (Inception) Algorithm from AWS Marketplace performs time series classification with the InceptionTime model. It implements both training and inference from CSV data and supports both CPU and GPU instances. The training and inference Docker images were built by extending the PyTorch 2.1.0 (Python 3.10) SageMaker containers. The algorithm can be used for binary, multiclass and multilabel classification of both univariate and multivariate time series.

Model Description

InceptionTime is an ensemble model. Each model in the ensemble has the same architecture and uses the same hyperparameters. The only difference between the models is in the initial values of the weights, which are sampled from the Glorot uniform distribution.

Each model consists of a stack of blocks, where each block includes three convolutional layers with kernel sizes of 10, 20 and 40 and a max pooling layer. The block input is processed by the four layers in parallel, and the four outputs are concatenated before being passed to a batch normalization layer followed by a ReLU activation.
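
As a concrete illustration, a minimal PyTorch sketch of such a block is shown below. The layer names and the pooling parameters are assumptions, not taken from the source; the pointwise convolution on the pooling branch follows the reference implementation of InceptionTime, so that the four branch outputs have matching widths when concatenated.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Three parallel convolutions (kernel sizes 10, 20, 40) plus a max
    pooling branch, concatenated and passed through batch normalization
    and a ReLU activation."""

    def __init__(self, in_channels: int, filters: int):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(in_channels, filters, kernel_size=k, padding="same", bias=False)
            for k in (10, 20, 40)
        ])
        # pooling branch; the 1x1 convolution aligns its width with the others
        self.pool = nn.Sequential(
            nn.MaxPool1d(kernel_size=3, stride=1, padding=1),
            nn.Conv1d(in_channels, filters, kernel_size=1, bias=False),
        )
        self.bn = nn.BatchNorm1d(4 * filters)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time)
        branches = [conv(x) for conv in self.convs] + [self.pool(x)]
        return self.relu(self.bn(torch.cat(branches, dim=1)))
```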

A residual connection is applied between the input time series and the output of the second block, and after that between every three blocks. The residual connection processes the inputs using an additional convolutional layer with a kernel size of 1 followed by a batch normalization layer. The processed inputs are then added to the output, which is transformed by a ReLU activation.

The output of the last block is passed to an average pooling layer which removes the time dimension, and then to a final linear layer.
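
Putting the pieces together, a single ensemble member might look as follows. This is a sketch building on the InceptionBlock listing above; the residual-segment bookkeeping and layer names are assumptions, not taken from the source.

```python
class InceptionNetwork(nn.Module):
    """A stack of inception blocks with a residual connection after the
    second block and every three blocks thereafter, followed by average
    pooling over time and a final linear layer."""

    def __init__(self, in_channels: int, filters: int, depth: int, num_classes: int):
        super().__init__()
        width = 4 * filters  # output channels of each block
        self.blocks = nn.ModuleList([
            InceptionBlock(in_channels if d == 0 else width, filters)
            for d in range(depth)
        ])
        # one shortcut (1x1 convolution + batch norm) per residual segment
        self.shortcuts = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_channels if d == 1 else width, width, kernel_size=1, bias=False),
                nn.BatchNorm1d(width),
            )
            for d in range(depth) if d % 3 == 1
        ])
        self.head = nn.Linear(width, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual, s = x, 0
        for d, block in enumerate(self.blocks):
            x = block(x)
            if d % 3 == 1:  # end of a residual segment
                x = torch.relu(x + self.shortcuts[s](residual))
                residual, s = x, s + 1
        x = x.mean(dim=-1)   # average pooling removes the time dimension
        return self.head(x)  # logits for the final linear layer
```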

At inference time, the class probabilities predicted by the different models are averaged in order to obtain a single predicted probability and, therefore, a single predicted label, for each class.
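
For example, the averaging step could be sketched as follows, assuming a multiclass task (hence the softmax); a multilabel task would use sigmoids and per-class thresholds instead.

```python
import torch

def ensemble_predict(models, x):
    """Average the class probabilities of the ensemble members to obtain
    one probability, and hence one label, per sample."""
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for model in models])
    mean_probs = probs.mean(dim=0)      # shape (batch, classes)
    labels = mean_probs.argmax(dim=-1)  # predicted class per sample
    return mean_probs, labels
```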

InceptionTime architecture (source: doi: 10.1007/s10618-020-00710-y)

Model Resources: [Paper] [Code]

SageMaker Algorithm Description

The algorithm implements the model as described above, with one exception: the initial values of the weights are not sampled from the Glorot uniform distribution but are determined using PyTorch's default initialization method.

Training

The training algorithm has two input data channels: training and validation. The training channel is mandatory, while the validation channel is optional.

The training and validation datasets should be provided as CSV files. The column names of the one-hot encoded class labels should start with "y" (e.g. "y1", "y2", ...), while the column names of the time series values should start with "x" (e.g. "x1", "x2", ...).

The CSV file should contain unique sample identifiers in a column named "sample", and unique feature identifiers in a column named "feature". The feature identifiers are used to determine the different dimensions of multivariate time series. When using univariate time series, the feature identifiers can be set to a constant value.

All the time series should have the same length and the same number of dimensions, and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale the time series beforehand.
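
For example, a two-sample, two-dimensional training set with three time steps could be laid out as follows. This is a sketch: the identifier values are illustrative, and it is assumed here that the label columns are repeated on each feature row of a sample.

```python
import pandas as pd

train = pd.DataFrame({
    "sample":  [0, 0, 1, 1],              # unique sample identifiers
    "feature": ["f1", "f2", "f1", "f2"],  # one row per time series dimension
    "y1": [1, 1, 0, 0],                   # one-hot encoded class labels
    "y2": [0, 0, 1, 1],
    "x1": [0.11, 1.30, 0.46, 2.11],       # values at the first time step
    "x2": [0.52, 1.17, 0.61, 1.93],       # values at the second time step
    "x3": [0.99, 1.42, 0.58, 2.04],       # values at the third time step
})
train.to_csv("train.csv", index=False)
```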

See the sample input files train.csv and valid.csv.

See notebook.ipynb for an example of how to launch a training job.
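
A minimal sketch of such a job with the SageMaker Python SDK is shown below. The algorithm ARN, S3 paths, instance type and hyperparameter values are placeholders; the hyperparameters themselves are documented in the Hyperparameters section below, and the exact settings are in notebook.ipynb.

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

estimator = AlgorithmEstimator(
    algorithm_arn="arn:aws:sagemaker:<region>:<account>:algorithm/<name>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    hyperparameters={"task": "multiclass", "models": 5, "epochs": 100},
)

estimator.fit({
    "training": "s3://<bucket>/train.csv",
    "validation": "s3://<bucket>/valid.csv",  # optional channel
})
```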

Distributed Training

The algorithm supports multi-GPU training on a single instance, which is implemented through torch.nn.DataParallel. It does not support multi-node (distributed) training across multiple instances.

Incremental Training

The algorithm supports incremental training. The model artifacts generated by a previous training job can be used to continue training the model on the same dataset or to fine-tune the model on a different dataset.
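
With the SageMaker Python SDK, previous artifacts could be passed to a new job along the following lines. This is only a sketch: it assumes the algorithm reads the artifacts from the standard model channel, which notebook.ipynb should be consulted to confirm.

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

estimator = AlgorithmEstimator(
    algorithm_arn="arn:aws:sagemaker:<region>:<account>:algorithm/<name>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    # artifacts of the previous training job to continue from (assumed to be
    # consumed through the standard "model" channel)
    model_uri="s3://<bucket>/<previous-job>/output/model.tar.gz",
    model_channel_name="model",
)
estimator.fit({"training": "s3://<bucket>/train.csv"})
```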

Hyperparameters

The training algorithm takes as input the following hyperparameters:

  • filters: int. The number of filters of each model in the ensemble.
  • depth: int. The number of blocks of each model in the ensemble.
  • models: int. The number of models in the ensemble.
  • lr: float. The learning rate used for training.
  • batch-size: int. The batch size used for training.
  • epochs: int. The number of training epochs.
  • task: str. The type of classification task, either "binary", "multiclass" or "multilabel".

All the hyperparameters are tunable, except for the type of classification task, which needs to be defined beforehand.
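
For instance, the hyperparameters could be set on the estimator from the training sketch above as follows. The values are illustrative only; note that the hyphenated batch-size name has to be passed via dictionary unpacking.

```python
estimator.set_hyperparameters(
    filters=32,
    depth=6,
    models=5,
    lr=0.001,
    epochs=100,
    task="multiclass",
    **{"batch-size": 64},  # hyphens are not valid in Python keyword names
)
```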

Metrics

The training algorithm logs the following metrics:

  • train_loss: float. Training loss.
  • train_accuracy: float. Training accuracy.

If the validation channel is provided, the training algorithm also logs the following additional metrics:

  • valid_loss: float. Validation loss.
  • valid_accuracy: float. Validation accuracy.

See notebook.ipynb for an example of how to launch a hyperparameter tuning job.
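
A sketch of a tuning job over the learning rate and batch size, reusing the estimator from the training sketch above, might look like this. The objective metric name follows the validation metrics listed above; the ranges and job counts are illustrative.

```python
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

tuner = HyperparameterTuner(
    estimator=estimator,  # the AlgorithmEstimator from the training sketch
    objective_metric_name="valid_accuracy",
    objective_type="Maximize",
    hyperparameter_ranges={
        "lr": ContinuousParameter(1e-4, 1e-2),
        "batch-size": IntegerParameter(16, 128),
    },
    max_jobs=10,
    max_parallel_jobs=2,
)

tuner.fit({
    "training": "s3://<bucket>/train.csv",
    "validation": "s3://<bucket>/valid.csv",
})
```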

Inference

The inference algorithm takes as input a CSV file containing the time series values. The column names of the time series values should start with "x" (e.g. "x1", "x2", ...).

The CSV file should contain unique sample identifiers in a column named "sample", and unique feature identifiers in a column named "feature". The feature identifiers are used to determine the different dimensions of multivariate time series. When using univariate time series, the feature identifiers can be set to a constant value. The feature identifiers used for inference should match the ones used for training.

All the time series should have the same length and the same number of dimensions, and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale the time series beforehand.

See the sample input file test_data.csv.

The inference algorithm outputs the predicted class labels and the predicted class probabilities, which are returned in CSV format.

See the sample output files batch_predictions.csv and real_time_predictions.csv.

See notebook.ipynb for an example of how to launch a batch transform job.
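
A sketch of a batch transform job, reusing the estimator from the training sketch above, is shown below; the output path and instance type are illustrative.

```python
transformer = estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    output_path="s3://<bucket>/predictions/",
)

transformer.transform(
    data="s3://<bucket>/test_data.csv",
    content_type="text/csv",
)
transformer.wait()
```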

Endpoints

The algorithm supports only real-time inference endpoints. The inference image is too large to be deployed to a serverless inference endpoint.

See notebook.ipynb for an example of how to deploy the model to an endpoint, invoke the endpoint and process the response.
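
A minimal sketch of deploying and invoking an endpoint, again reusing the estimator from the training sketch above, is given below; the instance type is illustrative.

```python
from sagemaker.deserializers import CSVDeserializer
from sagemaker.serializers import CSVSerializer

predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.2xlarge",
    serializer=CSVSerializer(),
    deserializer=CSVDeserializer(),
)

# send the CSV payload and read back the predicted labels and probabilities
with open("test_data.csv") as f:
    predictions = predictor.predict(f.read())

predictor.delete_endpoint()
```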

Additional Resources: [Sample Notebook] [Blog post]

References

  • H. Ismail Fawaz, B. Lucas, G. Forestier, C. Pelletier, D.F. Schmidt, J. Weber, G.I. Webb, L. Idoumghar, P.A. Muller and F. Petitjean, "InceptionTime: Finding AlexNet for Time Series Classification," Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936-1962, 2020, doi: 10.1007/s10618-020-00710-y.
