cuTimeWarp

CUDA C++ implementations of Dynamic Time Warping and the Soft-DTW loss function for time series machine learning. Course project for CSS535 HPC.

Based on algorithms described in Cuturi and Blondel, "Soft-DTW: a Differentiable Loss Function for Time-Series" (ICML 2017), among others.

Building

This project uses a Makefile to coordinate separate compilation of CUDA kernels and C++ code and is tested on Ubuntu Linux. Typing make will list the available commands:

$ make

Available rules:

build               Build binaries
clean               Delete binaries
fmt                 Format the code with clang-format
plot                Run python script to generate plots
report              Compile the PDF report
run                 Run experiments
run_multi           Run multi-distance experiments
test                Build and run unit tests

To compile the kernels and the test programs, use the make build command.

All C++ / CUDA source code is found in the src/ folder.

Library Dependencies

In addition to the CUDA runtime and cuBLAS (tested with CUDA 11.2), the programs link against BLAS for the CPU implementations, so a BLAS library such as OpenBLAS must be available on the machine.
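
One place a BLAS routine typically appears in this kind of code is in building the pairwise squared Euclidean distance matrix, since the cross term can be written as a matrix product: D[i][j] = x[i]^2 + y[j]^2 - 2*x[i]*y[j]. The sketch below is only an illustration of that pattern; the function name and data layout are assumptions, not the project's actual API.

// Illustrative sketch only (hypothetical function, not this project's API):
// forming the m x n squared Euclidean distance matrix between two
// univariate series, with one rank-1 cblas_dgemm call for the cross term.
#include <cblas.h>
#include <vector>

std::vector<double> sq_euclid_dist(const std::vector<double> &x,
                                   const std::vector<double> &y)
{
    const int m = static_cast<int>(x.size());
    const int n = static_cast<int>(y.size());
    std::vector<double> dist(static_cast<size_t>(m) * n);
    // Squared-norm terms x[i]^2 + y[j]^2.
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            dist[i * n + j] = x[i] * x[i] + y[j] * y[j];
    // Subtract the cross term 2 * x * y^T (K = 1, so this is a rank-1 update).
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasTrans, m, n, 1, -2.0,
                x.data(), 1, y.data(), 1, 1.0, dist.data(), n);
    return dist;
}

Handing the cross term to BLAS keeps the bulk of the O(m*n) arithmetic inside an optimized GEMM rather than a hand-written loop.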

Running

The three programs to use for running comparative performance experiments are:

  • bin/soft_dtw_perf_cpu for timing CPU performance
  • bin/soft_dtw_perf_multi for timing GPU performance
  • bin/soft_dtw_perf_tiled for timing the tiled GPU kernel (for long time series of length > 1024)

The programs accept as arguments either a filename containing space-delimited data (see data/ECG200/ECG200_ALL.txt for an example) or the word random followed by a time series length and count. Each program computes the Soft-DTW dissimilarity between all pairs of time series in the batch and then prints output in four columns:

  • Kernel function name
  • The input time series length (number of columns per row)
  • The input time series count (number of rows)
  • The execution time in microseconds

Example:

$ ./bin/soft_dtw_perf_multi
Usage: ./bin/soft_dtw_perf_multi [INPUT_FILENAME] | random [length] [count]

$  ./bin/soft_dtw_perf_multi ./data/ECG200/ECG200_ALL.txt
Data file ./data/ECG200/ECG200_ALL.txt contains 200 time series of length 96
sq_euclid_dist_multi 96 200 515037
softdtw_cuda_naive_multi 96 200 264987
softdtw_cuda_naive_multi_bw_80 96 200 235089
softdtw_cuda_naive_multi_bw_60 96 200 168621
softdtw_cuda_naive_multi_bw_40 96 200 83501
softdtw_cuda_naive_multi_bw_20 96 200 51338
softdtw_cuda_stencil_multi 96 200 100990
softdtw_cuda_stencil_multi_80 96 200 100408
softdtw_cuda_stencil_multi_60 96 200 100844
softdtw_cuda_stencil_multi_40 96 200 101215
softdtw_cuda_stencil_multi_40 96 200 100436
softdtw_cuda_stencil_multi_20 96 200 100647
convert_diagonal_multi 96 200 332664
softdtw_cuda_diagonal_multi 96 200 149158

$ ./bin/soft_dtw_perf_multi random 100 100
sq_euclid_dist_multi 100 100 335883
softdtw_cuda_naive_multi 100 100 61576
softdtw_cuda_naive_multi_bw_80 100 100 52272
softdtw_cuda_naive_multi_bw_60 100 100 32211
softdtw_cuda_naive_multi_bw_40 100 100 18919
softdtw_cuda_naive_multi_bw_20 100 100 18725
softdtw_cuda_stencil_multi 100 100 26558
softdtw_cuda_stencil_multi_80 100 100 25803
softdtw_cuda_stencil_multi_60 100 100 31000
softdtw_cuda_stencil_multi_40 100 100 26120
softdtw_cuda_stencil_multi_40 100 100 25804
softdtw_cuda_stencil_multi_20 100 100 30992
convert_diagonal_multi 100 100 87427
softdtw_cuda_diagonal_multi 100 100 43893
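
The Soft-DTW dissimilarity reported by these kernels is defined by the soft-min recurrence of Cuturi and Blondel. The CPU sketch below, for a single pair of series, is a simplified illustration of that recurrence rather than the project's actual kernels or function signatures; the CUDA variants listed above compute the same quantity for every pair in the batch, typically parallelizing over the batch and over anti-diagonals of the alignment matrix.

// Simplified CPU sketch of the Soft-DTW recurrence for one pair of series.
// D is the m x n matrix of pairwise squared Euclidean distances and gamma
// is the smoothing parameter; function names here are illustrative only.
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Numerically stable soft minimum of three values with smoothing gamma > 0.
static double softmin(double a, double b, double c, double gamma)
{
    a /= -gamma;
    b /= -gamma;
    c /= -gamma;
    const double max_val = std::max({a, b, c});
    const double sum = std::exp(a - max_val) + std::exp(b - max_val) +
                       std::exp(c - max_val);
    return -gamma * (std::log(sum) + max_val);
}

double softdtw_cpu(const std::vector<double> &D, int m, int n, double gamma)
{
    const double inf = std::numeric_limits<double>::infinity();
    // R is (m+1) x (n+1) with an extra top row and left column of infinity.
    std::vector<double> R((m + 1) * (n + 1), inf);
    R[0] = 0.0;
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= n; ++j)
            R[i * (n + 1) + j] =
                D[(i - 1) * n + (j - 1)] +
                softmin(R[(i - 1) * (n + 1) + (j - 1)], // match
                        R[(i - 1) * (n + 1) + j],       // insertion
                        R[i * (n + 1) + (j - 1)],       // deletion
                        gamma);
    return R[m * (n + 1) + n]; // Soft-DTW dissimilarity for this pair
}

The kernel variants listed above (naive, stencil, tiled, diagonal) compute this same quantity with different parallelization strategies and memory layouts.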

TODO List

  • Implement naive DTW on CPU
  • Implement soft DTW on CPU
  • Choose benchmarking datasets
  • Implement pairwise squared Euclidean distance on CPU
  • Implement soft DTW gradient on CPU
  • Implement soft DTW barycenter estimation on CPU
  • Implement naive soft DTW in CUDA
  • Implement pairwise squared Euclidean distance in CUDA
  • Implement soft DTW gradient in CUDA
  • Implement soft DTW barycenter estimation in CUDA
  • Tiling
  • Shared memory stencil
  • Sakoe-Chiba bands
  • Contiguous diagonal-major array storage layout
  • Run benchmark experiments
  • Analysis of experiment results
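
The Sakoe-Chiba bands item above (and, presumably, the bw_NN kernel variants in the performance output) restricts the recurrence to a band of cells around the main diagonal of the alignment matrix, which would explain why the banded timings shrink as the band narrows. The predicate below is only a hedged illustration of such a constraint; the project's exact banding rule and the meaning of the NN suffix are assumptions here.

// Illustrative Sakoe-Chiba band predicate (an assumption about how the
// bw_NN variants restrict the recurrence; the project's exact rule may
// differ). Cell (i, j) is computed only if it lies within `band` columns
// of the (scaled) main diagonal; for equal-length series this reduces to
// |i - j| <= band.
#include <cmath>

bool in_sakoe_chiba_band(int i, int j, int m, int n, int band)
{
    const double diag_j = static_cast<double>(i) * n / m; // diagonal column for row i
    return std::fabs(j - diag_j) <= band;
}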
