kdd-opensource / deepadots
Repository of the paper "A Systematic Evaluation of Deep Anomaly Detection Methods for Time Series".
License: MIT License
Extend/wrap the data generator such that we can easily vary:
2018-06-13 06:56:47 [ERROR] src.evaluation.evaluator: An exception occured while training Donut on Syn Extreme Outliers (mis=1.0): `std` must be positive
2018-06-13 06:56:47 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "../src/evaluation/evaluator.py", line 67, in evaluate
det.fit(X_train, y_train)
File "../src/algorithms/donut.py", line 183, in fit
trainer.fit(features, labels, missing, mean, std)
File "../src/algorithms/donut.py", line 73, in fit
aug = MissingDataInjection(mean, std, self._missing_data_injection_rate)
File "/home/maxi/.local/lib/python3.6/site-packages/donut/augmentation.py", line 81, in __init__
super(MissingDataInjection, self).__init__(mean, std)
File "/home/maxi/.local/lib/python3.6/site-packages/donut/augmentation.py", line 19, in __init__
raise ValueError('`std` must be positive')
ValueError: `std` must be positive
On current master
2018-06-22 10:41:54 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withoutWindow on Synthetic Combined Outliers: Lapack Error getrf : U(3,3) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
2018-06-22 10:41:54 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/home/willi/Documents/MP-2018/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 196, in fit
self.dagmm_step(input_data.float())
File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 172, in dagmm_step
self.lambda_cov_diag)
File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 141, in loss_function
sample_energy, cov_diag = self.compute_energy(z, phi, mu, cov)
File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 107, in compute_energy
cov_inverse.append(torch.inverse(cov_k).unsqueeze(0))
RuntimeError: Lapack Error getrf : U(3,3) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
Currently, we're showing accuracy, precision, recall, ... in the evaluation tables of detectors on a dataset. Additionally, display the ROC-AUC score for an easier comparison of detectors over multiple datasets.
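As a rough illustration of the score requested above, ROC-AUC can be computed from anomaly scores and binary labels with the rank-sum (Mann-Whitney U) formulation. This is a minimal standard-library sketch, not the repository's evaluator code; the function name `roc_auc` and the example detector names are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: sequence of 0/1 ground-truth values (1 = anomaly).
    scores: sequence of anomaly scores, higher = more anomalous.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")

    # Sum the ranks of the positive samples; tied scores share their average rank.
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Hypothetical scores of two detectors on the same labelled test set.
y_test = [0, 0, 1, 1]
detector_scores = {
    "Detector A": [0.1, 0.4, 0.35, 0.8],
    "Detector B": [0.2, 0.1, 0.7, 0.9],
}
for name, scores in detector_scores.items():
    print(f"{name}: ROC-AUC = {roc_auc(y_test, scores):.2f}")
```

Because the score is threshold-independent, a single column per detector is enough to compare results across datasets.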
Go go Pytorch magician @xasetl
A specified set of algorithms should be evaluated on given data sets.
Currently sets _data to a pd.DataFrame; it needs to be a tuple of (pd.DataFrame, pd.Series, pd.DataFrame, pd.Series) for X_train, y_train, X_test, y_test.
Get inspired by Lukas' implementation
2018-06-21 07:12:00 [ERROR] src.evaluation.evaluator: An exception occurred while training Recurrent EBM on Synthetic Variance Outliers: Failed to create session.
2018-06-21 07:12:00 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/repo/src/algorithms/rnn_ebm.py", line 43, in fit
self.tf_session = tf.Session()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1560, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 633, in __init__
self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
We want to find out which addition makes DAGMM perform better or worse in which dataset setting. In the end, this should help us explain exactly which contribution improves our results.
This is the output of running Donut on the new multivariate outliers (PR #72):
2018-06-13 01:05:21 [ERROR] src.evaluation.evaluator: An exception occured while training Donut on Synthetic Multivariate Outliers: The shape of ``arrays[1]`` does not agree with the shape of `timestamp` ((1000, 1) vs (1000,))
2018-06-13 01:05:21 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "../src/evaluation/evaluator.py", line 67, in evaluate
det.fit(X_train, y_train)
File "../src/algorithms/donut.py", line 160, in fit
timestamps, missing, (features, labels) = complete_timestamp(timestamps, (features, labels))
File "/home/maxi/.local/lib/python3.6/site-packages/donut/preprocessing.py", line 36, in complete_timestamp
format(i, array.shape, timestamp.shape))
ValueError: The shape of ``arrays[1]`` does not agree with the shape of `timestamp` ((1000, 1) vs (1000,))
2018-06-21 07:18:44 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Synthetic Variance Outliers: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
2018-06-21 07:18:44 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/repo/src/algorithms/dagmm.py", line 192, in fit
self.dagmm.cuda()
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 249, in cuda
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 176, in _apply
module._apply(fn)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 176, in _apply
module._apply(fn)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 111, in _apply
ret = super(RNNBase, self)._apply(fn)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 182, in _apply
param.data = fn(param.data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 249, in <lambda>
return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
Having something like
reports/figures/roc_Synthetic\ Extreme\ Outliers-1-1-2018-06-06-081700.pdf
would be easier to interpret and keep track of than
reports/figures/roc_Synthetic\ Extreme\ Outliers-1-1-1528265710.pdf
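One way to get the readable form is to format the Unix timestamp before appending it to the file name. The sketch below uses only the standard library; the helper names `figure_stamp` and `figure_path` are hypothetical, and the choice of UTC is an assumption (the example name above suggests local time was intended).

```python
from datetime import datetime, timezone


def figure_stamp(unix_time, tz=timezone.utc):
    """Format a Unix timestamp as YYYY-MM-DD-HHMMSS in the given time zone."""
    return datetime.fromtimestamp(unix_time, tz=tz).strftime("%Y-%m-%d-%H%M%S")


def figure_path(stem, unix_time, tz=timezone.utc):
    """Build a report file name such as '<stem>-2018-06-06-061510.pdf'."""
    return f"reports/figures/{stem}-{figure_stamp(unix_time, tz)}.pdf"


print(figure_path("roc_Synthetic Extreme Outliers-1-1", 1528265710))
```

Passing `tz=None` makes `datetime.fromtimestamp` use the machine's local time zone instead of UTC.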
On the current master:
Traceback (most recent call last):
File "main.py", line 118, in <module>
main()
File "main.py", line 17, in main
run_experiments()
File "main.py", line 88, in run_experiments
steps=1)
File "/repo/experiments.py", line 49, in run_extremes_experiment
evaluator.plot_auroc(title='Area under the curve for differing outlier heights')
File "/repo/src/evaluation/evaluator.py", line 220, in plot_auroc
aurocs = self.benchmark_results[self.benchmark_results['algorithm'] == det.name]['auroc']
TypeError: 'NoneType' object is not subscriptable
See current master:
2018-06-09 18:24:26 [ERROR] src.evaluation.evaluator: An exception occured while training LSTM-Enc-Dec on Synthetic Extreme Outliers: 'LSTM_Enc_Dec' object has no attribute 'seed'
2018-06-09 18:24:26 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/home/circleci/repo/src/evaluation/evaluator.py", line 46, in evaluate
det.fit(X_train, y_train)
File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 100, in fit
self._fit(train_timeseries_dataset)
File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 193, in _fit
self._save_checkpoint(epoch, self.best_val_loss, means=means, covs=covs)
File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 218, in _save_checkpoint
'seed': self.seed,
AttributeError: 'LSTM_Enc_Dec' object has no attribute 'seed'
2018-06-21 07:11:59 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Synthetic Extreme Outliers: CUDNN_STATUS_EXECUTION_FAILED
2018-06-21 07:11:59 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/repo/src/algorithms/dagmm.py", line 199, in fit
self.dagmm_step(input_data.float())
File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
enc, dec, z, gamma = self.dagmm(input_data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/dagmm.py", line 48, in forward
dec, enc = self.autoencoder(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/autoencoder.py", line 77, in forward
_, enc_hidden = self.encoder(ts_batch.float(), enc_hidden) # .float() here or .double() for the model
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
output, hidden = func(input, self.all_weights, hx, batch_sizes)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 323, in forward
return func(input, *fargs, **fkwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 287, in forward
dropout_ts)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED
Verify that the implemented model works like the implementation from the paper.
Traceback (most recent call last):
File "main.py", line 123, in <module>
main()
File "main.py", line 17, in main
run_experiments()
File "main.py", line 93, in run_experiments
detectors = [RecurrentEBM(num_epochs=15), LSTMAD(), Donut(), LSTM_Enc_Dec(num_epochs=15),
File "/repo/src/algorithms/lstm_ad.py", line 53, in __init__
torch.manual_seed(0)
File "/usr/local/lib/python3.6/dist-packages/torch/random.py", line 33, in manual_seed
torch.cuda.manual_seed_all(seed)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/random.py", line 86, in manual_seed_all
_lazy_call(lambda: _C._cuda_manualSeedAll(seed))
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py", line 121, in _lazy_call
callable()
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/random.py", line 86, in <lambda>
_lazy_call(lambda: _C._cuda_manualSeedAll(seed))
RuntimeError: Creating MTGP constants failed. at /pytorch/aten/src/THC/THCTensorRandom.cu:34
See current master
File "main.py", line 101, in <module>
main()
File "main.py", line 12, in main
run_pipeline()
File "main.py", line 41, in run_pipeline
evaluator.evaluate()
File "/home/circleci/repo/src/evaluation/evaluator.py", line 42, in evaluate
(X_train, y_train, X_test, y_test) = ds.data()
File "/home/circleci/repo/src/datasets/dataset.py", line 29, in data
self.load()
File "/home/circleci/repo/src/datasets/synthetic_dataset.py", line 42, in load
y_test = self._label_outliers(self.outlier_config)[train_split_point:]
File "/home/circleci/repo/src/datasets/synthetic_dataset.py", line 50, in _label_outliers
for ts in outlier['timestamps']:
KeyError: 'timestamps'
2018-06-13 06:56:46 [ERROR] root: Couldn't take the inverse of cov. Maybe singular?
2018-06-13 06:56:46 [ERROR] src.evaluation.evaluator: An exception occured while training LSTM-Enc-Dec on Syn Extreme Outliers (mis=1.0): Lapack Error getrf : U(5,5) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
2018-06-13 06:56:46 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "../third_party/lstm_enc_dec/anomalyDetector.py", line 84, in anomalyScore
mult2 = torch.inverse(cov) # [ prediction_window_size * prediction_window_size ]
RuntimeError: Lapack Error getrf : U(2,2) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "../src/evaluation/evaluator.py", line 68, in evaluate
score = det.predict(X_test)
File "../src/algorithms/lstm_enc_dec.py", line 109, in predict
channels_scores = self.predict_channel_scores(X_test)
File "../src/algorithms/lstm_enc_dec.py", line 105, in predict_channel_scores
channels_scores, _ = self._predict(test_timeseries_dataset)
File "../src/algorithms/lstm_enc_dec.py", line 215, in _predict
self.data, self.filename)
File "../third_party/lstm_enc_dec/anomaly_detection.py", line 88, in calc_anomalies
score_predictor=score_predictor, channel_idx=channel_idx,
File "../third_party/lstm_enc_dec/anomalyDetector.py", line 91, in anomalyScore
mult2 = torch.inverse(cov) # [ prediction_window_size * prediction_window_size ]
RuntimeError: Lapack Error getrf : U(5,5) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
2018-06-21 07:50:55 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.0): parameter types mismatch
2018-06-21 07:50:55 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/repo/src/algorithms/dagmm.py", line 199, in fit
self.dagmm_step(input_data.float())
File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
enc, dec, z, gamma = self.dagmm(input_data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/dagmm.py", line 48, in forward
dec, enc = self.autoencoder(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/autoencoder.py", line 77, in forward
_, enc_hidden = self.encoder(ts_batch.float(), enc_hidden) # .float() here or .double() for the model
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
output, hidden = func(input, self.all_weights, hx, batch_sizes)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 323, in forward
return func(input, *fargs, **fkwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 287, in forward
dropout_ts)
RuntimeError: parameter types mismatch
As discussed in the previous meeting, it might be possible to discard data from training based on certain entropy values to increase the robustness of the algorithm to noise in the training data.
Implement the LSTM Encoder-Decoder and verify that the built model works like the implementation from the paper.
Verify that the implemented model works like the implementation from the paper.
2018-06-28 11:03:43 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 72, in evaluate
score = det.predict(X_test)
File "/repo/src/algorithms/lstm_ad.py", line 99, in predict
scores = -multivariate_normal.logpdf(norm, mean=self.mean, cov=self.cov, allow_singular=True)
File "/usr/local/lib/python3.6/dist-packages/scipy/stats/_multivariate.py", line 487, in logpdf
psd = _PSD(cov, allow_singular=allow_singular)
File "/usr/local/lib/python3.6/dist-packages/scipy/stats/_multivariate.py", line 152, in __init__
s, u = scipy.linalg.eigh(M, lower=lower, check_finite=check_finite)
File "/usr/local/lib/python3.6/dist-packages/scipy/linalg/decomp.py", line 374, in eigh
a1 = _asarray_validated(a, check_finite=check_finite)
File "/usr/local/lib/python3.6/dist-packages/scipy/_lib/_util.py", line 238, in _asarray_validated
a = toarray(a)
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py", line 1233, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
print() via flake8
2018-06-21 07:58:28 [INFO] src.evaluation.evaluator: Training DAGMM_NNAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.25)
2018-06-21 07:58:28 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_NNAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.25): Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'mat1'
2018-06-21 07:58:28 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
det.fit(X_train, y_train)
File "/repo/src/algorithms/dagmm.py", line 199, in fit
self.dagmm_step(input_data.float())
File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
enc, dec, z, gamma = self.dagmm(input_data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/dagmm.py", line 48, in forward
dec, enc = self.autoencoder(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/repo/src/algorithms/autoencoder.py", line 41, in forward
enc = self._encoder(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 55, in forward
return F.linear(input, self.weight, self.bias)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 992, in linear
return torch.addmm(bias, input, weight.t())
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'mat1'
Verify that the implemented model works like the implementation from the paper.
(venv3) ➜ MP-2018 git:(master) python main.py
Traceback (most recent call last):
File "main.py", line 5, in <module>
from src.datasets.synthetic_data_generator import SyntheticData
ImportError: cannot import name 'SyntheticData'
Verify that the implemented model works like the implementation from the paper.
/home/maxi/.local/lib/python3.6/site-packages/numpy/linalg/linalg.py:1874: RuntimeWarning: invalid value encountered in det
r = _umath_linalg.det(a, signature=signature)
2018-06-13 06:23:52 [ERROR] src.evaluation.evaluator: An exception occured while training DAGMM on Syn Extreme Outliers (mis=0.75): Threshold is NaN
2018-06-13 06:23:52 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
File "../src/evaluation/evaluator.py", line 68, in evaluate
score = det.predict(X_test)
File "../src/algorithms/dagmm.py", line 262, in predict
raise Exception("Threshold is NaN")
Exception: Threshold is NaN
/home/maxi/.local/lib/python3.6/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
interpolation=interpolation)
Generate proper tables and plots from the evaluation results.
While training Donut on Synthetic Shift Outliers an exception occured:gradient for model/donut/p_x_given_z/mean/dense/bias:0 has numeric issue : Tensor had NaN values [[Node: quiet_donut_trainer_9/CheckNumerics_13 = CheckNumerics[T=DT_FLOAT, message="gradient for model/donut/p_x_given_z/mean/dense/bias:0 has numeric issue", _device="/job:localhost/replica:0/task:0/device:CPU:0"](quiet_donut_trainer_9/clip_by_norm_13/truediv)]]
Since the algorithm only supports univariate datasets, apply it independently to each feature and aggregate the anomaly scores using the maximum.
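The per-feature approach described above could look roughly like the following sketch. The scorer `univariate_scores` is only a hypothetical stand-in (absolute z-scores) for the actual univariate detector, and `max_aggregated_scores` is an illustrative name, not a function from the repository.

```python
from statistics import mean, pstdev


def univariate_scores(series):
    """Hypothetical stand-in for a univariate detector: absolute z-scores."""
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # fall back to 1.0 for constant series
    return [abs(x - mu) / sigma for x in series]


def max_aggregated_scores(rows, score_fn=univariate_scores):
    """Score each feature independently, then take the maximum per time step.

    rows: list of equally long feature vectors, one per time step.
    """
    columns = list(zip(*rows))  # transpose: one sequence per feature
    per_feature = [score_fn(list(col)) for col in columns]
    return [max(step) for step in zip(*per_feature)]


# Two features; only the second one contains an outlier at time step 2.
X = [[0, 0], [0, 0], [0, 10], [0, 0]]
scores = max_aggregated_scores(X)
print(scores.index(max(scores)))  # index of the most anomalous time step
```

Taking the maximum flags a time step as soon as any single feature looks anomalous, at the cost of ignoring anomalies that only show up in the joint behaviour of several features.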
High-Dimensional Data on agots Types
Missing on other agots Types
High-Dimensional Multivariate