This is the companion repository for BINet, a neural network architecture for multi-perspective anomaly detection and classification in business process event logs. BINet was originally proposed in [3] and then extended in [4]. The repository also contains implementations of all methods mentioned in [3, 4], including the DAE method from [1, 2].
All results can be reproduced using the notebooks in the `notebooks` directory.
The easiest way to set up an environment is to use Miniconda.
- Install Miniconda (make sure to use a Python 3 version).
- After setting up Miniconda, you can use the `conda` command in your command line (PowerShell, CMD, Bash).
- We suggest that you set up a dedicated environment for this project by running `conda env create -f environment.yml`.
  - This will set up a virtual conda environment with all necessary dependencies.
  - If your device has a GPU, replace `tensorflow` with `tensorflow-gpu` in the `environment.yml`.
- Depending on your operating system, you can activate the virtual environment with `conda activate binet` on Linux and macOS, and `activate binet` on Windows (`cmd` only).
- If you want to make use of a GPU, you must install the CUDA Toolkit. To install the CUDA Toolkit on your computer, refer to the TensorFlow installation guide.
- If you want to quickly install the `april` package, run `pip install -e .` inside the root directory.
- Now you can start the notebook server with `jupyter notebook notebooks`.
Note: To use the graph plotting methods, you will have to install Graphviz.
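After completing the steps above, a quick sanity check can confirm that the environment is usable. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it assumes that the graph plotting methods rely on the Graphviz `dot` executable being available on your `PATH`. It only checks whether the `tensorflow` and `april` packages can be located, without importing them.

```python
import importlib.util
import shutil


def check_environment(packages=("tensorflow", "april"), executables=("dot",)):
    """Report which Python packages and command-line tools can be found.

    The default names are assumptions based on the setup steps: the
    `tensorflow` and `april` packages, and Graphviz's `dot` executable.
    """
    status = {}
    for package in packages:
        try:
            # find_spec locates a top-level package without importing it.
            status[package] = importlib.util.find_spec(package) is not None
        except (ImportError, ValueError):
            status[package] = False
    for executable in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[executable] = shutil.which(executable) is not None
    return status


for name, available in check_environment().items():
    print(f"{name}: {'found' if available else 'missing'}")
```

Run this inside the activated `binet` environment. An entry reported as missing points to the corresponding setup step.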
To illustrate the findings in [4], this repository contains Jupyter notebooks.
The notebooks are named according to the sections in the paper.
Notebooks with `A` in the name contain additional material that is not included in the papers.
The code to reproduce the figures in the paper can be found inside the notebooks.
All necessary files to reproduce the results are also included in the repository.
- Introduction
- Related Work
- Datasets
  - 3.1 Example Process
  - 3.2 Dataset Information
  - 3.A1 Generation Algorithm
    - Describes how the likelihood graph generation algorithm works and how it can be used.
  - 3.A2 Dataset Generation
    - Generates the same data corpus as used in the paper.
  - 3.A3 BPIC Datasets
    - Adds artificial anomalies to the BPIC datasets. The resulting datasets are the same as the ones used in the paper.
- Method
  - 4.1 Heuristics
  - 4.A1 Training
    - Trains and saves the anomaly detection models as used in the paper. For non-deterministic anomaly detectors, results might differ from the ones in the paper.
- Evaluation
  - 5.1 Best Strategy
  - 5.2 Best Heuristic
  - 5.3 Overall Evaluation
  - 5.A1 Evaluation Script
    - Evaluates all trained models and saves the results to a SQLite database.
  - 5.A2 Additional Evaluations
    - Miscellaneous evaluations, e.g., per perspective, per event attribute, etc.
  - 5.A3 ROC
    - Analysis of ROC and AUC
  - 5.A3 Hyperparameters
    - Test of different hyperparameters for BINet and t-STIDE+
- Classifying Anomalies
  - 6. Classification
    - Produces the heatmap visualization featured in the paper. Additionally, demonstrates how to use the `plot_heatmap` method.
- Conclusion
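
The evaluation script notebook (5.A1) saves its results to a SQLite database. The database location and schema are not described here, so the following is only a minimal sketch of how such a file could be inspected with Python's standard `sqlite3` module. The helper names `list_tables` and `preview_table`, the table `demo_results`, and its columns are made up for illustration and do not reflect the repository's actual schema.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all user-defined tables in a SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def preview_table(connection, table, limit=5):
    """Return the column names and the first `limit` rows of `table`."""
    # Table names cannot be bound as parameters, so validate them first.
    if table not in list_tables(connection):
        raise ValueError(f"unknown table: {table!r}")
    cursor = connection.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,))
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


# Demo on an in-memory database with made-up content (not real results).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE demo_results (model TEXT, f1 REAL)")
connection.execute("INSERT INTO demo_results VALUES ('example-model', 0.5)")

for table in list_tables(connection):
    columns, rows = preview_table(connection, table)
    print(table, columns, rows)
```

To inspect real results, replace `":memory:"` with the path of the database file written by the evaluation notebook and drop the two demo statements that create and fill `demo_results`.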
1. Nolle, T., Seeliger, A., Mühlhäuser, M.: Unsupervised Anomaly Detection in Noisy Business Process Event Logs Using Denoising Autoencoders, 2016
2. Nolle, T., Luettgen, S., Seeliger, A., Mühlhäuser, M.: Analyzing Business Process Anomalies Using Autoencoders, 2018
3. Nolle, T., Seeliger, A., Mühlhäuser, M.: BINet: Multivariate Business Process Anomaly Detection Using Deep Learning, 2018
4. Nolle, T., Luettgen, S., Seeliger, A., Mühlhäuser, M.: BINet: Multi-perspective Business Process Anomaly Classification, 2019
5. Nolle, T., Seeliger, A., Thoma, N., Mühlhäuser, M.: DeepAlign: Alignment-based Process Anomaly Correction Using Recurrent Neural Networks, 2020