xxl4tomxu98 / nasa-jet-engine-maintenance

ML Approaches for RUL Prediction, Anomaly Detection, Survival Analysis and Failure Classification

License: Apache License 2.0

nasa-jet-engine-maintenance's Introduction

Prognostic and Predictive Maintenance (PdM) of NASA Turbofan Jet Engine

Predictive Maintenance techniques are used to determine the condition of equipment so that maintenance can be planned ahead of failure. This is very useful, as overall equipment and downtime costs can be reduced significantly.

The objective of this project is to implement various Predictive Maintenance methods and assess the performance of each. The methods can be classified broadly into five categories.

  1. Classification: Predicting the failure of a machine within the next n days
  2. Regression: Predicting the remaining useful life (RUL) of a machine
  3. Feature Selection: Explaining the causes of abnormal behavior
  4. Anomaly Detection: Unsupervised learning
  5. Survival Analysis: Stochastic modeling of hazard or survival probabilities

Data

The data sets consist of multiple multivariate time series. Each time series is from a different engine, i.e., the data can be considered to be from a fleet of engines of the same type. You can find the data here.

The engine is operating normally at the start of each time series, and develops a fault at some point during the series. In the training set, the fault grows in magnitude until system failure. In the test set, the time series ends some time prior to system failure.
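Because every training trajectory runs until failure, regression and classification targets can be derived directly from the cycle counter. The following is a minimal sketch, not the project's own labeling code: the function name `rul_labels`, its arguments, and the idea of measuring the horizon in cycles rather than days are assumptions made for illustration.

```python
import numpy as np

def rul_labels(unit_ids, cycles, horizon=30):
    """Return (rul, fail_within_horizon) for run-to-failure trajectories."""
    unit_ids = np.asarray(unit_ids)
    cycles = np.asarray(cycles)
    rul = np.empty(len(cycles), dtype=np.int64)
    for unit in np.unique(unit_ids):
        mask = unit_ids == unit
        # The last recorded cycle of a training engine is its failure point.
        rul[mask] = cycles[mask].max() - cycles[mask]
    # Binary target: 1 if the engine fails within `horizon` cycles.
    fail_within_horizon = (rul <= horizon).astype(np.int64)
    return rul, fail_within_horizon

# Toy example: engine 1 runs for 3 cycles, engine 2 for 2 cycles.
units = [1, 1, 1, 2, 2]
cycles = [1, 2, 3, 1, 2]
rul, fail_soon = rul_labels(units, cycles, horizon=1)
```

This only applies to the training set. In the test set the series stops before failure, so the true RUL has to come from separately provided ground truth.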

Imbalanced class distribution is a common issue in many classification tasks. Training data often contains highly imbalanced positive and negative examples, even though this imbalance reflects the true class distribution in the general population. Predictive maintenance applications have the same issue: failure events are heavily outnumbered by normal operation events. There are two main reasons. First, for an in-service asset, failures occur rarely compared to normal operation. Second, few failure events are ever recorded, because the business cannot afford to let the asset run to failure at the cost of equipment damage and downtime.

The SMOTE module is based on the algorithm "SMOTE: synthetic minority over-sampling technique" [5]. It increases the number of minority examples in a data set by synthesizing new examples of the minority class. The module has two parameters: "SMOTE percentage" and "Number of nearest neighbors". "SMOTE percentage" should be a multiple of 100 (100, 200, 300, 400, …) and gives the number of new minority examples to add as a percentage of the original minority class size. For example, setting the value to 100 doubles the minority class, and setting it to 200 triples it. "Number of nearest neighbors" sets how many same-class neighbors are considered when generating new examples. Each generated example is a random interpolation between an original minority example and one of its nearest neighbors from the same class.
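The two parameters can be illustrated with a small NumPy sketch of the interpolation step described in [5]. This is not the SMOTE module referred to above. The function name `smote` and its arguments are hypothetical, and the brute-force neighbor search is only suitable for small minority classes.

```python
import numpy as np

def smote(minority, percentage=100, k=5, seed=0):
    """Return synthetic minority examples (percentage / 100 per original)."""
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if percentage <= 0 or percentage % 100 != 0:
        raise ValueError("percentage must be a positive multiple of 100")
    if n < 2:
        raise ValueError("need at least two minority examples")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances; a point is never its own neighbor.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    synthetic = []
    for i in range(n):
        for _ in range(percentage // 100):
            j = rng.choice(neighbors[i])   # pick one of the k nearest neighbors
            gap = rng.random()             # position along the segment i -> j
            synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)

# Four minority points; percentage=200 adds two synthetic points per original.
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_points = smote(minority, percentage=200, k=2)
```

With `percentage=200` the four original points yield eight synthetic ones, so the minority class triples once they are appended to the originals.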

EDA

  • Time based feature engineering: mean, max, min, standard deviation, etc.
  • Frequency based feature engineering: absolute, relative and cumulative frequency; skewness and peakedness.
  • Combined time and frequency based: bilinear time-frequency distributions; spectral-entropy
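As an illustration of the time-based and frequency-based features listed above, the sketch below computes rolling statistics and a spectral entropy for a single sensor channel. The helper names `rolling_stats` and `spectral_entropy` are assumptions for this example and are not taken from the project's notebooks.

```python
import numpy as np

def rolling_stats(signal, window):
    """Mean, max, min and standard deviation over each sliding window."""
    signal = np.asarray(signal, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(signal, window)
    return {
        "mean": windows.mean(axis=1),
        "max": windows.max(axis=1),
        "min": windows.min(axis=1),
        "std": windows.std(axis=1),
    }

def spectral_entropy(signal):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    signal = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    total = power.sum()
    if total == 0:
        return 0.0  # a constant signal has no spectral content
    p = power / total
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

t = np.arange(256)
sine = np.sin(2 * np.pi * 8 * t / 256)            # energy in a single frequency bin
noise = np.random.default_rng(0).normal(size=256)  # energy spread over all bins
stats = rolling_stats([1.0, 2.0, 3.0, 4.0], window=2)
```

A narrow-band signal such as `sine` gives a low spectral entropy, while broadband noise gives a high one, which is what makes the measure useful as a degradation indicator.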

Models for Predictive Maintenance

  • Linear, Piecewise-Linear, Exponential Degradation, Weibull and ARIMA model for RUL Prediction
  • Time series forecasting plus anomaly detection
  • Pattern similarity based forecasting
  • Similarity-based model for RUL Prediction
  • LSTM model for RUL Prediction and binary and multiclass classification
  • RNN(GRU) model for binary and multiclass classification
  • 1D CNN for binary and multiclass classification
  • 1D CNN-RNN for binary and multiclass classification
  • 1D CNN-GRU for binary and multiclass classification
  • 1D CNN-LSTM for binary and multiclass classification
  • 1D CNN-SVM for binary classification
  • Attention in GRU
  • Logistic regression classification
  • K nearest neighbors classification
  • Support vector machines (linear and RBF kernels) classifiers and regressors
  • Naive Bayes and Bayesian Network Models
  • Decision trees classification
  • Ensembles of DT (random forest, xgboost, light gradient boosting machines) classifiers and regressors
  • Autokeras failure prediction
  • Tsfresh, sktime, aeon, scikit-survival, py-survival
  • DTW and Time series clustering
  • Genetic Algorithm
  • Hidden Markov Models
  • Survival Analysis - gradient boosting survival, random survival forest, survival support vector machines, Cox proportional hazards
  • Survival Analysis - RNN and Weibull probability distribution deep learning
  • Autoencoder - Anomaly Detection
  • Multivariate Gaussian Unsupervised Learning
  • Principal Component Analysis for anomaly detection
  • Bayesian Structural Time Series (BSTS)
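To make one of the listed approaches concrete, here is a minimal sketch of multivariate Gaussian anomaly detection. It fits a mean and covariance to healthy data and flags points whose squared Mahalanobis distance exceeds a quantile of the training distances. The function names, the 0.99 quantile, and the synthetic data are assumptions for illustration, not the project's implementation.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate mean and (slightly regularized) covariance of healthy data."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mean, cov

def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X to the fitted Gaussian."""
    diff = np.asarray(X, dtype=float) - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

# Synthetic "healthy" sensor readings with three variables.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(500, 3))
mean, cov = fit_gaussian(healthy)

# Flag anything farther out than 99% of the healthy training points.
threshold = np.quantile(mahalanobis_sq(healthy, mean, cov), 0.99)
queries = np.array([[0.0, 0.0, 0.0], [8.0, 8.0, 8.0]])
is_anomaly = mahalanobis_sq(queries, mean, cov) > threshold
```

The first query sits near the center of the healthy data and is accepted. The second lies far outside it and is flagged.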

N-CMAPSS_DL

DL evaluation on N-CMAPSS

Turbo fan engine CMAPSS [1]

Sample creator

Following the instructions below, you can create training/test sample arrays for machine learning models (especially DL architectures that take time-windowed data as input) from NASA's N-CMAPSS data file.
Please download the Turbofan Engine Degradation Simulation Data Set-2, the so-called N-CMAPSS dataset [2], from NASA's prognostic data repository.
In the downloaded dataset, dataset DS01 has been used for the application of model-based diagnostics and dataset DS02 has been used for data-driven prognostics. Therefore, we need only dataset DS02.
Please place the "N-CMAPSS_DS02-006.h5" file in the /N-CMAPSS folder.
Then, you can get npz files for each of the 9 engines by running the Python commands below.

python3 sample_creator_unit_auto.py -w 50 -s 1 --test 0 --sampling 10

After that, you should run

python3 sample_creator_unit_auto.py -w 50 -s 1 --test 1 --sampling 10

– w : window length
– s : stride of window
– test : selects train or test. If it is zero, the code extracts samples from the engines used for training; otherwise, it creates samples from the test engines
– sampling : subsampling factor applied to the data before creating the output array, so that a lower sampling rate can be assumed to mitigate memory issues.
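The effect of the window length, stride and subsampling options can be sketched as follows. This is an illustration only, not the code in sample_creator_unit_auto.py. The function `make_windows`, the (n_windows, window, n_vars) output layout, and the choice of labeling each window with the RUL at its last time step are assumptions.

```python
import numpy as np

def make_windows(series, rul, window=50, stride=1, sampling=10):
    """Slice one engine trajectory into time-windowed samples and labels."""
    # Subsample first: keep every `sampling`-th time step.
    series = np.asarray(series, dtype=np.float32)[::sampling]
    rul = np.asarray(rul, dtype=np.float32)[::sampling]
    if len(series) < window:
        raise ValueError("trajectory is shorter than one window")
    starts = range(0, len(series) - window + 1, stride)
    samples = np.stack([series[s:s + window] for s in starts])
    # Assumed convention: label each window with the RUL at its last step.
    labels = np.array([rul[s + window - 1] for s in starts], dtype=np.float32)
    return samples, labels

# Toy trajectory: 1000 time steps, 4 variables, RUL counting down to 0.
series = np.zeros((1000, 4))
rul = np.arange(999, -1, -1)
samples, labels = make_windows(series, rul, window=50, stride=1, sampling=10)
```

With these settings the 1000 raw steps shrink to 100 after subsampling, which yields 100 - 50 + 1 = 51 windows of shape (50, 4).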

Please note that we used N = 6 units (u = 2, 5, 10, 16, 18 & 20) for training and M = 3 units (u = 11, 14 & 15) for test, the same setting as used in [3].

The dataset is very large, and loading it can cause out-of-memory issues. To account for the memory limitations that may arise when you load the data and create the samples, we set the data type to 'np.float32' to reduce the size of the data, whereas the data type of the original data is 'np.float64'. Based on our experiments, this does not noticeably affect performance when you use the data to train a DL network. If you want to change the type, please check the 'data_preparation_unit.py' file in the /utils folder.
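The memory saving from the narrower data type is easy to check on a synthetic array. The array size below is arbitrary and chosen only for illustration.

```python
import numpy as np

# Synthetic stand-in for sensor data: 10,000 time steps, 20 variables.
data64 = np.random.default_rng(0).normal(size=(10_000, 20))
data32 = data64.astype(np.float32)

# float32 uses 4 bytes per value instead of 8.
ratio = data32.nbytes / data64.nbytes

# Largest rounding error introduced by the conversion.
max_abs_error = float(np.abs(data64 - data32).max())
```

The converted array takes half the memory, and the rounding error stays far below the scale of the values themselves.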

In addition, we offer data subsampling to handle 'out-of-memory' issues with the given dataset, which uses a sampling rate of 1Hz. When you set the subsampling input to 10, you take only 1 sample out of every 10, so the sampling rate becomes 0.1Hz.

Finally, you will have 9 npz files in the /N-CMAPSS/Samples_whole folder.

Each compressed file contains two arrays with different labels: 'sample' and 'label'. In the case of the test units, 'label' indicates the ground truth RUL of the test units for evaluation.

For instance, for one of the created files, Unit2_win50_str1_smp10.npz, the filename indicates that the file consists of a collection of time series slices with a time window size of 50, taken from the trajectory of engine (unit) 2 at a sampling rate of 0.1Hz.
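The naming scheme can be handled programmatically. The sketch below assumes that all files follow the pattern of the single example above (`Unit<u>_win<w>_str<s>_smp<n>.npz`). The helper names are hypothetical, and the unit lists are the training and test units stated earlier.

```python
import re

# Pattern inferred from the example filename; an assumption, not a spec.
FILENAME_PATTERN = re.compile(r"Unit(\d+)_win(\d+)_str(\d+)_smp(\d+)\.npz$")

def sample_filename(unit, window, stride, sampling):
    """Build the npz filename for one engine unit."""
    return f"Unit{unit}_win{window}_str{stride}_smp{sampling}.npz"

def parse_sample_filename(name):
    """Recover unit, window, stride and subsampling factor from a filename."""
    match = FILENAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unexpected filename: {name}")
    unit, window, stride, sampling = map(int, match.groups())
    return {"unit": unit, "window": window, "stride": stride, "sampling": sampling}

TRAIN_UNITS = [2, 5, 10, 16, 18, 20]
TEST_UNITS = [11, 14, 15]
train_files = [sample_filename(u, 50, 1, 10) for u in TRAIN_UNITS]
test_files = [sample_filename(u, 50, 1, 10) for u in TEST_UNITS]
info = parse_sample_filename("Unit2_win50_str1_smp10.npz")
```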

Load created samples

First, you should load each of the npz files created in the /Samples_whole folder. Then, the samples from the different engines should be aggregated.

import numpy as np

def load_part_array_merge(npz_units):
    """Load the per-unit npz files and merge them into one sample/label pair."""
    sample_array_lst = []
    label_array_lst = []
    for npz_unit in npz_units:
        loaded = np.load(npz_unit)
        sample_array_lst.append(loaded['sample'])
        label_array_lst.append(loaded['label'])
    # Stack the units along the last (sample) axis, then move that axis first.
    sample_array = np.dstack(sample_array_lst)
    label_array = np.concatenate(label_array_lst)
    sample_array = sample_array.transpose(2, 0, 1)
    return sample_array, label_array

The shape of your sample array should be (# of samples from all the units, window size, # of variables)
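The expected shape can be checked with small synthetic files. This sketch assumes that each per-unit 'sample' array is stored as (window size, # of variables, # of samples), which is the layout implied by the dstack and transpose calls in the loader. The toy file names and sizes are made up for the test.

```python
import os
import tempfile
import numpy as np

def load_part_array_merge(npz_units):
    """Same merge logic as the loader above, with files closed after reading."""
    sample_array_lst, label_array_lst = [], []
    for npz_unit in npz_units:
        with np.load(npz_unit) as loaded:
            sample_array_lst.append(loaded["sample"])
            label_array_lst.append(loaded["label"])
    sample_array = np.dstack(sample_array_lst).transpose(2, 0, 1)
    label_array = np.concatenate(label_array_lst)
    return sample_array, label_array

window, n_vars = 50, 4
samples_per_unit = [3, 5]  # two toy "engines" with 3 and 5 windows each

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, n in enumerate(samples_per_unit):
        path = os.path.join(tmp, f"toy_unit{i}.npz")
        np.savez_compressed(
            path,
            sample=np.zeros((window, n_vars, n), dtype=np.float32),
            label=np.zeros(n, dtype=np.float32),
        )
        paths.append(path)
    samples, labels = load_part_array_merge(paths)
```

Here `samples.shape` is (8, 50, 4), i.e. 3 + 5 samples from the two toy units, each a 50-step window over 4 variables.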

References

[1] Frederick, Dean & DeCastro, Jonathan & Litt, Jonathan. (2007). User's Guide for the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS). NASA Technical Memorandum 2007-215026.

[2] Chao, Manuel Arias, Chetan Kulkarni, Kai Goebel, and Olga Fink. "Aircraft Engine Run-to-Failure Dataset under Real Flight Conditions for Prognostics and Diagnostics." Data. 2021; 6(1):5. https://doi.org/10.3390/data6010005

[3] Chao, Manuel Arias, Chetan Kulkarni, Kai Goebel, and Olga Fink. "Fusing physics-based and deep learning models for prognostics." Reliability Engineering & System Safety 217 (2022): 107961.

[4] Mo, Hyunho, and Giovanni Iacca. "Multi-Objective Optimization of Extreme Learning Machine for Remaining Useful Life Prediction." EvoApplications, part of EvoStar 2022 (2022), to appear.

https://www.mathworks.com/help/predmaint/ug/rul-estimation-using-rul-estimator-models.html

[5] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer (2002). SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 16(1), 321-357.
