
ml-aviation-safety's Introduction


Machine Learning in Aviation

Author: Robert Reynoso

Overview

With my background in aeronautics, I wanted to apply my new machine learning (ML) skills to my passion and industry. Using data provided by the National Transportation Safety Board (NTSB), I wanted to discover whether ML can help increase safety or provide more insight into the factors that contribute to a safe flight.

Business Problem

Using ML, specifically supervised learning, can we use classification algorithms to classify past aviation accidents as fatal or non-fatal based on features of the accident?

Data

  • In the industry, data can be obtained and requested through the FAA and NTSB.

  • The data set used in this project was provided by the NTSB and pulled from Kaggle.

Methods

  • Data wrangling and EDA were conducted using Seaborn, Matplotlib, Pandas, and NumPy.

  • Model Construction

Given the business problem, binary classification was used. Models were built with the Python scikit-learn library, using a helper function to test multiple models. Seven final models were compared: KNN, logistic regression, decision tree, random forest, naive Bayes, AdaBoost, and gradient boosting. The best-performing of the seven was random forest.
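The model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `compare_models` helper, the synthetic data from `make_classification`, the choice of `GaussianNB` as the naive Bayes variant, and all parameter values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(models, X_train, X_test, y_train, y_test):
    """Fit each model and return a dict of {name: test accuracy}."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return scores


# Synthetic stand-in for the accident features and fatal/non-fatal target.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Seven "vanilla" classifiers with mostly default settings.
models = {
    "knn": KNeighborsClassifier(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(random_state=42),
    "naive_bayes": GaussianNB(),
    "adaboost": AdaBoostClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

scores = compare_models(models, X_train, X_test, y_train, y_test)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Keeping the models in a dictionary makes it easy to add or drop a candidate without touching the evaluation loop.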

  • Preprocessing pipelines

A train-test split was used to set aside the test set, with a 20% test size.

GridSearchCV was used to perform 5-fold cross-validation over the selected parameters.
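As a rough sketch of the split and grid search described above: the parameter grid, the synthetic data, and the use of stratification here are assumptions for illustration, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the prepared accident data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out 20% of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid; the project's actual parameters are not listed here.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation on the training set
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("held-out test accuracy:", search.score(X_test, y_test))
```

The test set is only touched once, after the search has picked its parameters on the training folds.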

  • Performance metrics

Accuracy score was used to evaluate the models.
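For reference, accuracy is the share of predictions that match the true label, and it can be read directly off a confusion matrix. The labels below are made up purely to show the arithmetic (1 = fatal, 0 = non-fatal is an assumed encoding).

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in label order [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

accuracy = accuracy_score(y_true, y_pred)

print(cm)        # [[4 1]
                 #  [1 2]]
print(accuracy)  # 0.75, i.e. (tn + tp) / total = (4 + 2) / 8
```

Accuracy alone can look good on an imbalanced target, so the off-diagonal cells (false positives and false negatives) are worth checking alongside it.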

Results

graph1

  • These were the 7 vanilla models that were used

  • Final model - Random Forest

final results

  • X Train results:

x-train-confs

  • X Test results:

x-test-confs

Analysis

  • Percentages of fatal vs non-fatal accidents

target
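Class percentages like these can be computed with pandas. The sketch below uses a tiny made-up frame and a hypothetical `fatal` column name; it does not reproduce the project's figures.

```python
import pandas as pd

# Toy data; the column name "fatal" and the values are hypothetical.
df = pd.DataFrame({"fatal": [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]})

# normalize=True returns shares instead of raw counts.
shares = df["fatal"].value_counts(normalize=True)

for label, share in shares.items():
    print(f"class {label}: {share:.0%}")
```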

  • Model performance was high largely because of these features. The fatality percentage feature needs further inspection.

feature-importance
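Feature importances of this kind are typically read from a fitted random forest. A minimal sketch, assuming scikit-learn's impurity-based `feature_importances_` was the source of the chart; the feature names and data below are placeholders.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with placeholder feature names (not the NTSB columns).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=1
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=1)
forest.fit(X, y)

# Impurity-based importances; they are non-negative and sum to 1.
importances = pd.Series(forest.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False))
```

A feature with a very high importance can signal target leakage, which is one reason to look more closely at any feature derived from the outcome itself.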

  • These aircraft makes account for the most accidents in this data set. Further inspection is needed into why, or modeling could be narrowed to these makes only.

make

Conclusions

  1. Successfully ran 7 vanilla ML models to learn how we can improve aviation safety.

  2. The above models returned acceptable model performance.

  3. Through initial classification modeling, we learned which features are important in classifying a fatal or non-fatal aviation accident.

  4. Although we can classify an accident with good model performance, further investigation and feature engineering are required on the fatality_percentage feature.

Next Steps

  1. Find or create more data, specifically in the aircraft_damage category

  2. Use imputation to replace any unknown data

  3. Productionize model with prediction function

  4. Look into multiclass classification to improve model accuracy
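Steps 2 and 3 could be combined in a single scikit-learn pipeline wrapped by a small prediction function. The sketch below is one possible shape, not a committed design: the toy records, the `weather` column, the category values, and the `predict_fatal` helper are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy records; the values and the "weather" column are hypothetical.
train = pd.DataFrame(
    {
        "aircraft_damage": [
            "Destroyed", "Substantial", np.nan, "Minor", "Destroyed", "Substantial",
        ],
        "weather": ["IMC", "VMC", "VMC", np.nan, "IMC", "VMC"],
    }
)
labels = [1, 0, 0, 0, 1, 0]  # assumed encoding: 1 = fatal, 0 = non-fatal

model = make_pipeline(
    SimpleImputer(strategy="most_frequent"),   # fill unknowns with the mode
    OneHotEncoder(handle_unknown="ignore"),    # encode categories
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(train, labels)


def predict_fatal(record: dict) -> int:
    """Classify one accident record; missing fields should be np.nan."""
    row = pd.DataFrame([record], columns=train.columns, dtype=object)
    return int(model.predict(row)[0])


print(predict_fatal({"aircraft_damage": "Destroyed", "weather": np.nan}))
```

Putting the imputer inside the pipeline means the same fill values learned during training are reused at prediction time.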

For More Information

Please review our full analysis in our Jupyter Notebooks or our presentation.

For any additional questions, please contact Robert Reynoso at [email protected]

Repository Structure

├── README.md                           <- The top-level README for reviewers of this project
├── Jupyter notebooks                   <- Narrative documentation of analysis in Jupyter notebook
├── ML in Aviation Safety.pdf           <- PDF version of project presentation
├── data                                <- Both sourced externally and generated from code
└── images                              <- Both sourced externally and generated from code
