Disaster_Response_Pipelines

This project analyzes disaster message data from Figure Eight to build a machine learning pipeline and model, exposed through a web app API, that classifies disaster messages into categories so they can be routed to the appropriate disaster relief agency.

Installation

No libraries beyond the Anaconda distribution of Python should be needed to run the code here. The code should run with no issues using Python 3.*.

Project Motivation

The dataset for this project consists of real disaster messages collected by Figure Eight. Each message is labeled with one or more of 36 categories (e.g., Aid Related, Weather Related, Search And Rescue). The goal is to help aid workers route each message to the agency best suited to respond.

File Descriptions

The project has 3 main steps, each with an associated file. The steps of the data pipeline are listed below in order:

Data Pipeline

  1. ETL (Extract, Transform, Load) process: the raw dataset is loaded, cleaned, and stored in a SQLite database (a sketch of this step follows the list). Associated file: workspace/data/process_data.py

  2. Machine Learning Pipeline: the dataset consists of ~26,000 disaster messages, split 90% into the training set and 10% into the test set. A machine learning pipeline built with NLTK and scikit-learn's Pipeline and GridSearchCV (to optimize hyperparameters) outputs a model that uses the message column to predict classifications for the 36 categories (a sketch follows the list). The model is exported as a pickle file for use in the final step. Associated file: workspace/models/train_classifier.py

  3. Flask App: Flask displays the classification results in a web app, along with 3 plots showing distributions of the dataset as a bar chart and pie charts (a sketch follows the list). Associated file: workspace/app/run.py
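
Below is a minimal sketch of the ETL step, not the actual process_data.py. It assumes the two raw CSV files share an id column, that the combined categories string uses the "category_name-0/1" format, and that the cleaned table is named "messages"; the real script may differ on these points.

```python
# Minimal ETL sketch: load, clean, and store the disaster messages.
# The "messages" table name is an assumption, not taken from the repo.
import sys
import pandas as pd
from sqlalchemy import create_engine

def main(messages_csv, categories_csv, db_path):
    # Extract: load the two raw CSV files and merge them on the shared id column
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    df = messages.merge(categories, on="id")

    # Transform: split the single "categories" string into 36 binary columns
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = cats.iloc[0].str.slice(0, -2)   # e.g. "aid_related-1" -> "aid_related"
    cats = cats.apply(lambda col: col.str[-1].astype(int).clip(0, 1))
    df = pd.concat([df.drop(columns=["categories"]), cats], axis=1).drop_duplicates()

    # Load: write the cleaned data to a SQLite database
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("messages", engine, index=False, if_exists="replace")

if __name__ == "__main__":
    main(*sys.argv[1:4])
```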
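The machine learning step could look roughly like the sketch below; again, this is not the actual train_classifier.py. The table name, the column layout (category columns starting at index 4), and the tuned parameter grid are assumptions.

```python
# Minimal sketch of the ML pipeline: NLTK tokenization, scikit-learn Pipeline,
# GridSearchCV for hyperparameters, and a pickled model as output.
import pickle, re, sys
import pandas as pd
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
from sqlalchemy import create_engine
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline

nltk.download(["punkt", "wordnet"], quiet=True)

def tokenize(text):
    # Normalize, tokenize, and lemmatize each message with NLTK
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())
    return [WordNetLemmatizer().lemmatize(t) for t in word_tokenize(text)]

def main(db_path, model_path):
    # Load the cleaned data written by the ETL step (table name "messages" is assumed)
    df = pd.read_sql_table("messages", create_engine(f"sqlite:///{db_path}"))
    X, Y = df["message"], df.iloc[:, 4:]   # assumes category columns start at index 4

    # 90% training / 10% test split, as described above
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1)

    # Text features feed a multi-output classifier; GridSearchCV tunes a small grid
    pipeline = Pipeline([
        ("vect", CountVectorizer(tokenizer=tokenize)),
        ("tfidf", TfidfTransformer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier())),
    ])
    grid = GridSearchCV(pipeline, {"clf__estimator__n_estimators": [50, 100]}, cv=3)
    grid.fit(X_train, Y_train)
    print("test accuracy:", (grid.predict(X_test) == Y_test.values).mean())

    # Export the fitted model as a pickle file for the Flask app
    with open(model_path, "wb") as f:
        pickle.dump(grid.best_estimator_, f)

if __name__ == "__main__":
    main(*sys.argv[1:3])
```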
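Finally, a rough sketch of the Flask app. The template names (master.html, go.html), the plotted fields, and the database/model paths are assumptions; unpickling the model also assumes the tokenizer used at training time is importable in this process.

```python
# Minimal sketch of the Flask app: load data and model, render plots, classify queries.
import json, pickle
import pandas as pd
import plotly
from plotly.graph_objs import Bar
from flask import Flask, render_template, request
from sqlalchemy import create_engine

app = Flask(__name__)
df = pd.read_sql_table("messages", create_engine("sqlite:///data/DisasterResponse.db"))
model = pickle.load(open("models/classifier.pkl", "rb"))

@app.route("/")
def index():
    # One bar chart of message genres as an example; the pie charts are built the same way
    genre_counts = df.groupby("genre").count()["message"]
    graphs = [{"data": [Bar(x=list(genre_counts.index), y=list(genre_counts))],
               "layout": {"title": "Distribution of Message Genres"}}]
    graph_json = json.dumps(graphs, cls=plotly.utils.PlotlyJSONEncoder)
    return render_template("master.html", graphJSON=graph_json)

@app.route("/go")
def go():
    # Classify the user's message with the pickled pipeline
    query = request.args.get("query", "")
    labels = dict(zip(df.columns[4:], model.predict([query])[0]))
    return render_template("go.html", query=query, classification_result=labels)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)
```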

How to Run

Run the following commands from the project's root directory, in the order below, to set up the database and the ML model and to start the web app.

  1. ETL
    Run python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db

  2. Machine Learning Pipeline
    Run python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl

  3. Flask App
    Run python run.py

  4. Go to http://0.0.0.0:3001/

Licensing, Authors, Acknowledgements

Thank you to Figure Eight for the data and to Udacity for the guidance and help.
