disaster_response_pipeline

DSND Term 2 Project: Disaster Response Pipeline

Project Overview

Build a Natural Language Processing (NLP) model that classifies messages sent during disaster events (e.g. an earthquake or hurricane). The model assigns each message to one or more of several categories so that messages can be routed to the appropriate aid agencies for action.

Table of Contents

  1. File Descriptions
  2. Instructions
    1. Dependencies
    2. Executing Program
    3. Additional Materials
  3. Acknowledgements
  4. Result

File Descriptions

.
├── README.md
├── app
│   ├── run.py
│   └── templates
│       ├── go.html
│       └── master.html
├── data
│   ├── DisasterResponse.db
│   ├── disaster_categories.csv
│   ├── disaster_messages.csv
│   └── process_data.py
├── models
│   ├── disaster_response_prediction.pkl
│   └── train_classifier.py
└── notebook
    ├── ETL Pipeline Preparation.ipynb
    └── ML Pipeline Preparation.ipynb
  1. ETL Pipeline: the data cleaning pipeline in data/process_data.py (a minimal sketch follows this list):

    • Loads the messages and categories datasets
    • Merges the two datasets
    • Cleans the data
    • Stores the result in a SQLite database
  2. ML Pipeline: the machine learning pipeline in models/train_classifier.py (also sketched after this list):

    • Loads data from the SQLite database
    • Splits the dataset into training and test sets
    • Builds a text processing and machine learning pipeline
    • Trains and tunes a model using GridSearchCV
    • Outputs results on the test set
    • Exports the final model as a pickle file
  3. Flask Web App: a web application that classifies messages into their respective categories and displays visualizations of the data (a sample route is sketched below).
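For reference, here is a minimal sketch of the ETL steps. It assumes the Figure Eight CSV layout (both files share an id column, and categories holds a semicolon-delimited string such as related-1;request-0), and the messages table name is illustrative; data/process_data.py is the authoritative implementation.

# ETL sketch: load, merge, clean, and store the disaster message data
import pandas as pd
from sqlalchemy import create_engine

def run_etl(messages_csv, categories_csv, db_path):
    # Load and merge the two datasets on their shared "id" column
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    df = messages.merge(categories, on="id")

    # Expand the single "categories" string into one binary column per category
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
    for col in cats:
        cats[col] = cats[col].str[-1].astype(int)

    # Clean: replace the raw string column and drop duplicate rows
    df = pd.concat([df.drop(columns="categories"), cats], axis=1).drop_duplicates()

    # Store the cleaned data in a SQLite database
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("messages", engine, index=False, if_exists="replace")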
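The ML pipeline can be sketched along the same lines. The vectorizer, estimator, and parameter grid here are illustrative stand-ins rather than the tuned values in models/train_classifier.py, and the dropped column names (original, genre) are assumptions about the dataset schema.

# ML pipeline sketch: load, split, build, tune, evaluate, and export
import joblib
import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load data from the SQLite database written by the ETL step
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
X = df["message"]
Y = df.drop(columns=["id", "message", "original", "genre"])

# Split the dataset into training and test sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

# Text processing and multi-output classification in one pipeline
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", MultiOutputClassifier(RandomForestClassifier())),
])

# Tune the model with GridSearchCV over a small illustrative grid
params = {"clf__estimator__n_estimators": [50, 100]}
model = GridSearchCV(pipeline, params, cv=3)
model.fit(X_train, Y_train)

# Report test-set results and export the final model as a pickle file
print(classification_report(Y_test, model.predict(X_test), target_names=Y.columns))
joblib.dump(model.best_estimator_, "models/disaster_response_prediction.pkl")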
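Finally, a minimal sketch of the classification route the web app might expose. The /go route, the relative paths, and the assumption that category columns start at index 4 are all illustrative; see app/run.py for the actual app.

# Flask sketch: load the trained model and classify submitted messages
import joblib
import pandas as pd
from flask import Flask, render_template, request
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the cleaned data (for the overview charts) and the trained model
engine = create_engine("sqlite:///../data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
model = joblib.load("../models/disaster_response_prediction.pkl")

@app.route("/go")
def go():
    # Classify the submitted message and pair each category with its 0/1 label
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    results = dict(zip(df.columns[4:], labels))
    return render_template("go.html", query=query, classification_result=results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001)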

Instructions

Dependencies

  • Python 3.6+
  • Machine Learning Libraries: NumPy, SciPy, Pandas, Scikit-Learn
  • Natural Language Processing Library: NLTK
  • SQLite Database Library: SQLAlchemy
  • Model Loading and Saving Library: Joblib
  • Web App and Data Visualization: Flask, Plotly

Executing Program

  1. Execute the following commands from the project's root directory to set up the database and to train and save the model.

    • Run the ETL pipeline to clean the data and store the processed data in the database:
      python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db

    • Run the ML pipeline to load the data from the database, train the classifier, and save it as a pickle file:
      python models/train_classifier.py data/DisasterResponse.db models/disaster_response_prediction.pkl

  2. Run the following command from the app directory to start the web app:
      python run.py

  3. Go to http://0.0.0.0:3001/

Additional Materials

In the notebook folder you can find two Jupyter notebooks that demonstrate how the model was built step by step:

  1. ETL Preparation Notebook: ETL pipeline implementation
  2. ML Pipeline Preparation Notebook: Machine Learning Pipeline developed with NLTK and Scikit-Learn

Acknowledgements

Credit to Figure Eight for the message data.

Result

  1. Main page, with a message input box and two overview charts for the whole data set (screenshots: Main Page, Message Genres Distribution, Message Categories Distribution).

  2. Enter a message and click Classify Message; the categories the message belongs to are highlighted in green (screenshot: Sample Result).
