Charity Donors

Udacity - Machine Learning Nano Degree Program: Project 2

Project Overview

This is the third project in the series of projects in the Udacity Machine Learning Nano Degree Program.

CharityML is a fictitious charity organization located in the heart of Silicon Valley that was established to provide financial support for people eager to learn machine learning. After sending nearly 32,000 letters to people in the community, CharityML determined that every donation it received came from someone who was making more than $50,000 annually. To expand its potential donor base, CharityML has decided to send letters to residents of California, but only to those most likely to donate to the charity. With nearly 15 million working Californians, CharityML allowed me to build an algorithm to best identify potential donors and reduce the overhead cost of sending mail.

My goal was to evaluate and optimize several different supervised learners to determine which algorithm will provide the highest donation yield while also reducing the total number of letters being sent.

Project Highlights

This project is designed to get me acquainted with the many supervised learning algorithms available in sklearn, and to provide a method for evaluating how each model works and performs on a certain type of data. It is important in machine learning to understand exactly when and where a certain algorithm should be used, and when one should be avoided.

Achievements:

Trained and tested 3 different supervised machine learning models to predict the likelihood of donations.
Classifiers used: k-nearest neighbors, AdaBoost, Random Forest
Achieved an accuracy and F-score of 84.83% using Random Forest
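
The comparison described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual notebook: the data is synthetic (generated with `make_classification`), all hyperparameters are library defaults, and `beta=0.5` for the F-score is an assumption, since the value used in the project is not stated here. The imports target a recent scikit-learn; in v0.17, `train_test_split` lives in `sklearn.cross_validation` instead of `sklearn.model_selection`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, fbeta_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data (assumption): roughly 25% positives,
# where 1 means "likely donor" and 0 means "unlikely donor".
X, y = make_classification(
    n_samples=600, n_features=12, weights=[0.75, 0.25], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The three classifier families named in this README, with default settings.
classifiers = {
    "k-nearest neighbors": KNeighborsClassifier(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    predictions = clf.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        # beta=0.5 weights precision more than recall (assumed value).
        "f_score": fbeta_score(y_test, predictions, beta=0.5),
    }

for name, scores in sorted(results.items()):
    print("%s: accuracy=%.4f, F-score=%.4f"
          % (name, scores["accuracy"], scores["f_score"]))
```

Scores on this synthetic data say nothing about the figures reported above; the point is only the shape of the loop: same split, same metrics, one entry per model.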

Things I have learnt by completing this project:

How to identify when preprocessing is needed, and how to apply it.
How to establish a benchmark for a solution to the problem.
What each of several supervised learning algorithms accomplishes given a specific dataset.
How to investigate whether a candidate solution model is adequate for the problem.
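
One common way to establish a benchmark for this kind of problem is a naive predictor that labels everyone as a likely donor; any real model should beat its scores. The sketch below is an illustration under assumptions, not the project's code: the helper names `fbeta` and `naive_benchmark` are hypothetical, the labels are toy data, and `beta=0.5` is an assumed choice.

```python
def fbeta(precision, recall, beta=0.5):
    """F-beta score from precision and recall (beta < 1 favours precision)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def naive_benchmark(labels, beta=0.5):
    """Score a predictor that marks every person as a likely donor (1)."""
    n = len(labels)
    true_positives = sum(labels)          # every actual donor is predicted 1
    false_positives = n - true_positives  # every non-donor is also predicted 1
    accuracy = true_positives / float(n)
    precision = true_positives / float(true_positives + false_positives)
    recall = 1.0 if true_positives else 0.0  # no donor is ever missed
    return accuracy, fbeta(precision, recall, beta)


# Toy labels (assumption): 2 donors out of 8 people.
labels = [1, 0, 0, 0, 1, 0, 0, 0]
accuracy, f_score = naive_benchmark(labels)
print("Naive predictor: accuracy=%.4f, F-score=%.4f" % (accuracy, f_score))
```

Because the naive predictor has perfect recall but poor precision, a precision-weighted F-score keeps its benchmark low, which matches the goal of reducing the number of letters sent.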

Other Related Projects:

Project 0 : Titanic Survivals Prediction
Project 1 : Boston's Houses Prediction
Project 3 : Creating Customer Segments
Project 4 : Smart Cab
Project 5 : ImageNetBot
Project 6 : Stock Price Predictor

Software and Libraries

This project uses the following software and Python libraries:

Python 2.7
NumPy
pandas
scikit-learn (v0.17)
matplotlib
Jupyter Notebook
