The goal of this project was to build a predictive model to determine the income level for people in the US.

abhilekhdas/Imbalanced-Data-Machine-Learning-project-on-US-census-data

Imbalanced-Data-Machine-Learning-project-on-US-census-data

The goal of this project was to build a predictive model to determine the income level for people in the US.

  • It was a binary classification problem, with income levels binned as below 50K and above 50K.
  • Several operations, including data exploration, data cleaning, and feature engineering, were performed to make the data set suitable for building the model.
  • Because the data were highly imbalanced, techniques such as undersampling, oversampling, and SMOTE were applied to make the classes more balanced.
  • Models were trained using methods such as Naive Bayes, SVM, and XGBoost.
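
To make the resampling step more concrete, here is a minimal sketch of random undersampling and random oversampling in base R. It is not the project's actual code: the toy data frame, the column names (`age`, `income_level`), the class labels, and the 90/10 class split are all assumptions made for illustration. SMOTE, which synthesizes new minority rows rather than duplicating existing ones, is not shown here.

```r
# Hypothetical toy data: column names, labels, and the 90/10 split are
# illustrative assumptions, not taken from the census data set.
set.seed(42)
toy <- data.frame(
  age = sample(18:80, 1000, replace = TRUE),
  income_level = factor(c(rep("below_50K", 900), rep("above_50K", 100)))
)

# Row indices of each class.
minority_idx <- which(toy$income_level == "above_50K")
majority_idx <- which(toy$income_level == "below_50K")

# Random undersampling: keep every minority row and draw an equally
# sized random subset of the majority rows (without replacement).
under_idx <- c(minority_idx, sample(majority_idx, length(minority_idx)))
undersampled <- toy[under_idx, ]

# Random oversampling: keep every majority row and resample the
# minority rows with replacement until both classes are the same size.
over_idx <- c(majority_idx,
              sample(minority_idx, length(majority_idx), replace = TRUE))
oversampled <- toy[over_idx, ]

# Class counts after each resampling strategy.
print(table(undersampled$income_level))
print(table(oversampled$income_level))
```

Both tables should show equal counts for the two classes. Undersampling discards majority rows and so loses information, while oversampling repeats minority rows and can encourage overfitting. In either case, resampling is normally applied to the training split only, so that the test set keeps the original class distribution.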

Statistical Software used: R

R packages used: caret, data.table, mlr, dplyr, ggplot2, etc.
