chiragsatish/Data_Preprocessing

Data_Preprocessing

Runs chi-squared (chi2) tests on the categorical attributes and mutual information tests on the numerical attributes of the bank dataset to find the most relevant ones.

Data Set

The dataset (bank-additional-full.csv) comes from the direct marketing campaigns of a Portuguese banking institution. The classification goal is to predict whether a client will subscribe to a term deposit. It was obtained from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets.html

File description

chi2test.py: Performs the chi2 test on the categorical attributes.
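As a rough illustration of what a chi2 test between a categorical attribute and the class label computes, here is a minimal, dependency-free sketch of the Pearson chi-squared statistic. It is not the code in chi2test.py, which may well use a library routine instead; the function name `chi2_statistic` and the toy values are assumptions made up for this example.

```python
from collections import Counter


def chi2_statistic(feature, target):
    """Pearson chi-squared statistic and degrees of freedom for two
    equal-length sequences of categorical values (hypothetical helper)."""
    n = len(feature)
    observed = Counter(zip(feature, target))  # contingency table counts
    row_totals = Counter(feature)
    col_totals = Counter(target)
    stat = 0.0
    for r, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            # Expected count if the attribute and the class were independent
            expected = row_total * col_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Toy values, invented for illustration (not rows from bank-additional-full.csv)
education = ["basic", "basic", "university", "university", "basic", "university"]
subscribed = ["no", "no", "yes", "yes", "no", "yes"]
stat, dof = chi2_statistic(education, subscribed)
print(stat, dof)
```

A larger statistic for the same degrees of freedom suggests a stronger association between the attribute and the class, which is how attributes could be ranked for relevance.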

MutualInformation.py: Performs the mutual information test on the numerical attributes.
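One simple way to estimate mutual information between a numerical attribute and the class label is to bin the attribute and apply the discrete definition. The sketch below assumes equal-width binning, which is a choice made for this example and not necessarily what MutualInformation.py does (library estimators often use nearest-neighbour methods instead). The helper names and toy values are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]  # constant attribute: a single bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) )
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


# Toy values, invented for illustration (not rows from the dataset)
numeric_attribute = [100, 120, 110, 900, 950, 1000]
subscribed = ["no", "no", "no", "yes", "yes", "yes"]
mi = mutual_information(equal_width_bins(numeric_attribute, 2), subscribed)
print(mi)
```

The estimate depends on the number of bins, so scores are only comparable across attributes when the same binning is used.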

barchartplot.py: Plots a bar chart for a categorical attribute chosen by the user. Plot_Education.html shows the resulting chart for the education attribute.

OneHot.py: Converts the categorical attributes into their one-hot representation.
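For readers unfamiliar with the term, a one-hot representation replaces each categorical value with a 0/1 vector that has a single 1 in the position of its category. The following is a minimal sketch of the idea, not the code in OneHot.py; the function name `one_hot`, the alphabetical column order, and the toy values are assumptions for this example.

```python
def one_hot(values):
    """Return (categories, rows): the sorted category list and one
    0/1 vector per input value (hypothetical helper)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column of this value's category
        rows.append(row)
    return categories, rows


# Toy values, invented for illustration
categories, encoded = one_hot(["married", "single", "married", "divorced"])
print(categories)
print(encoded)
```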

Normalization.py: Normalizes the numerical attributes to one of the ranges [0, 1], [-1, 0], or [-1, 1].
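Rescaling to a target range such as these is commonly done with min-max scaling. The sketch below assumes that technique; the README does not say which formula Normalization.py uses, and the function name `min_max_scale` and the toy values are hypothetical.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the
    largest maps to new_max (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant attribute: no spread to rescale, map everything to new_min
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


# Toy values, invented for illustration
ages = [20, 35, 50]
scaled_0_1 = min_max_scale(ages, 0.0, 1.0)
scaled_m1_0 = min_max_scale(ages, -1.0, 0.0)
scaled_m1_1 = min_max_scale(ages, -1.0, 1.0)
print(scaled_0_1, scaled_m1_0, scaled_m1_1)
```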

classdistribution.py: Plots a bar chart of the class attribute, i.e., whether a client subscribes to a term deposit.
