Using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) on the same dataset and comparing which one performs better

sudarshan-koirala/Dimensionality-Reduction

Dimensionality-Reduction

Using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) on the same dataset and comparing which one performs better. Both PCA and LDA are feature extraction techniques.

  • PCA extracts p ≤ m new independent variables from the m original independent variables, chosen to explain the most variance in the dataset regardless of the dependent variable (unsupervised model)

  • LDA extracts p ≤ m new independent variables from the m original independent variables, chosen to best separate the classes of the dependent variable (supervised model). In practice p is also limited to at most one fewer than the number of classes.
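The difference between the two bullets above can be sketched in a few lines. This is only an illustration, not the repository's own code: it assumes scikit-learn is installed and uses its bundled copy of the Wine data (`load_wine`) as a stand-in for the project's dataset. The point to notice is that PCA is fitted on the features alone, while LDA also needs the class labels.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Wine data stands in for the project's CSV.
X, y = load_wine(return_X_y=True)

# Both methods are sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# PCA is unsupervised: it only sees the features.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# LDA is supervised: it needs the class labels as well.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X_scaled, y)

print("PCA output shape:", X_pca.shape)
print("LDA output shape:", X_lda.shape)
print("Variance explained by each principal component:", pca.explained_variance_ratio_)
```

Both calls reduce the 13 original features to 2 extracted ones; only the criterion used to pick them differs.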

Business Problem

Which wine should be recommended to which segment of customers?

Dataset

The Wine dataset consists of 178 rows and 13 independent variables, which describe the chemical components of a specific wine. The dependent variable consists of 3 different clusters (1, 2, 3) of customers who preferred specific wines. Initial dataset link. Modifications have been made to the datasets present in this project.
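As a quick sanity check of the dimensions described above, the sketch below loads scikit-learn's bundled Wine data. This is an assumption made for illustration: the project ships its own modified file, and the bundled copy labels the classes 0, 1, 2 rather than 1, 2, 3.

```python
import numpy as np
from sklearn.datasets import load_wine

# Assumption: the bundled Wine data, not the project's modified dataset.
wine = load_wine()
X, y = wine.data, wine.target

n_rows, n_features = X.shape
class_counts = np.bincount(y)  # number of rows per class label

print("rows:", n_rows, "features:", n_features)
print("feature names:", wine.feature_names)
print("rows per class:", class_counts.tolist())
```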

Solution

Logistic regression is used for this problem. Because each wine has 13 different features, dimensionality reduction is applied first to extract 2 new features: those that explain the most variance (PCA) or those that best separate the classes (LDA). The confusion matrix and the classification report after applying PCA are shown below.

[Image: cm_pca]

[Image: pca_classification_report]

Similarly, the confusion matrix and the classification report after applying LDA are shown below.

[Image: cm_lda]

[Image: lda_classification_report]

The results show that logistic regression classifies the data very accurately, with 97% accuracy after PCA and 100% accuracy after LDA. Without any further parameter tuning, these results indicate that LDA performs better than PCA on this data.
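A comparison of this kind might be set up roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual script: it uses scikit-learn's bundled Wine data, an 80/20 split with `random_state=0`, and a hypothetical helper named `evaluate`. The exact accuracies depend on the split and on the dataset modifications mentioned earlier, so they may differ from the figures quoted above.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Wine data and an 80/20 split; the project may differ.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)


def evaluate(reducer):
    """Hypothetical helper: scale, reduce to 2 features, classify, and score."""
    model = make_pipeline(
        StandardScaler(), reducer, LogisticRegression(random_state=0)
    )
    # The pipeline passes y_train to every step; PCA ignores it, LDA uses it.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred), confusion_matrix(y_test, y_pred)


acc_pca, cm_pca = evaluate(PCA(n_components=2))
acc_lda, cm_lda = evaluate(LinearDiscriminantAnalysis(n_components=2))

print("PCA accuracy:", acc_pca)
print(cm_pca)
print("LDA accuracy:", acc_lda)
print(cm_lda)
```

Wrapping the scaler, the reducer, and the classifier in one pipeline keeps the test set out of every fitting step, so the two accuracies are comparable.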
