Dataset analysis as a final exercise to test the concepts learned throughout the semester in the subject 'Statistics for DataSc', and their application using Python.
1. Classification of variables: categorical/nominal, etc.
2. Summary statistics for each attribute in the dataset.
3. Removal of NaNs using KNN imputation, and removal of outliers.
4. Bar graphs for each categorical variable, and insights based on them.
5. Histograms for each numerical variable, and insights based on them, such as skewness.
6. Box plot analysis and insights based on it, such as the IQR.
7. Scatter matrix to explore the relationship between every pair of variables; Pearson coefficients.
8. Simple regression model for the variables specified in the question; R^2 score.
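
The numerical side of the steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the exercise itself: the toy data, the column names (`hours`, `score`, `group`) and the helper `remove_outliers_iqr` are all hypothetical, and only pandas and NumPy are assumed. KNN imputation is left out because it would need an extra library such as scikit-learn, and the plots are replaced by the numbers they would display.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0],
    "score": [12.1, 13.9, 16.2, 17.8, 20.1, 21.9, 24.0, 26.1, 27.9, 30.0],
    "group": ["a", "b", "a", "b", "a", "b", "a", "b", "a", "b"],
})

# Step 1: split the columns into numerical and categorical variables.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = [c for c in df.columns if c not in numeric_cols]

# Step 2: summary statistics (count, mean, std, quartiles) per numerical column.
summary = df[numeric_cols].describe()


def remove_outliers_iqr(frame, column, k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = frame[column].quantile(0.25)
    q3 = frame[column].quantile(0.75)
    iqr = q3 - q1
    return frame[frame[column].between(q1 - k * iqr, q3 + k * iqr)]


# Step 3 (outlier part): drop rows flagged by the IQR rule on "hours".
clean = remove_outliers_iqr(df, "hours")

# Step 4: the counts a bar graph of the categorical variable would show.
group_counts = clean["group"].value_counts()

# Step 5: skewness before and after outlier removal.
skew_before = df["hours"].skew()
skew_after = clean["hours"].skew()

# Step 7: Pearson correlation matrix of the numerical variables.
pearson = clean[numeric_cols].corr(method="pearson")

# Step 8: simple linear regression score ~ hours, fitted by least squares.
x = clean["hours"].to_numpy()
y = clean["score"].to_numpy()
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(summary)
print(group_counts)
print(f"skewness before/after: {skew_before:.3f} / {skew_after:.3f}")
print(pearson)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
```

For a simple regression with one predictor and an intercept, the R^2 score equals the square of the Pearson coefficient between the two variables, so steps 7 and 8 can be used to cross-check each other.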