Skip to content

Utilize machine learning models in assessing credit risks for an individual

Notifications You must be signed in to change notification settings

RobC30/Credit_Risk_Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Credit Risk Analysis

Overview

The aim of this project is to utilize machine learning models in assessing an individual's credit risk. We are going to use different machine learning methods in determining credit risk and asses which mmodel might suit our analysis best. Here is the list of machine learning models to be used:

- Oversampling (RandomOverSampler) & SMOTE
- Undersampling (ClusterCentroids)
- Combo sampling approach (SMOTEENN)
- BalancedRandomForestClassifer & EasyEnsembleClassifer

For each model, we are determine its balanced accuracy score, precision, recall and F1 score. We will then compare and contrast to see which might be best suitable for credit risk assesment.

Results


RandomOverSampler

Using the RandomOverSampler model, we can attain an balanced accuracy score of 65%.
Other statistics as follows:

  • Precision - 1% (High-Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 71% (High-Risk) & 60% (Low-Risk)
  • F1 Score - 2% (High-Risk) & 75% (Low-Risk)


SMOTE

Using the SMOTE model, we can attain an balanced accuracy score of 65%.
Other statistics as follows:

  • Precision - 1% (High- Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 71% (High-Risk) & 69% (Low-Risk)
  • F1 Score - 2% (High-Risk) & 82% (Low-Risk)


ClusterCentroids

Using the ClusterCentroids model, we can attain an balanced accuracy score of 54%.
Other statistics as follows:

  • Precision - 1% (High- Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 69% (High-Risk) & 40% (Low-Risk)
  • F1 Score - 1% (High-Risk) & 57% (Low-Risk)


SMOTEENN

Using the SMOTEENN model, we can attain an balanced accuracy score of 64%.
Other statistics as follows:

  • Precision - 1% (High- Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 72% (High-Risk) & 58% (Low-Risk)
  • F1 Score - 2% (High-Risk) & 73% (Low-Risk)


BalancedRandomForestClassifier

Using the BalancedRandomForestClassifier model, we can attain an balanced accuracy score of 78%.
Other statistics as follows:

  • Precision - 4% (High- Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 67% (High-Risk) & 91% (Low-Risk)
  • F1 Score - 7% (High-Risk) & 95% (Low-Risk)


EasyEnsembleClassifier

Using the EasyEnsembleClassifier model, we can attain an balanced accuracy score of 92%.
Other statistics as follows:

  • Precision - 7% (High- Risk) & almost 100% (Low Risk)
  • Recall/Sensitivity - 91% (High-Risk) & 94% (Low-Risk)
  • F1 Score - 14% (High-Risk) & 97% (Low-Risk)

Summary

For our credit risk analysis, our objective is to determine if a loan would be high-risk for a bank. We will mainly look at these 3 statistics, High-Risk recall rate, Low-Risk recall rate & accuracy score. Considering all these, if we are to use the EasyEnsembleClassifer model for our analysis, this will be the best prediction tool for credit risks.

About

Utilize machine learning models in assessing credit risks for an individual

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published