Shag0r/Rainfall-Prediction-model

Rainfall Prediction Model With Machine Learning

This project develops machine learning models that predict rainfall in major cities across Australia.

Project Objective: Develop machine learning models for rainfall prediction in major cities to enhance timely forecasting and reduce human and financial losses from extreme weather events.

Data and Methods: Utilize diverse weather data, including temperature, humidity, wind speed, atmospheric pressure, and historical precipitation records, to train and evaluate machine learning algorithms such as regression, decision trees, random forests, and ensemble methods.

Evaluation Metrics: Rigorous statistical analysis and performance metrics (accuracy, precision, recall, F1-score) assess model effectiveness in predicting rain occurrence, enabling tailored approaches for different cities.

Implications: Improved rainfall prediction benefits agriculture, water resource management, disaster preparedness, and urban planning, aiding farmers, water authorities, and emergency management agencies in optimizing resources and responding to extreme weather events proactively.

Objective: Develop accurate machine learning models for rainfall prediction, addressing class imbalance, missing data, outliers, and feature selection in major cities.

Aim: Enhance forecasting by preprocessing data and comparing models like Logistic Regression, Decision Trees, Neural Networks, Random Forest, and LightGBM.

Motivation: Timely and precise rainfall forecasts reduce losses in extreme weather events, benefitting agriculture, water management, and emergency planning in Australia.

Techniques

Class Imbalance: Addressed by oversampling the minority class.
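A minimal sketch of random minority-class oversampling, using only the standard library. The helper name `oversample_minority` and the toy rows are hypothetical; the repository may use a different oversampling routine.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (hypothetical): [temperature, humidity]; 6 dry days (0), 2 rainy days (1).
rows = [[20.1, 40], [22.3, 35], [19.8, 50], [25.0, 30],
        [21.4, 45], [23.9, 38], [17.2, 90], [16.5, 95]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

bal_rows, bal_labels = oversample_minority(rows, labels)
print(Counter(bal_labels))
```

Oversampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.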

Missing Data: Imputed using Multiple Imputation by Chained Equations (MICE).
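One way to sketch chained-equations imputation is scikit-learn's `IterativeImputer`, which is inspired by MICE but returns a single imputed dataset per run. This is an assumption for illustration: the source does not say which MICE implementation was used, and the synthetic columns below (temperature, humidity, pressure) are hypothetical.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Hypothetical weather features; humidity is made to depend on temperature
# so that the imputer has a relationship to learn from.
temp = rng.normal(22, 5, 200)
humidity = 90 - 1.5 * temp + rng.normal(0, 3, 200)
pressure = rng.normal(1015, 6, 200)
X = np.column_stack([temp, humidity, pressure])

# Knock out roughly 10% of the values at random.
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

# Each feature with missing values is regressed on the others, round-robin.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X_missing)

print("remaining NaNs:", int(np.isnan(X_imputed).sum()))
```

Observed values pass through unchanged; only the masked entries are filled in.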

Outlier Detection: Identified outliers using the Interquartile Range (IQR) method.
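The IQR rule flags values that fall more than k × IQR below the first quartile or above the third. A small standard-library sketch follows; the multiplier k = 1.5 is the common convention rather than a value confirmed by the source, and the rainfall readings are made up.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical daily rainfall readings in mm, with one extreme day.
rainfall_mm = [0.0, 0.2, 0.0, 1.4, 3.0, 0.6, 2.2, 0.0, 5.1, 48.0]

low, high = iqr_bounds(rainfall_mm)
outliers = [v for v in rainfall_mm if v < low or v > high]
print("fences:", low, high)
print("outliers:", outliers)
```

Quantile definitions differ slightly between libraries, so the exact fences can vary with the tool used.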

Feature Selection: Used filter and wrapper methods for selecting relevant features.
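The difference between the two families can be sketched with scikit-learn. Here `SelectKBest` with an ANOVA F-test stands in for a filter method and recursive feature elimination (`RFE`) for a wrapper method. Both choices, the synthetic data, and the value k = 5 are assumptions; the source does not name the specific selectors used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the weather features.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)

# Filter method: score each feature on its own, independent of any model.
filter_selector = SelectKBest(score_func=f_classif, k=5).fit(X, y)
filter_idx = filter_selector.get_support(indices=True)

# Wrapper method: repeatedly fit a model and drop the weakest feature.
wrapper_selector = RFE(LogisticRegression(max_iter=1000),
                       n_features_to_select=5).fit(X, y)
wrapper_idx = wrapper_selector.get_support(indices=True)

print("filter keeps columns: ", filter_idx)
print("wrapper keeps columns:", wrapper_idx)
```

Filter methods are cheap because they never train the final model; wrapper methods cost more but account for how features interact inside it.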

Machine Learning Models: Employed models like Logistic Regression, Decision Trees, Neural Networks, and Random Forest.
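A comparison loop of the kind that produces accuracy, ROC AUC, Cohen's Kappa, and timing per model might look like the sketch below. The helper `run_model`, the synthetic data, and the hyperparameters are hypothetical; this is not the repository's actual training code.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def run_model(model, X_train, y_train, X_test, y_test):
    """Fit a model and return its test metrics plus elapsed seconds."""
    start = time.time()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        # AUC from hard labels; predicted probabilities would be more informative.
        "roc_auc": roc_auc_score(y_test, y_pred),
        "kappa": cohen_kappa_score(y_test, y_pred),
        "seconds": time.time() - start,
    }


# Synthetic stand-in for the preprocessed weather data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {name: run_model(m, X_train, y_train, X_test, y_test)
           for name, m in models.items()}
for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.3f} "
          f"kappa={r['kappa']:.3f} time={r['seconds']:.2f}s")
```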

Accuracy = 0.8050146850864789
ROC Area under Curve = 0.805039737453916
Cohen's Kappa = 0.6100470056991374
Time taken = 4.293061256408691

                  precision    recall  f1-score   support

             0.0    0.79882   0.81390   0.80629     27501
             1.0    0.81141   0.79618   0.80372     27657

        accuracy                        0.80501     55158
       macro avg    0.80512   0.80504   0.80501     55158
    weighted avg    0.80513   0.80501   0.80500     55158

Confusion matrix, without normalization
Accuracy = 0.8666195293520432
ROC Area under Curve = 0.8665334987138236
Cohen's Kappa = 0.733191586808682
Time taken = 0.7531166076660156

                  precision    recall  f1-score   support

             0.0    0.88972   0.83612   0.86209     27501
             1.0    0.84625   0.89695   0.87086     27657

        accuracy                        0.86662     55158
       macro avg    0.86799   0.86653   0.86648     55158
    weighted avg    0.86793   0.86662   0.86649     55158

Confusion matrix, without normalization
Accuracy = 0.8937053555241307
ROC Area under Curve = 0.8936717401423054
Cohen's Kappa = 0.7873950784904907
Time taken = 484.4572079181671

                  precision    recall  f1-score   support

             0.0    0.90276   0.88179   0.89215     27501
             1.0    0.88511   0.90556   0.89522     27657

        accuracy                        0.89371     55158
       macro avg    0.89393   0.89367   0.89368     55158
    weighted avg    0.89391   0.89371   0.89369     55158

Confusion matrix, without normalization
Accuracy = 0.9234562529460821
ROC Area under Curve = 0.9233814961038466
Cohen's Kappa = 0.8468885765844757
Time taken = 48.85527777671814

                  precision    recall  f1-score   support

             0.0    0.94673   0.89695   0.92117     27501
             1.0    0.90262   0.94981   0.92562     27657

        accuracy                        0.92346     55158
       macro avg    0.92467   0.92338   0.92339     55158
    weighted avg    0.92461   0.92346   0.92340     55158

Confusion matrix, without normalization
Accuracy = 0.8728017694622721
ROC Area under Curve = 0.8727177880255684
Cohen's Kappa = 0.7455592851796542
Time taken = 9.703380107879639

                  precision    recall  f1-score   support

             0.0    0.89572   0.84302   0.86857     27501
             1.0    0.85254   0.90241   0.87677     27657

        accuracy                        0.87280     55158
       macro avg    0.87413   0.87272   0.87267     55158
    weighted avg    0.87407   0.87280   0.87268     55158

Evaluation: Assessed performance with metrics like accuracy, ROC-AUC, and Cohen’s Kappa.

Final Output: ['Rain' 'No Rain' 'Rain' ... 'No Rain' 'No Rain' 'Rain']
Binary Output: ['Rain' 'No Rain' 'Rain' ... 'No Rain' 'No Rain' 'Rain']
Majority Vote: Rain
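To make the relationship between these metrics concrete, the sketch below computes accuracy, Cohen's Kappa, and the mean of the two class recalls from a 2×2 confusion matrix. The counts are not taken from the repository: they are inferred from the first classification report above by rounding recall × support for each class, so treat them as an assumption. The helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tn, fp, fn, tp):
    """Return (accuracy, Cohen's kappa, mean class recall) for a 2x2 matrix."""
    n = tn + fp + fn + tp
    accuracy = (tn + tp) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    # Mean of the two recalls; equals ROC AUC when AUC is computed from hard labels.
    mean_recall = 0.5 * (tn / (tn + fp) + tp / (fn + tp))
    return accuracy, kappa, mean_recall


# Inferred counts (assumption): 27501 * 0.81390 -> 22383, 27657 * 0.79618 -> 22020.
tn, fp, fn, tp = 22383, 5118, 5637, 22020

accuracy, kappa, mean_recall = binary_metrics(tn, fp, fn, tp)
print(f"accuracy    = {accuracy:.5f}")
print(f"kappa       = {kappa:.5f}")
print(f"mean recall = {mean_recall:.5f}")
```

With these counts the three values agree with the first report's Accuracy, Cohen's Kappa, and ROC Area under Curve to about five decimal places. That suggests the reported ROC AUC was computed from hard class predictions rather than predicted probabilities.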