Fake News Detection - ML IA 2

The team members are:

Purvi Harniya, 1814023
Neelay Jagani, 1814024
Esha Gupta, 1814025

INTRODUCTION

Data is growing at an unprecedented, exponential rate: an estimated 2.7 quintillion bytes are produced every day.

Fake news is information that misleads people. It spreads like wildfire these days, and people share it without verifying it. It is often created to promote or impose specific views, frequently in service of a political agenda.

As a result, it is vital to recognise fake news.

PROBLEM DEFINITION

Fake news has become more prevalent in recent years. Given how dynamic the internet and social media are, distinguishing facts from opinions about commercial or political upheavals has become harder than ever.

Fake information is purposely or unintentionally spread throughout the internet. The massive dissemination of fake news has left an indelible mark on people and culture.

We use NLP preprocessing techniques (tokenization, stop-word removal, lemmatization and stemming) together with nine machine learning classification algorithms: logistic regression, Passive Aggressive Classifier (PAC), AdaBoost (ADA), naive Bayes, support vector machine (SVM), random forest (RF), XGBoost (XGB), decision tree (DT) and a recurrent neural network (RNN). With these we build a model that differentiates between fake and real news, and we compare the performance of the classifiers to choose the best one for our dataset.
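
The preprocessing steps named above can be sketched in a few lines. This is a simplified illustration rather than the notebook's code: the stop-word list is a tiny hypothetical sample, and `crude_stem` is a toy suffix stripper standing in for a real stemmer or lemmatizer from an NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to", "in", "and", "that", "it"}

# Suffixes stripped by the toy stemmer below, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip one common suffix if at least three letters remain (toy stemmer)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


tokens = preprocess("The senators are sharing the reported claims")
print(tokens)
```

Stemming can produce non-words (here "sharing" becomes "shar"), which is one reason lemmatization is often applied as well.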

IMPLEMENTATION DETAILS

The ISOT Fake news dataset was downloaded from https://www.uvic.ca/ecs/ece/isot/datasets/index.php.

We compared several classification algorithms to determine the best one for our dataset. The results are as follows:

No.	Model	Accuracy	Precision	F1 Score	Recall
1	Logistic Regression	0.987973	0.986926	0.987387	0.987848
2	ADA	0.988241	0.987661	0.987661	0.987661
3	PAC	0.995724	0.994958	0.995516	0.996074
4	XGB	0.990468	0.994342	0.989954	0.985605
5	RF	0.984677	0.986835	0.983874	0.980931
6	Naive Bayes	0.952339	0.951087	0.949930	0.948775
7	SVM	0.994833	0.993287	0.994586	0.995887
8	DT	0.985835	0.986684	0.985114	0.983548
9	RNN	0.992428	0.994877	0.992104	0.989347
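
For reference, the four metrics in the table are conventionally computed from confusion-matrix counts, and the F1 score is the harmonic mean of precision and recall. The sketch below is a minimal illustration, not the notebook's code: the counts passed to `classification_metrics` are hypothetical, while the three rows in `rows` are copied from the table to check that the reported F1 scores agree with the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
accuracy, precision, recall, f1 = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(accuracy, precision, recall, f1)

# (precision, recall, reported F1) copied from three rows of the table.
rows = {
    "Logistic Regression": (0.986926, 0.987848, 0.987387),
    "PAC": (0.994958, 0.996074, 0.995516),
    "SVM": (0.993287, 0.995887, 0.994586),
}
for name, (p, r, reported) in rows.items():
    # The recomputed value should match the reported one up to rounding.
    print(name, round(f1_from(p, r), 6), reported)
```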

CONCLUSION

We downloaded the ISOT dataset from https://www.uvic.ca/ecs/ece/isot/datasets/index.php, uploaded it to our drive, loaded it, and preprocessed it using NLP techniques such as tokenization, stop-word removal, lemmatization and stemming. We vectorized the text documents using a count vectorizer and a TF-IDF vectorizer. After preprocessing, we split the data into training and testing sets, built nine models using nine different classification algorithms, and used their predictions to calculate the performance metrics listed in the table under IMPLEMENTATION DETAILS.
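
To make the vectorization step concrete, here is a minimal pure-Python sketch of count and TF-IDF vectors. It uses the textbook weighting tf × log(N / df) as an assumption; library vectorizers (for example scikit-learn's `TfidfVectorizer`) apply a smoothed, normalized variant by default, so their numbers will differ. The three token lists are hypothetical stand-ins for preprocessed articles.

```python
import math
from collections import Counter


def count_vectorize(docs):
    """Build a sorted vocabulary and one term-count vector per document."""
    vocab = sorted({term for doc in docs for term in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectorize(docs):
    """Weight each count by log(N / df), the textbook inverse document frequency."""
    vocab, counts = count_vectorize(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, weighted


# Hypothetical preprocessed documents (lists of tokens).
docs = [
    ["senator", "report", "claim"],
    ["report", "shock", "claim", "claim"],
    ["senator", "vote"],
]
vocab, counts = count_vectorize(docs)
_, tfidf = tfidf_vectorize(docs)
print(vocab)
print(counts)
print(tfidf)
```

Terms that occur in fewer documents (here "shock" and "vote") receive a larger weight per occurrence than terms shared across documents.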

Based on the ROC curve and the bar plot comparing the performance of all the models, we conclude that the SVM (accuracy 99.48%, precision 99.33%, F1 score 99.46%, recall 99.59%) is the best algorithm on the ISOT dataset for fake news detection and classification. In the results table, PAC scores marginally higher on all four metrics (accuracy 99.57%), so the margin between the two is small.
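
As background on the ROC comparison, the sketch below shows one common way to trace a ROC curve and compute the area under it (AUC) from classifier scores. It is an illustration under stated assumptions, not the notebook's code: the labels and scores are hypothetical, 1 marks the positive class, and all scores are assumed distinct (ties are not handled).

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one example at a time, highest score first.
    for score, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ground-truth labels and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(area)
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 corresponds to random ranking.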

The main notebook file is 'Group6_Code_ML_IA2_Implementation.ipynb' - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_Code_ML_IA2_Implementation.ipynb
The dataset is available in the dataset folder - https://github.com/Purviharniya/Fake-news-detection/tree/master/dataset
A PDF export of the notebook is available at - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_CodePDF_ML_IA2.pdf
The presentation (PPT) is available at - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_PPT_ML_IA2.pptx
The summary document is available at - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_Document_FakeNewsDetection_ML_IA2_EXP8.docx
The screencast is available at - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_Screencast_ML_IA2.mp4
The research paper for IA1 is available at - https://github.com/Purviharniya/Fake-news-detection/blob/master/Group6_Reasearch%20Paper_ML_IA.pdf
