GitHub - connect-midhunr/sentiment-analysis-covid19-related-tweets: Machine learning model that classifies a tweet as positive, negative or neutral.

In this project, I have attempted to analyze the Covid-19 related tweets dataset and build a machine learning model to classify a tweet as positive, negative or neutral.

💾 Project Files Description

This project contains an executable IPython notebook, a presentation and the source data, as follows:

Executable Files:

Sentiment_Analysis_of_Covid_19_related_Tweets.ipynb - Google Colab notebook containing data summary, exploration, visualisations, text processing, modelling and performance evaluation.

Source Directory:

Coronavirus Tweets.csv - Includes Covid-19 related tweets data.

📖 Problem Statement

Since its outbreak, the coronavirus has affected more than 180 countries, causing massive losses in the economy and in jobs globally and confining about 58% of the global population. Research on people’s feelings is essential for protecting mental health and keeping people informed about Covid-19. The challenge is to build a classification model that predicts the sentiment of Covid-19 tweets.

📖 Approach

Understanding the business task.
Reading data from files given.
Data pre-processing.
Data visualization.
Text processing.
Modelling data.
Conclusion.

📖 Text Processing

Lemmatization is used for text normalization, since the meaning of words is more crucial than merely getting their base forms when determining which class the text belongs to.
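To illustrate the idea, here is a minimal sketch of tweet normalization followed by lemmatization. The `LEMMAS` table and the `normalize` function are hypothetical and written only for this example; the notebook presumably relies on a full lemmatizer from an NLP library, which this small lookup table merely stands in for.

```python
import re

# Hypothetical, hand-written lemma table used only for illustration.
# A real lemmatizer consults a full dictionary instead.
LEMMAS = {
    "is": "be", "are": "be", "was": "be",
    "better": "good", "worse": "bad",
    "stores": "store", "prices": "price", "buying": "buy",
}

def normalize(tweet):
    """Lowercase a tweet, drop URLs and mentions, and map tokens to lemmas."""
    # Remove URLs and @mentions, which carry little sentiment.
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    # Keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text)
    # Replace each token with its dictionary form when one is known.
    return [LEMMAS.get(token, token) for token in tokens]

print(normalize("Prices are worse at @SomeStore stores https://t.co/xyz"))
```

Note that "worse" maps to "bad" and "are" maps to "be": the lemma is a meaningful dictionary word, which is what a suffix-stripping approach would not give.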

TF-IDF was used for feature extraction from text, since the importance of each word, not just its presence, needs to be considered.
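As a rough illustration of how TF-IDF weighs words, the sketch below computes the textbook form, term frequency times `log(N / df)`, using only the standard library. The `tfidf` function and the toy documents are assumptions made for this example. Libraries such as scikit-learn apply a smoothed IDF and normalize each row by default, so their numbers will differ from these.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Term frequency scaled by how rare the term is across documents.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already-tokenized tweets.
docs = [
    ["store", "empty", "panic"],
    ["store", "open", "safe"],
    ["store", "price", "panic"],
]
weights = tfidf(docs)
for row in weights:
    print(row)
```

A word that appears in every document ("store" here) gets a weight of zero, while a word unique to one document ("empty") gets the highest weight in that document.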

📖 Modelling

Four different algorithms were tried out to find out which one performs the best.

Logistic Regression
Random Forest
Naive Bayes
Support Vector Machine
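The comparison can be sketched roughly as follows with scikit-learn. This is not the notebook's code: the toy tweets and labels are made up, and the specific estimator choices (`MultinomialNB` for Naive Bayes, `LinearSVC` for the SVM) and hyperparameters are assumptions, since the variants used in the project are not stated here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data; the real project trains on Coronavirus Tweets.csv.
train_texts = [
    "grateful for the supermarket staff working so hard",
    "great to see neighbours helping each other",
    "love how the community is staying safe",
    "panic buying has left the shelves empty",
    "terrible queues and rising prices everywhere",
    "scared about losing my job in this crisis",
    "the store opens at nine tomorrow",
    "new delivery slots are released on monday",
    "the government published updated guidance today",
]
train_labels = ["positive"] * 3 + ["negative"] * 3 + ["neutral"] * 3
test_texts = [
    "so grateful for the helping staff",
    "empty shelves and terrible prices",
    "the store published new opening hours",
]
test_labels = ["positive", "negative", "neutral"]

# Assumed estimator variants and settings, one per algorithm in the list above.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": MultinomialNB(),
    "Support Vector Machine": LinearSVC(),
}

scores = {}
for name, model in models.items():
    # Each model gets its own TF-IDF vectorizer fitted on the training text only.
    pipeline = make_pipeline(TfidfVectorizer(), model)
    pipeline.fit(train_texts, train_labels)
    scores[name] = accuracy_score(test_labels, pipeline.predict(test_texts))

# Report models from most to least accurate.
for name, accuracy in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {accuracy:.2f}")
```

With so little data the printed accuracies say nothing about the real ranking; the point is only the shape of the loop, where every algorithm is trained and scored on the same split.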

📘 Conclusion

The model built with logistic regression has the highest accuracy, followed by the one built with SVM. The logistic regression model can therefore be used for sentiment analysis.

📜 Credits

Midhun R | Avid Learner | Data Analyst | Data Scientist | Machine Learning Enthusiast

Contact me for Data Science Project Collaborations
