Spam Filtering

Dataset: https://www.kaggle.com/datasets/karthickveerakumar/spam-filter

Problem statement

The dataset contains the text of each email and a label (spam or not spam). I want to apply text processing and a simple classifier to determine whether a given email is spam.

Approach

Text Processing + TF-IDF + Logistic Regression Model

TF-IDF: Convert the text of each email into a numeric vector of term weights.
Logistic Regression Model: Classify each email as spam or not spam.
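
The three steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the code from SpamFiltering.ipynb: the `clean_text` and `build_spam_classifier` helpers, the cleaning rules, and the tiny in-line toy emails are all assumptions made so that the example runs on its own.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def clean_text(text: str) -> str:
    """Hypothetical cleaning step: lowercase, drop a leading 'Subject:'
    prefix, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"^subject:\s*", "", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_spam_classifier():
    """Chain TF-IDF vectorization and logistic regression in one pipeline."""
    return make_pipeline(
        TfidfVectorizer(preprocessor=clean_text),
        LogisticRegression(max_iter=1000),
    )


# Toy stand-in data (assumption); 1 = spam, 0 = not spam.
texts = [
    "Subject: win free cash prize now",
    "Subject: claim your free prize money today",
    "Subject: cheap loans free cash offer",
    "Subject: free money win now",
    "Subject: meeting agenda for monday",
    "Subject: project report attached for review",
    "Subject: lunch schedule for the team meeting",
    "Subject: quarterly report review notes",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = build_spam_classifier()
model.fit(texts, labels)

spam_pred = model.predict(["Subject: free cash prize"])[0]
ham_pred = model.predict(["Subject: team meeting report"])[0]
print(spam_pred, ham_pred)
```

To try this on the real data, one option is to load emails.csv (for example with pandas) and pass its text and label columns to `fit` in place of the toy lists, ideally after holding out a test split; the column names should be checked against the file itself.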

Libraries

Scikit-learn

Files

README.md
SpamFiltering.ipynb
emails.csv

Repository: leviethung2103/SpamEmailClassification
