Kaggle: Titanic Machine Learning Competition with PySpark

--

About

I recently completed a course on PySpark, and this notebook is my first attempt at using it for exploratory data analysis (EDA) and machine learning.

PySpark is the Python interface for Apache Spark. It lets you write Spark applications using Python APIs and is useful for working with large-scale and real-time data.

The Titanic Machine Learning Competition

This project is based on the Titanic dataset from the Titanic ML challenge on Kaggle. The task is to build a machine learning model that predicts whether a passenger was likely to survive, based on attributes such as socio-economic class, age, and gender.

The model is evaluated by its accuracy score, i.e., the percentage of passengers whose outcome is predicted correctly.

This is a binary classification problem: the predicted classes are 1 for survived and 0 for deceased.
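As a rough illustration of the accuracy metric described above, the sketch below computes the fraction of matching labels in plain Python. The function name `accuracy_score` and the sample labels are made up for this example and are not taken from the notebook.

```python
def accuracy_score(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    # Count positions where the predicted class equals the actual class.
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical labels: 1 = survived, 0 = deceased.
actual = [1, 0, 0, 1, 1]
predicted = [1, 0, 1, 1, 0]

score = accuracy_score(actual, predicted)  # 3 of the 5 predictions match
print(f"accuracy = {score:.2f}")
```

Multiplying the result by 100 gives the percentage form used to describe the score.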

I used PySpark for exploratory data analysis, data cleansing, and building logistic regression, random forest, and gradient-boosted tree (GBTClassifier) models.

Libraries used

PySpark

Kaggle

You can also see this notebook on Kaggle. Just click here to see it.

Author

Luís Fernando Torres



