EDA-survival-of-the-Titanic

This project uses Exploratory Data Analysis (EDA) to identify the key factors that influenced passenger survival in the sinking of the Titanic.

The sinking of the Titanic is one of the most well-known maritime disasters in history. In this project, I explore the Titanic dataset to find patterns that could explain which factors most strongly affected a passenger's chances of survival. The repository combines statistical analysis and data visualization to show how these variables relate to survival.

Dataset

The analysis uses the Titanic dataset, which includes the following features:

PassengerId

Survived (target variable)

Pclass (passenger class)

Name

Sex

Age

SibSp (number of siblings/spouses aboard)

Parch (number of parents/children aboard)

Ticket

Fare

Cabin

Embarked (port of embarkation)
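A first look at these columns might go as in the sketch below. It assumes pandas is available. The four rows are made up for illustration and are not real passenger records. With the real data, the `io.StringIO(...)` argument would be replaced by the path to the CSV file, which this README does not specify.

```python
import io

import pandas as pd

# Made-up rows for illustration only; they are NOT real passenger records.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,S
4,0,2,Passenger D,male,35,0,2,T4,13.00,,
"""

# Assumption: the real dataset is a CSV with the same column names.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Column types as inferred by pandas (empty fields are read as NaN).
print(df.dtypes)

# Number of missing values per column, keeping only columns that have any.
missing = df.isna().sum()
print(missing[missing > 0])
```

Checking types and missing values first shows which columns need cleaning before any rates or plots are computed.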

Analysis

Data Cleaning: Handling missing values, correcting data types, and ensuring the dataset is ready for analysis.

Exploratory Data Analysis: Generating descriptive statistics and visualizations to understand the distribution and relationships between variables.

Feature Engineering: Creating new features or transforming existing ones to better capture the underlying patterns.

Statistical Analysis: Identifying statistically significant factors affecting survival.
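The steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact approach used in this repository: the rows are made up, median and mode imputation are assumed choices, and `FamilySize` is a hypothetical feature name.

```python
import pandas as pd

# Made-up rows for illustration only; they are NOT real passenger records.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 1, 0],
    "Sex": ["male", "female", "female", "male",
            "female", "male", "male", "female"],
    "Age": [22.0, 38.0, None, 35.0, 4.0, None, 54.0, 27.0],
    "SibSp": [1, 1, 0, 0, 1, 0, 0, 0],
    "Parch": [0, 0, 0, 0, 1, 0, 0, 2],
    "Embarked": ["S", "C", "S", None, "S", "Q", "S", "S"],
})

# Data cleaning (assumed strategy): fill missing Age with the median
# and missing Embarked with the most frequent port.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Feature engineering: FamilySize (hypothetical name) is the number of
# relatives aboard plus the passenger.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Statistical analysis: Pearson chi-square statistic for Sex vs. Survived,
# computed by hand from the contingency table (no continuity correction).
observed = pd.crosstab(df["Sex"], df["Survived"])
counts = observed.to_numpy()
row_totals = counts.sum(axis=1)[:, None]
col_totals = counts.sum(axis=0)[None, :]
expected = row_totals * col_totals / counts.sum()
chi2 = ((counts - expected) ** 2 / expected).sum()

print(observed)
print(f"chi-square statistic: {chi2:.2f}")
```

If SciPy is installed, `scipy.stats.chi2_contingency(observed)` also returns a p-value. Note that it applies a continuity correction to 2x2 tables by default, so its statistic can differ from the hand-computed one.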

Key Findings

Passenger Class: Passengers in higher classes (lower Pclass values, with 1 being first class) had higher survival rates.

Sex: Female passengers had a significantly higher survival rate than male passengers.

Age: Younger passengers had higher survival rates.

Family Size: The number of siblings/spouses and parents/children aboard had varying impacts on survival chances.

Fare: Higher ticket fares were generally associated with higher survival rates.
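Comparisons like the ones above usually come down to grouped survival rates. The sketch below shows one way to compute them with pandas. The rows are made up and do not reproduce the findings listed here, and the age bin edges and the `AgeBand` column name are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; they are NOT real passenger records.
df = pd.DataFrame({
    "Survived": [1, 1, 0, 1, 0, 0, 0, 1],
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Sex": ["female", "male", "male", "female",
            "male", "male", "male", "female"],
    "Age": [29.0, 40.0, 62.0, 8.0, 33.0, 21.0, 45.0, 17.0],
})

# Survived is coded 0/1, so the mean within a group is its survival rate.
rate_by_class = df.groupby("Pclass")["Survived"].mean()
rate_by_sex = df.groupby("Sex")["Survived"].mean()

# Bin Age into bands; the edges (0, 18, 60, 100) are an assumed choice.
# Intervals are right-closed, so an 18-year-old falls in "child".
df["AgeBand"] = pd.cut(
    df["Age"], bins=[0, 18, 60, 100], labels=["child", "adult", "senior"]
)
rate_by_age = df.groupby("AgeBand", observed=True)["Survived"].mean()

print(rate_by_class)
print(rate_by_sex)
print(rate_by_age)
```

The same pattern extends to Fare (for example with `pd.qcut` to form fare quantiles) or to a combined family-size column.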