TheMrityunjayPathak/ExploratoryDataAnalysis

Exploratory Data Analysis

Hello Everyone,

Here is my EDA project on the Iris dataset, in which I analyzed the data using Seaborn and Matplotlib.

Dataset

This dataset covers 3 species of Iris flower (Setosa, Versicolor and Virginica).

For each flower it records the Petal Length, Sepal Length, Petal Width, Sepal Width and Variety of the flower.

Link to the Dataset : Iris Dataset

Problem Statement

  • The objective of this Project is to perform Exploratory Data Analysis (EDA) on the Iris dataset.

  • The Iris dataset is a popular and well-known dataset in the field of machine learning and statistics.

  • It consists of measurements of four features (sepal length, sepal width, petal length and petal width) for three species of iris flowers : setosa, versicolor and virginica.

  • The goal of this EDA is to gain insights into the dataset, understand the relationships between the features and extract meaningful information that can aid in further analysis or modeling tasks.

Setting up the Environment

This project requires Jupyter Notebook, which you can install and launch from the terminal.

  • Install the Notebook
pip install notebook
  • Run the Notebook
jupyter notebook

Libraries required for the Project

Pandas

  • Go to Terminal and run this Code
pip install pandas
  • Go to Jupyter Notebook and run this Code from a Cell
!pip install pandas

Matplotlib

  • Go to Terminal and run this Code
pip install matplotlib
  • Go to Jupyter Notebook and run this Code from a Cell
!pip install matplotlib

Seaborn

  • Go to Terminal and run this Code
pip install seaborn
  • Go to Jupyter Notebook and run this Code from a Cell
!pip install seaborn
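
After installing, a quick check from a notebook cell can confirm that the three libraries are visible to the kernel. This is only a sketch: the helper name `installed_versions` is hypothetical, and it relies on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pandas", "matplotlib", "seaborn")):
    """Return {package: version string}, with None for packages that are missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the environment this kernel uses
            found[name] = None
    return found


print(installed_versions())
```

A `None` entry suggests the package was installed into a different Python environment than the one the notebook kernel is running.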

Getting Started

  • Clone the repository to your local machine using the following command :
git clone https://github.com/TheMrityunjayPathak/ExploratoryDataAnalysis.git

Steps involved in the Project

  • Importing the libraries required for the project

  • Reading CSV File

  • Exploring the Dataset

  • Checking Null Values in Dataset

  • Splitting the Dataset based on Species of Flower
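
The steps above can be sketched roughly as follows. This is not the notebook's actual code: the helper name `explore` and the `species` column name are assumptions (the dataset description calls this column the variety), so adjust them to match the CSV file.

```python
import pandas as pd


def explore(df: pd.DataFrame, species_col: str = "species"):
    """Summarise the dataset and split it into one DataFrame per species."""
    summary = {
        "shape": df.shape,                                        # (rows, columns)
        "null_counts": df.isnull().sum().to_dict(),               # missing values per column
        "species_counts": df[species_col].value_counts().to_dict(),
    }
    # One sub-DataFrame per species, keyed by the species name
    by_species = {name: group for name, group in df.groupby(species_col)}
    return summary, by_species
```

In a notebook this might be called as `summary, by_species = explore(pd.read_csv("iris.csv"))`, where `iris.csv` is a placeholder file name. `df.head()`, `df.info()` and `df.describe()` are the usual companions for the exploring step.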

Data Visualization

  • Count Plot on Species of Flower in the Dataset
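
A count plot like this one can be drawn with `seaborn.countplot`. The sketch below assumes a `species` column; the function name `plot_species_counts` is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_species_counts(df: pd.DataFrame, species_col: str = "species"):
    """Draw one bar per species showing how many rows belong to it."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.countplot(data=df, x=species_col, ax=ax)
    ax.set_title("Number of samples per species")
    return ax
```

On a balanced dataset all bars have the same height, which makes this plot a quick visual check for class imbalance.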

  • Scatter Plot on sepal length and sepal width categorized by Species of Flower

  • Scatter Plot on petal length and petal width categorized by Species of Flower
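
Both scatter plots follow the same pattern, so one helper can cover them. This is a minimal sketch assuming seaborn-style column names (`sepal_length`, `sepal_width`, `petal_length`, `petal_width`, `species`), which may differ from the names in the CSV file; `plot_feature_scatter` is a hypothetical name.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_feature_scatter(df: pd.DataFrame, x: str, y: str, species_col: str = "species"):
    """Scatter one feature against another, coloring each point by species."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.scatterplot(data=df, x=x, y=y, hue=species_col, ax=ax)
    ax.set_title(f"{x} vs {y} by species")
    return ax
```

Calling it once with `"sepal_length", "sepal_width"` and once with `"petal_length", "petal_width"` would give the two plots listed above.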

  • Pair Plot on the entire IRIS Dataset

  • Distribution of Sepal Length and Petal Length of Different Species of Flower

  • Box Plot on sepal length and petal length categorized by Species of Flower
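
Box plots summarize the median, quartiles and outliers of a feature for each species. A minimal sketch with `seaborn.boxplot`, under the same column-name assumptions as before (`plot_box` is a hypothetical name):

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_box(df: pd.DataFrame, feature: str, species_col: str = "species"):
    """Draw one box per species for the given feature."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.boxplot(data=df, x=species_col, y=feature, ax=ax)
    ax.set_title(f"{feature} by species")
    return ax
```

Boxes that barely overlap between species suggest the feature separates those species well.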

  • Heat Map on the entire IRIS Dataset
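
A heat map in EDA usually visualizes the correlation matrix of the numeric features; that is the assumption made here, since the README does not say what the heat map shows. The species column is text, so the sketch below selects numeric columns first (recent pandas versions raise an error if `corr()` meets non-numeric columns). `plot_correlation_heatmap` is a hypothetical name.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_correlation_heatmap(df: pd.DataFrame):
    """Compute pairwise Pearson correlations of numeric columns and draw them."""
    corr = df.select_dtypes(include="number").corr()
    fig, ax = plt.subplots(figsize=(6, 5))
    # Fix the color scale to [-1, 1] so colors are comparable across datasets
    sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    ax.set_title("Feature correlation")
    return corr, ax
```

Returning the correlation matrix alongside the axes makes it easy to inspect the exact values behind the colors.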

Conclusion

  • In conclusion, the exploratory data analysis (EDA) conducted on the Iris dataset using Seaborn and Matplotlib has provided valuable insights into the dataset's characteristics and relationships between variables.

  • The EDA revealed that the Iris dataset consists of 150 samples, each representing a different Iris flower with four features : sepal length, sepal width, petal length and petal width.

  • The dataset is balanced with 50 samples for each of the three Iris species : Setosa, Versicolor and Virginica.

  • Using Seaborn and Matplotlib, we created count plots, scatter plots, a pair plot, distribution plots, box plots and a heat map to explore the dataset.

Link to the Notebook

Exploratory Data Analysis