Hello Everyone,
Here is my EDA project on the Iris dataset, where I analyzed the data using Seaborn and Matplotlib.
This dataset covers 3 different types of Iris flower (Setosa, Versicolor and Virginica).
The information includes the petal length, sepal length, petal width, sepal width and variety of each flower.
Link to the Dataset : Iris Dataset
- The objective of this project is to perform Exploratory Data Analysis (EDA) on the Iris dataset.
- The Iris dataset is a well-known dataset in machine learning and statistics.
- It consists of measurements of four features (sepal length, sepal width, petal length and petal width) for three species of iris flowers : setosa, versicolor and virginica.
- The goal of this EDA is to gain insights into the dataset, understand the relationships between the features and extract information that can aid further analysis or modeling tasks.
- Setting up the Environment
- Libraries required for the Project
- Getting started with Repository
- Steps involved in the Project
- Conclusion
- Link to the Notebook
Jupyter Notebook is required for this project. You can install and launch it from the terminal.
- Install the Notebook
pip install notebook
- Run the Notebook
jupyter notebook
Pandas
- Go to Terminal and run this Code
pip install pandas
- Go to Jupyter Notebook and run this Code from a Cell
!pip install pandas
Matplotlib
- Go to Terminal and run this Code
pip install matplotlib
- Go to Jupyter Notebook and run this Code from a Cell
!pip install matplotlib
Seaborn
- Go to Terminal and run this Code
pip install seaborn
- Go to Jupyter Notebook and run this Code from a Cell
!pip install seaborn
- Clone the repository to your local machine using the following command :
git clone https://github.com/TheMrityunjayPathak/ExploratoryDataAnalysis.git
- Importing libraries required for Project
- Reading CSV File
- Exploring the Dataset
- Checking Null Values in Dataset
- Splitting the Dataset based on Species of Flower
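The data preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`sepal_length`, `sepal_width`, `petal_length`, `petal_width`, `species`) and the file name `iris.csv` are assumptions, and a nine-row inline sample of the Iris data stands in for the real CSV file so the snippet runs on its own.

```python
import io

import pandas as pd

# Assumption: a small inline sample (nine rows of the Iris data) stands in
# for the project's CSV file; the column names are hypothetical.
CSV_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.9,3.1,4.9,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
7.1,3.0,5.9,2.1,virginica
"""

# Reading the CSV file (with a real file: pd.read_csv("iris.csv"))
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Exploring the dataset: first rows, dimensions, dtypes, summary statistics
print(df.head())
print(df.shape)
df.info()
print(df.describe())

# Checking null values: number of missing entries per column
null_counts = df.isnull().sum()
print(null_counts)

# Number of rows per species
species_counts = df["species"].value_counts()
print(species_counts)

# Splitting the dataset into one DataFrame per species
species_frames = {name: group for name, group in df.groupby("species")}
setosa = species_frames["setosa"]
versicolor = species_frames["versicolor"]
virginica = species_frames["virginica"]
```

Keeping the per-species frames in a dictionary avoids repeating the same boolean filter three times. Filtering with `df[df["species"] == "setosa"]` would give the same rows.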
Data Visualization
- Count Plot on Species of Flower in the Dataset
- Scatter Plot on sepal length and sepal width categorized by Species of Flower
- Scatter Plot on petal length and petal width categorized by Species of Flower
- Pair Plot on the entire IRIS Dataset
- Distribution of Sepal Length and Petal Length of Different Species of Flower
- Box Plot on sepal length and petal length categorized by Species of Flower
- Heat Map on the entire IRIS Dataset
- In conclusion, the exploratory data analysis (EDA) conducted on the Iris dataset using Seaborn and Matplotlib has provided insights into the dataset's characteristics and the relationships between its variables.
- The EDA revealed that the Iris dataset consists of 150 samples, each representing a different Iris flower with four features : sepal length, sepal width, petal length and petal width.
- The dataset is balanced with 50 samples for each of the three Iris species : Setosa, Versicolor and Virginica.
- Using Seaborn and Matplotlib, we created count, scatter, pair, distribution, box and heat map plots to explore the dataset.