This repository has been archived by the owner on Jun 23, 2023. It is now read-only.

This project will analyze salaries of Data Scientists using Python/Jupyter Notebook.


KimKarydas/Data-Science-Job-Salaries


Data-Science-Job-Salaries

Welcome to the Data Science Job Salaries EDA project! This project applies exploratory data analysis (EDA) to a dataset of data science job salaries. The goal is to draw out insights about the industry and to show how such analysis can support data-driven decision-making.

Dataset Overview

The dataset used in this project is called "Data Science Job Salaries." It consists of 606 rows and includes the following columns:

  • experience_level: The level of experience required for the job.
  • salary: The salary information in the original currency.
  • salary_in_usd: The salary converted to US dollars.
  • employment_type: The type of employment (e.g., full-time, part-time, contract).
  • job_title: The original job title.
  • salary_currency: The currency in which the salary is provided.
  • employee_residence: The abbreviation of the employee's country of residence.
  • company_location: The location of the company.
  • company_size: The size of the company.
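As a quick illustration of how these columns might be used, the sketch below loads a CSV with the schema listed above and computes the median salary_in_usd per experience_level. It uses only the Python standard library. The sample rows, the category codes in them (such as EN, MI, SE, FT), and the helper names load_rows and median_usd_by are assumptions made for this example; they are not taken from the project's notebook or from the real dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Columns as listed in the dataset overview.
EXPECTED_COLUMNS = [
    "experience_level", "salary", "salary_in_usd", "employment_type",
    "job_title", "salary_currency", "employee_residence",
    "company_location", "company_size",
]

# Hypothetical rows for illustration only; NOT real data from the dataset.
SAMPLE_CSV = """experience_level,salary,salary_in_usd,employment_type,job_title,salary_currency,employee_residence,company_location,company_size
EN,50000,50000,FT,Data Analyst,USD,US,US,S
MI,60000,70000,FT,Data Scientist,EUR,DE,DE,M
MI,90000,90000,FT,Data Engineer,USD,US,US,L
SE,150000,150000,FT,Data Scientist,USD,US,US,L
"""


def load_rows(handle):
    """Read all rows as dicts and fail early if a listed column is missing."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def median_usd_by(rows, column):
    """Group rows by `column` and return the median salary_in_usd per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(float(row["salary_in_usd"]))
    return {key: median(values) for key, values in groups.items()}


rows = load_rows(io.StringIO(SAMPLE_CSV))
medians = median_usd_by(rows, "experience_level")
for level, value in sorted(medians.items()):
    print(level, value)
```

To run this against the real file, replace io.StringIO(SAMPLE_CSV) with an open file handle, for example open(path, newline=""), where path points to wherever the dataset is stored.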

Job Title Clustering

The analysis showed that many job titles in the dataset are close variants of one another. To simplify them and give a broader view of the data, a machine learning clustering model was used to group the job titles into seven major categories. The results are stored in the cluster_job_title column.
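The README does not say which clustering model was used, so the following is only a minimal sketch of the general idea. It assumes a plain k-means over binary bag-of-words vectors built from the title text, written with the standard library so it runs anywhere. The sample titles, the function names, and the choice of k = 3 for this tiny sample are all assumptions; the project itself reports seven clusters, and a notebook would more likely rely on a library implementation.

```python
def tokenize(title):
    """Lowercase a job title and split it into words."""
    return title.lower().split()


def vectorize(titles):
    """Turn each title into a binary bag-of-words vector over a shared vocabulary."""
    vocab = sorted({tok for t in titles for tok in tokenize(t)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for t in titles:
        vec = [0.0] * len(vocab)
        for tok in tokenize(t):
            vec[index[tok]] = 1.0
        vectors.append(vec)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with a deterministic start: the first k distinct vectors."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical titles for illustration only; NOT taken from the dataset.
titles = [
    "Data Scientist", "Senior Data Scientist", "Data Engineer",
    "Lead Data Engineer", "Machine Learning Engineer", "Data Analyst",
    "Data Scientist",
]
labels = kmeans(vectorize(titles), k=3)

# Mirror the README's column name: one cluster id per job title.
cluster_job_title = dict(zip(titles, labels))
for title, label in cluster_job_title.items():
    print(label, title)
```

Because the start is deterministic, identical titles always land in the same cluster, which is the property needed to write the labels back as a cluster_job_title column.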

Project Motivation

Exploring the field of data science is an exciting journey that offers endless opportunities for learning and growth. Through this project, I aim to showcase the power of EDA in extracting valuable insights from complex datasets. By utilizing machine learning techniques like clustering, we can simplify and categorize data, making it more accessible and meaningful.

Remember, data has stories to tell, and I am here to listen and decipher them.

Happy exploring!
