Welcome to the Data Science Job Salaries EDA project! This project applies exploratory data analysis (EDA) to a dataset of data science job salaries. The goal is to surface patterns in how pay varies across roles, experience levels, and companies, and to show how that kind of analysis can support data-driven decisions.
The dataset used in this project is called "Data Science Job Salaries." It consists of 606 rows and includes the following columns:
- experience_level: The level of experience required for the job.
- salary: The salary information in the original currency.
- salary_in_usd: The salary converted to US dollars.
- employment_type: The type of employment (e.g., full-time, part-time, contract).
- job_title: The original job title.
- salary_currency: The currency in which the salary is provided.
- employee_residence: The abbreviation of the employee's country of residence.
- company_location: The location of the company.
- company_size: The size of the company.
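As a starting point for working with these columns, here is a minimal sketch using pandas. It checks that a DataFrame has the columns listed above and computes the median salary_in_usd per group. The helper names (`missing_columns`, `median_salary_by`) are hypothetical, and the inline rows are made-up placeholders rather than values from the dataset; in practice you would load the real CSV with `pd.read_csv`.

```python
import pandas as pd

# Columns described in this README.
EXPECTED_COLUMNS = [
    "experience_level", "salary", "salary_in_usd", "employment_type",
    "job_title", "salary_currency", "employee_residence",
    "company_location", "company_size",
]


def missing_columns(df):
    """Return the expected columns that are absent from df."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


def median_salary_by(df, column):
    """Median salary_in_usd for each value of `column`, highest first."""
    return (
        df.groupby(column)["salary_in_usd"]
        .median()
        .sort_values(ascending=False)
    )


# Made-up placeholder rows for illustration only (NOT taken from the dataset).
sample = pd.DataFrame({
    "experience_level": ["senior", "junior", "senior"],
    "salary": [100000, 50000, 120000],
    "salary_in_usd": [100000, 50000, 120000],
    "employment_type": ["full-time", "full-time", "contract"],
    "job_title": ["Data Scientist", "Data Analyst", "ML Engineer"],
    "salary_currency": ["USD", "USD", "USD"],
    "employee_residence": ["US", "US", "US"],
    "company_location": ["US", "US", "US"],
    "company_size": ["M", "S", "L"],
})

print(missing_columns(sample))
print(median_salary_by(sample, "experience_level"))
```

The same `median_salary_by` call works for any of the categorical columns, for example company_size or employment_type.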
During the analysis, I found that many job titles were very similar to one another. To simplify them and get a broader view of the data, I used a machine learning clustering model to group the job titles into seven major categories. The resulting category for each row is stored in the cluster_job_title column.
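The clustering step could look roughly like the sketch below. This README does not say which algorithm or text features were used, so the choice of TF-IDF character n-grams with scikit-learn's KMeans is an assumption made for illustration, as are the helper name `cluster_job_titles` and the example titles. Only the number of clusters (seven) and the cluster_job_title column name come from the project.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_job_titles(titles, n_clusters=7, seed=0):
    """Map each distinct job title to a cluster id in [0, n_clusters).

    Assumption: TF-IDF over character n-grams plus k-means. The project's
    actual model may differ.
    """
    unique_titles = sorted(set(titles))
    # Character n-grams let titles that share word fragments end up close
    # together (e.g. "Data Engineer" and "Lead Data Engineer").
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    vectors = vectorizer.fit_transform(unique_titles)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(vectors)
    return {title: int(label) for title, label in zip(unique_titles, labels)}


# Hypothetical job titles for illustration only (not taken from the dataset).
example_titles = [
    "Data Scientist", "Senior Data Scientist", "Lead Data Scientist",
    "Data Engineer", "Lead Data Engineer", "Big Data Engineer",
    "Data Analyst", "Business Data Analyst", "Machine Learning Engineer",
    "ML Engineer", "Research Scientist", "Head of Data",
    "Data Architect", "Computer Vision Engineer",
]

df = pd.DataFrame({"job_title": example_titles})
title_to_cluster = cluster_job_titles(df["job_title"])
df["cluster_job_title"] = df["job_title"].map(title_to_cluster)
print(df.sort_values("cluster_job_title"))
```

The cluster ids are arbitrary integers. To turn them into readable categories, inspect the titles in each cluster and assign a name by hand.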
Exploring data science is an exciting journey with plenty of room for learning and growth. Through this project, I want to show how EDA can extract useful insights from a messy dataset, and how machine learning techniques like clustering can simplify and categorize data so that it is easier to interpret.
Remember, data has stories to tell, and I am here to listen and decipher them.
Happy exploring!