Skip to content

In this project, I embarked on an exploratory journey through a dataset of U.S. medical insurance costs, aiming to uncover insights into the factors that influence these costs.

Notifications You must be signed in to change notification settings

Mr-Mens/Medical-Insurance-Data-Analysis-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 

Repository files navigation

U.S. Medical Insurance Costs Analysis Project README

Introduction

In this project, I embarked on an exploratory journey through a dataset of U.S. medical insurance costs, aiming to uncover insights into the factors that influence these costs. Utilizing the Python programming language and its csv library, I analyzed various aspects of the data, including patient demographics, smoking status, region, and the number of children. My primary objectives were to understand the average age of the patients, regional distribution, the impact of smoking on insurance costs, and the average age of patients with at least one child.

Data Preparation

The first step in my analysis involved loading the data from a CSV file named insurance.csv. I created lists for each attribute I intended to analyze: ages, sexes, BMIs, number of children, smoker status, regions, and insurance charges. Using the load_list_info function, I populated these lists with data from the corresponding columns in the CSV file. This function streamlined the process of extracting information and set the stage for a more detailed analysis.

Analysis Overview

With the data successfully loaded into Python lists, I proceeded to delve into the analysis phase of the project. Here's a summary of the analyses I conducted:

1. Average Age of Patients

I calculated the average age of the patients in the dataset to get a sense of the demographic we're dealing with. This involved summing all the ages and dividing by the total number of patients.

2. Regional Analysis

I analyzed the distribution of patients across different regions to identify where the majority of individuals are from. This required identifying unique regions in the dataset and counting occurrences.

3. Smoking vs. Non-Smoking Costs

I compared the insurance costs between smokers and non-smokers to evaluate the financial impact of smoking on medical insurance. This involved segmenting the data by smoking status and calculating average costs for each group.

4. Average Age of Patients with Children

Finally, I determined the average age of patients who have at least one child. This analysis aimed to explore how having dependents might relate to the age of the policyholders.

Insights and Findings

Through the implementation of the PatientsInfo class and its methods, I systematically approached each analysis goal. The class's methods enabled me to efficiently analyze ages, distinguish sexes, identify unique regions, calculate average charges, and create a comprehensive dictionary of all patient information. This dictionary became a versatile tool for further exploration and analysis.

Some key insights from the project include:

  • The average age of patients in the dataset, providing a window into the age distribution of those seeking medical insurance.
  • The regional distribution of patients, highlighting areas with higher concentrations of insured individuals.
  • A clear disparity in insurance costs between smokers and non-smokers, underscoring the financial consequences of smoking.
  • The average age of patients with at least one child, offering perspective on the demographic of policyholders with dependents.

Conclusion

This project was not just an exercise in data manipulation but an enlightening exploration into the factors affecting U.S. medical insurance costs. By applying Python programming skills and analytical thinking, I uncovered valuable insights that could inform both individuals and policymakers about the dynamics of medical insurance pricing. The findings underscore the significance of lifestyle choices, demographic factors, and regional distribution in the cost of healthcare.

About

In this project, I embarked on an exploratory journey through a dataset of U.S. medical insurance costs, aiming to uncover insights into the factors that influence these costs.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published