This analysis is part of our statistics course project. This chapter covers goodness of fit, the test for independence, and contingency tables with Yates' correction. These concepts are essential for examining relationships in data and validating statistical models.

Arjun-08/STATISTICS-UNIT.04

STATISTICS-UNIT.04: Categorical Data Analysis and Goodness of Fit Tests

Overview of Statistics - Unit 04

The analysis was conducted under the guidance of Dr. Ramesh Athe, Assistant Professor of DS&IS at IIIT Dharwad. In our statistics course project, each team member focused on a specific chapter. I worked on Chapter 4, which covers goodness of fit, the test for independence, and contingency tables with Yates' correction. These concepts are fundamental for analyzing relationships in data and assessing model validity. Working through these topics builds practical skills in data analysis and interpretation using Python.

Dataset Description:

The dataset comprises various attributes related to football players, including player ID, nationality, squad, birth year, value, height, position, preferred foot, league, and more.

Analysis:

The case study applies goodness-of-fit tests and tests for independence to the dataset. Python libraries such as Pandas, NumPy, and Matplotlib are used for data manipulation, visualization, and statistical analysis.

This Jupyter Notebook contains Python code for conducting categorical data analysis and performing goodness of fit tests on a dataset related to football players. The notebook includes the following sections:

  1. Import Functions: This section imports necessary Python libraries such as NumPy, Pandas, Matplotlib, and more. It also reads the dataset from a CSV file hosted on GitHub.

  2. Goodness of Fit: In this part, a hypothesis test is conducted to determine whether goal scoring in 2018 shows a preference among four different positions. The null and alternative hypotheses are defined, and the observed values are compared with the expected values. A chi-square test statistic is calculated, and at the chosen significance level (α=0.05) the null hypothesis is either rejected or not rejected.

  3. Test for Independence: Here, another hypothesis test is performed to assess if the goal-scoring patterns in 2018 and 2019 are independent of the players' positions. The observed and expected values are computed, and a chi-square test statistic is calculated to determine whether the two variables are independent or not.

  4. 2x2 Contingency Table and Yates' Correction for Continuity: This section tests the claim that the proportion of players who received yellow cards is equal in two different leagues. The data are arranged in a 2x2 contingency table, and a chi-square test statistic is computed using Yates' correction for continuity. Based on the test result and the significance level (α=0.05), the null hypothesis is either rejected or not rejected.
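
As a rough illustration of the goodness-of-fit step in item 2, the sketch below computes the chi-square statistic by hand with NumPy. The goal counts are made up for illustration and are not taken from the dataset, and the variable names are hypothetical. It also assumes the null hypothesis is "goals are spread evenly across the four positions", which the notebook may define differently.

```python
import numpy as np

# Hypothetical goal counts for four positions (illustrative only,
# NOT taken from the football dataset).
observed = np.array([30, 20, 25, 25], dtype=float)

# Assumed H0: no preference, so goals are spread evenly across positions.
expected = np.full_like(observed, observed.sum() / observed.size)

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi2_stat = float(((observed - expected) ** 2 / expected).sum())

# Degrees of freedom: number of categories minus one.
dof = observed.size - 1

# Tabulated chi-square critical value for df = 3 at alpha = 0.05.
critical_value = 7.815

reject_h0 = chi2_stat > critical_value
print(f"chi2 = {chi2_stat:.3f}, df = {dof}, reject H0: {reject_h0}")
```

When the statistic does not exceed the critical value, the conclusion is that the data give no evidence against an even distribution, not that the even distribution has been proven.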

For each analysis, hypotheses are formulated and tested using appropriate statistical methods, with significance levels and conclusions drawn accordingly.
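
The test for independence and the 2x2 test with Yates' correction share the same calculation, so both can be sketched with one helper. The function name `chi_square_independence` and all counts below are hypothetical and chosen only for illustration; the notebook's own code and data may differ. Expected counts are row total times column total divided by the grand total, and the correction subtracts 0.5 from each absolute difference before squaring.

```python
import numpy as np

def chi_square_independence(table, yates=False):
    """Return (statistic, dof, expected) for an r x c table of counts."""
    observed = np.asarray(table, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected count under independence: row total * column total / grand total.
    expected = row_totals @ col_totals / observed.sum()
    diff = np.abs(observed - expected)
    if yates:
        # Yates' continuity correction: shrink |O - E| by 0.5, never below 0.
        diff = np.maximum(diff - 0.5, 0.0)
    statistic = float((diff ** 2 / expected).sum())
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return statistic, dof, expected

# Hypothetical goals by year (rows: 2018, 2019) and position (4 columns).
goals = [[20, 30, 25, 25],
         [30, 20, 25, 25]]
ind_stat, ind_dof, _ = chi_square_independence(goals)
# Tabulated critical value for df = 3 at alpha = 0.05 is 7.815.
print(f"independence: chi2 = {ind_stat:.3f}, df = {ind_dof}, "
      f"reject H0: {ind_stat > 7.815}")

# Hypothetical 2x2 table: rows are two leagues, columns are
# players with / without a yellow card.
cards = [[30, 70],
         [20, 80]]
yates_stat, yates_dof, _ = chi_square_independence(cards, yates=True)
# Tabulated critical value for df = 1 at alpha = 0.05 is 3.841.
print(f"2x2 with Yates: chi2 = {yates_stat:.3f}, df = {yates_dof}, "
      f"reject H0: {yates_stat > 3.841}")
```

The correction only applies to the 2x2 case (one degree of freedom), where it makes the test more conservative; for larger tables the helper is called with `yates=False`.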

Disclaimer:

The dataset is provided for educational and analytical purposes only. Any conclusions drawn from the analysis are based on statistical interpretations and may not represent real-world scenarios accurately.

Contact

If you have any questions or suggestions, please feel free to reach out to me at [email protected].
