- This project was completed by Alex Mak, Zheng En Than, Emily Au, Tony Yuan, and Tahya Weiss-Gibbons between November 2023 and January 2024.
- It began as a group statistical analysis project for the STAT 537 (Statistical Methods for Applied Research II) class at the University of Alberta in December 2023.
- Alex Mak later extended the project by adding data visualizations built in Tableau and by using extensions of the linear regression model (LASSO and ridge regression) to perform variable selection for the final linear model.
- The Data folder stores two files: input.csv and processedInput.csv.
- input.csv is the original dataset, obtained from the UC Irvine Machine Learning Repository (https://archive.ics.uci.edu/dataset/360/air+quality).
- processedInput.csv is the dataset produced by the data processing performed in this project.
- The Visualizations folder stores two main types of files.
- Visualuizations.twb is the Tableau workbook that builds all of the visualizations for this project.
- The remaining files, whose names start with a number, are the visualizations generated from Visualuizations.twb.
- Codebase.R is the R script in which all of the project's tasks (exploratory data analysis, statistical analysis, and inference) are executed.
- Readme.md is the file you are reading now :)