This repository contains the Jupyter notebooks for our analysis of the TMDB 5000 Movies dataset.
Report: Link
main.ipynb: Exploratory data analysis of the trends and relationships between attributes in the dataset.
KNN_Regression.ipynb: KNN regression on movie revenue.
KNN_Classification.ipynb: KNN classification of movie revenue, compared across different bucket sizes.
Overfitting.ipynb: An examination of overfitting and underfitting in KNN regression on movie revenue.
Irrelevant.ipynb: Hypothesis testing for the significance of attributes deemed irrelevant.
Autocorrelation.ipynb: Hypothesis testing for spatial autocorrelation in the number of movies and the average rating.
OTT.ipynb: Tests the availability of movies on different OTT platforms using self-defined metrics.
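
As a rough illustration of the idea behind KNN_Regression.ipynb and Overfitting.ipynb, here is a minimal pure-Python sketch. It is not the notebooks' code: the toy data, the feature choice (budget, popularity), and the helper names `knn_predict` and `mean_squared_error` are all assumptions made up for this example. It predicts a value as the mean target of the k nearest training points, then compares training error at the two extremes of k.

```python
import math


def knn_predict(train_X, train_y, query, k):
    """Predict a value for `query` as the mean target of its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mean_squared_error(train_X, train_y, eval_X, eval_y, k):
    """Mean squared error of KNN predictions on (eval_X, eval_y)."""
    errors = [
        (knn_predict(train_X, train_y, x, k) - y) ** 2
        for x, y in zip(eval_X, eval_y)
    ]
    return sum(errors) / len(errors)


# Hypothetical toy data: (budget, popularity) -> revenue, in arbitrary units.
train_X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]
train_y = [2.0, 3.0, 5.0, 9.0, 14.0, 13.0]

# Prediction for an unseen point using its three nearest neighbours.
prediction = knn_predict(train_X, train_y, (2.0, 2.0), k=3)

# k=1 memorises the training set (each point is its own nearest neighbour),
# while k=len(train_X) always predicts the global mean.
train_mse_k1 = mean_squared_error(train_X, train_y, train_X, train_y, k=1)
train_mse_k6 = mean_squared_error(train_X, train_y, train_X, train_y, k=len(train_X))

print(prediction, train_mse_k1, train_mse_k6)
```

With k=1 the training error is zero because every training point is its own nearest neighbour, which is the overfitting extreme. With k equal to the training set size the model ignores the features entirely, which is the underfitting extreme. Evaluating on held-out data across intermediate values of k is the usual way to choose between them.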