This project aims to make a quantitative analysis of the New York City Taxi and Limousine Commission (TLC) Trip Record Data.

greysonchung/New-York-Taxi-Data-Analysis

MAST30034 Exploratory Analysis of New York City Taxi

As one of the most populous cities in the United States, New York City witnesses millions of taxi trips every month. This project aims to conduct a quantitative analysis of the New York City Taxi and Limousine Commission (TLC) trip record data to gain a better understanding of it. Additionally, we aim to provide recommendations that might improve taxi drivers' income.

Dependencies

  • Language: Python 3.8.8
  • Python Packages / Libraries: pandas, geopandas, numpy, matplotlib, seaborn, scipy, sklearn, statsmodels, contextily
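
Before running the notebooks, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages listed under Dependencies.
# Assumption: each package is importable under the name shown here.
REQUIRED = [
    "pandas", "geopandas", "numpy", "matplotlib", "seaborn",
    "scipy", "sklearn", "statsmodels", "contextily",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any name reported as missing would need to be installed before the notebooks can run.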

Datasets

Directory

  • raw_data: Contains all the raw data files. Added to .gitignore
  • preprocessed_data: Contains all the preprocessed data files. Added to .gitignore
  • plots: Contains all visualisation plots for the project.
  • deprecated: Contains old code that is no longer used.
  • code: Contains notebooks for Preprocessing, Visualisation, and Modelling.
    • download.ipynb for "Downloading" trip record datasets.
    • preprocessing.ipynb for "Preprocessing" and "Exploratory Data Analysis".
    • visualisation.ipynb for "Analysis and Visualisation".
    • modelling.ipynb for "Statistical Modelling".
  • To reproduce the results, download all the datasets and run each notebook in the order listed above.
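
The reproduction step can also be scripted rather than run by hand. The sketch below assumes that Jupyter and `nbconvert` are installed (neither is listed under Dependencies) and that the notebooks resolve their data paths relative to the `code` directory. The function names `build_command` and `run_all` are hypothetical and do not exist in the repository.

```python
import subprocess
from pathlib import Path

# Notebook names as given in the directory listing, in pipeline order.
NOTEBOOKS = [
    "download.ipynb",
    "preprocessing.ipynb",
    "visualisation.ipynb",
    "modelling.ipynb",
]


def build_command(notebook):
    """Build the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(notebook),
    ]


def run_all(code_dir="code"):
    """Execute each notebook in order, stopping at the first failure."""
    for name in NOTEBOOKS:
        subprocess.run(build_command(Path(code_dir) / name), check=True)
```

Calling `run_all()` from the repository root would execute the four notebooks in sequence. Because `check=True` raises on a non-zero exit code, a failure in an early notebook (for example, a failed download) stops the run before later notebooks read incomplete data.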
