- The listenings_genre_pyspark.ipynb Jupyter notebook contains a data analysis of genre and song datasets using PySpark. Operations such as cleansing, filtering, aggregation, and visualization are applied to explore the datasets.
- The call_detail_record_pyspark.ipynb notebook contains a data analysis using Spark SQL and aggregate functions, with matplotlib visualizations, on a dataset of hourly phone call, SMS, and Internet activity. The dataset can be downloaded from https://www.kaggle.com/marcodena/mobile-phone-activity
praveen-gopal-reddy/Data_Analysis_with_Pyspark
Data analysis using Pyspark sql and aggregate functions.