Collection of data analysis and data engineering projects

sarathchandrikak/Data-Projects

Data-Engineering, Data-Analysis Projects

Projects-Portfolio

Welcome to my data portfolio! Here are projects built with Python, SQL, R, PySpark, and Tableau.

📚 Table of Contents

Data Engineering

| Project Link | Tools | Project Description |
| --- | --- | --- |
| dbt airflow pipeline | Python, Snowflake, Airflow, dbt | Developed a data pipeline to build fact and dimension tables for the Snowflake database snowflake-sample-data |
| 🏦 Spar Nord Bank Transaction | Python, AWS, EC2, Redshift, Sqoop | Developed a batch ETL pipeline to read transactional data from RDS, transform it, and load it into target dimension and fact tables on a Redshift data mart |

DE System Design

| Project Link | Project Description | Components Designed |
| --- | --- | --- |

Python PySpark

| Project Link | Area | Project Description | Libraries |
| --- | --- | --- | --- |
| 👩🏻‍💻 Absenteeism Analysis | Programming | Analysis of the reasons behind employee absenteeism and the probability of maximum absenteeism under various conditions | pandas, numpy, scikit-learn |
| 📺 NYC Airbnb Analysis | Data Wrangling & EDA | Analysis of multiple files to compare Airbnb prices across NYC | pandas |

SQL

| Project Link | Area of Analysis | Project Description |
| --- | --- | --- |
| 🛍 Serious SQL | Data Analysis Queries | SQL apprenticeship work |
| 👩‍💼 Employee Info | Employee Info Analysis | Analysis of an employee database implementing all the concepts of SQL |
| 🎦 IMDB Movie | Data cleaning, transformation, analysis | Analysis of the past 3 years of movie data for RSVP, an Indian film production company, to guide a movie release for a global audience |

EDA, Data Visualisation, Data Analysis

| Project Link | Project Description |
| --- | --- |
| Netflix Dashboard | Netflix Movies and TV Show Analysis: analysis of Netflix data on recent movies and TV shows |
| IT Technical Issues Dashboard | IT Ticket Info Analysis: detailed analysis of tickets booked and tickets resolved on time |