abdkumar / spotify-stream-analytics
Generates a synthetic Spotify music stream dataset for building dashboards. The Spotify API is used to generate fake event data, which is emitted to Kafka. Spark consumes and processes the Kafka data and saves it to the data lake. Airflow orchestrates the pipeline. dbt moves the data to Snowflake, transforms it, and creates dashboards.
☆66 Updated 11 months ago
Related projects
Alternatives and complementary repositories for spotify-stream-analytics
- ☆130 Updated 2 years ago
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆241 Updated 4 months ago
- Sample project to demonstrate data engineering best practices ☆166 Updated 8 months ago
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews ☆70 Updated 6 months ago
- End-to-end data engineering project ☆51 Updated 2 years ago
- ☆128 Updated last year
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. ☆197 Updated last year
- Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more ☆294 Updated 11 months ago
- The resources of the preparation course for Databricks Data Engineer Associate certification exam ☆280 Updated 3 months ago
- Code for "Efficient Data Processing in Spark" Course ☆245 Updated last month
- This is a template you can use for your next data engineering portfolio project. ☆163 Updated 3 years ago
- This repo contains "Databricks Certified Data Engineer Associate" Questions and related docs. ☆88 Updated 3 months ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke… ☆18 Updated 2 months ago
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆133 Updated 4 years ago
- ☆113 Updated last month
- The resources of the preparation course for Databricks Data Engineer Professional certification exam ☆86 Updated last month
- ☆40 Updated 10 months ago
- Code for "Advanced data transformations in SQL" free live workshop ☆65 Updated 3 weeks ago
- Open Source LeetCode for PySpark, Spark, Pandas and dbt/Snowflake ☆106 Updated this week
- A project for demonstrating knowledge of data engineering tools and concepts, and for learning in the process ☆45 Updated last year
- Git Repository ☆131 Updated last year
- Supplementary Materials for The Complete dbt (Data Build Tool) Bootcamp Udemy course ☆458 Updated last month
- Ravi Azure ADB ADF Repository ☆64 Updated 6 months ago
- Near-real-time ETL to populate a dashboard. ☆70 Updated 5 months ago
- This repository will help you learn Databricks concepts through examples. It will include all the important topics which… ☆92 Updated 3 months ago
- ☆21 Updated 7 months ago
- ☆51 Updated 11 months ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. ☆67 Updated 3 months ago
- Simple stream processing pipeline ☆92 Updated 5 months ago
- ☆90 Updated 2 years ago