abdkumar / spotify-stream-analytics
Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consumes and processes Kafka data, saving it to the Datalake. Airflow orchestrates the pipeline. dbt moves data to Snowflake, transforms it, and creates dashboards.
☆68 Updated last year
Alternatives and similar repositories for spotify-stream-analytics
Users who are interested in spotify-stream-analytics are comparing it to the repositories listed below:
- Code for "Efficient Data Processing in Spark" Course ☆319 Updated last month
- ☆141 Updated 2 years ago
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆265 Updated 11 months ago
- Sample project to demonstrate data engineering best practices ☆194 Updated last year
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboa… ☆231 Updated 2 years ago
- ☆151 Updated 3 years ago
- ☆133 Updated 4 months ago
- Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more ☆349 Updated last year
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews ☆150 Updated last year
- Code for dbt tutorial ☆156 Updated 3 weeks ago
- This is a template you can use for your next data engineering portfolio project. ☆177 Updated 3 years ago
- Simple stream processing pipeline ☆102 Updated last year
- Local Environment to Practice Data Engineering ☆142 Updated 6 months ago
- This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs. ☆83 Updated 10 months ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke… ☆20 Updated 9 months ago
- End to end data engineering project ☆57 Updated 2 years ago
- ☆353 Updated 5 months ago
- Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; K… ☆66 Updated last week
- Code for "Advanced data transformations in SQL" free live workshop ☆82 Updated last month
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆79 Updated last year
- Project for "Data pipeline design patterns" blog. ☆45 Updated 10 months ago
- In this repository we store all materials for dlt workshops, courses, etc. ☆195 Updated this week
- Code snippets for Data Engineering Design Patterns book ☆122 Updated 3 months ago
- Main repository to collect notes and scripts written during DataExpert.IO January 2025 bootcamp to help anyone interested. ☆25 Updated 2 months ago
- Sample repo for startdataengineering DE 101 free course ☆64 Updated last year
- This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksCl… ☆99 Updated 10 months ago
- ☆51 Updated last year
- Step-by-step tutorial on building a Kimball dimensional model with dbt ☆143 Updated 11 months ago
- Apartments Data Pipeline using Airflow and Spark. ☆21 Updated 3 years ago
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process ☆46 Updated 2 years ago