Generates a synthetic Spotify music-stream dataset for building dashboards. Fake event data generated from the Spotify API is emitted to Kafka. Spark consumes and processes the Kafka data and saves it to the data lake. Airflow orchestrates the pipeline. dbt moves the data to Snowflake, transforms it, and creates dashboards.
☆71 · Dec 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for spotify-stream-analytics
Users that are interested in spotify-stream-analytics are comparing it to the libraries listed below.
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke… ☆20 · Aug 12, 2025 · Updated 7 months ago
- ☆14 · Mar 16, 2026 · Updated 3 weeks ago
- Simple project using pyflink, kafka and postgre containerized using Docker ☆11 · Aug 26, 2024 · Updated last year
- This Power BI project provides insights into customer orders and product tracking using interactive dashboards. It visualizes order statu… ☆10 · Aug 15, 2025 · Updated 7 months ago
- This repository contains notebooks, homework, projects and notes done during Machine Learning Zoomcamp course. ☆13 · Nov 13, 2024 · Updated last year
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process ☆44 · Dec 1, 2022 · Updated 3 years ago
- ☆11 · Aug 10, 2023 · Updated 2 years ago
- Local development environment for python data projects, with Docker ☆23 · Dec 14, 2022 · Updated 3 years ago
- ☆17 · Apr 19, 2024 · Updated last year
- ☆12 · Sep 9, 2023 · Updated 2 years ago
- A custom end-to-end analytics platform for customer churn ☆11 · May 15, 2025 · Updated 10 months ago
- End-to-end data platform leveraging the Modern data stack ☆52 · Apr 10, 2024 · Updated last year
- Here I will be exploring various tools and methods that are used in data engineering process with Python. ☆21 · Jan 4, 2021 · Updated 5 years ago
- Code test for data engineering candidates ☆47 · Mar 27, 2024 · Updated 2 years ago
- Candace's Data Engineering Zoomcamp files and notes ☆18 · Jul 4, 2023 · Updated 2 years ago
- This repo consists of all important concepts for data engineers. ☆11 · Dec 24, 2024 · Updated last year
- A command line client builder that follows the Canonical's Guidelines for a Command Line Interface. ☆15 · Updated this week
- ☆12 · Aug 28, 2024 · Updated last year
- An AWS Data Engineering End-to-End Project (Glue, Lambda, Kinesis, Redshift, QuickSight, Athena, EC2, S3) ☆16 · Sep 20, 2023 · Updated 2 years ago
- An open and introductory book for the Python API of Apache Spark (pyspark) 📚📖 ☆12 · Sep 19, 2025 · Updated 6 months ago
- This project demonstrates how to build and automate an ETL pipeline written in Python and schedule it using open source Apache Airflow or… ☆20 · Aug 21, 2025 · Updated 7 months ago
- Code for my "Efficient Data Processing in SQL" book. ☆61 · Aug 6, 2024 · Updated last year
- AWS LocalStack + Spark Cluster + Zeppelin [Docker] ☆10 · Jul 6, 2022 · Updated 3 years ago
- A sphinx extension for adding pyscript to a page ☆15 · Updated this week
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog… ☆13 · Jun 26, 2022 · Updated 3 years ago
- Data Engineer Project: An end-to-end Airflow data pipeline with BigQuery, dbt Soda, and more! ☆12 · Dec 14, 2023 · Updated 2 years ago
- Scripts used to setup a Spark cluster on EC2 ☆21 · Mar 24, 2016 · Updated 10 years ago
- Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions ☆11 · Feb 20, 2022 · Updated 4 years ago
- This repo via a real world use case, shows how to launch dbt models from a DAG in Apache Airflow. ☆14 · Apr 24, 2025 · Updated 11 months ago
- ☆21 · Nov 4, 2023 · Updated 2 years ago
- ☆22 · Feb 5, 2024 · Updated 2 years ago
- Building Recommender System with the Two-Tower Architecture ☆17 · Aug 10, 2021 · Updated 4 years ago
- ☆21 · Oct 21, 2024 · Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 · Sep 7, 2023 · Updated 2 years ago
- A data and analytics engineering platform designed for real-time sports betting analytics. ☆49 · Mar 21, 2025 · Updated last year
- ☆16 · May 29, 2023 · Updated 2 years ago
- ☆13 · Jul 8, 2024 · Updated last year
- ☆16 · Feb 17, 2020 · Updated 6 years ago
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboa… ☆267 · Jan 1, 2023 · Updated 3 years ago