Generates a synthetic Spotify music-stream dataset for building dashboards. A Spotify API generates fake event data, which is emitted to Kafka. Spark consumes and processes the Kafka data and saves it to the data lake. Airflow orchestrates the pipeline. dbt moves the data to Snowflake, transforms it, and creates dashboards.
☆72Dec 17, 2023Updated 2 years ago
Alternatives and similar repositories for spotify-stream-analytics
Users that are interested in spotify-stream-analytics are comparing it to the libraries listed below.
- ☆12May 27, 2024Updated 2 years ago
- ⚙️ Airflow data pipeline with Terraform, GCP BigQuery, dbt, Soda and Looker Studio.☆26Oct 19, 2023Updated 2 years ago
- Streaming analytics project with eventsim and Kafka☆13Dec 23, 2022Updated 3 years ago
- This repository contains the capstone project carried out as part of the Machine Learning Zoomcamp course☆10Dec 26, 2022Updated 3 years ago
- Public data and analytics for our open course☆34Mar 22, 2024Updated 2 years ago
- This Power BI project provides insights into customer orders and product tracking using interactive dashboards. It visualizes order statu…☆10Aug 15, 2025Updated 10 months ago
- This repository contains notebooks, homework, projects and notes done during the Machine Learning Zoomcamp course.☆13Nov 13, 2024Updated last year
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process☆44Dec 1, 2022Updated 3 years ago
- Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision Language Models☆21Oct 12, 2025Updated 8 months ago
- Local development environment for python data projects, with Docker☆23Dec 14, 2022Updated 3 years ago
- Testing Spark Structured Streaming and Kafka with real data from traffic sensors☆17Nov 11, 2022Updated 3 years ago
- ☆17Apr 19, 2024Updated 2 years ago
- A custom end-to-end analytics platform for customer churn☆10May 15, 2025Updated last year
- This is a capstone project associated with MLOps Zoomcamp. The end goal of the project is to build an end-to-end machine learning projec…☆13Sep 8, 2022Updated 3 years ago
- The goal of this project is to build an ETL pipeline. The data would be processed as a batch (monthly) between 2018-01 and 2021-02.☆14Mar 26, 2022Updated 4 years ago
- A series documenting lessons learned with Apache Spark (PySpark), with quick tips and workarounds for everyday problems☆55Sep 30, 2023Updated 2 years ago
- Here I will be exploring various tools and methods that are used in data engineering process with Python.☆21Jan 4, 2021Updated 5 years ago
- Using Apache Spark SQL, Spark ML, Pandas to analyse and predict using the Chicago crime dataset☆10Apr 6, 2018Updated 8 years ago
- 🤖 An autonomous AI agent system that collaboratively designs, implements, and manages Apache Airflow DAGs through natural language inter…☆28Aug 6, 2025Updated 10 months ago
- Spark application to consume kafka events generated by a python producer.☆12Aug 7, 2021Updated 4 years ago
- Candace's Data Engineering Zoomcamp files and notes☆18Jul 4, 2023Updated 2 years ago
- Admin dashboard for monitoring, maintaining, and extending web crawlers: Flask + Nginx + Docker☆15Dec 8, 2022Updated 3 years ago
- This repo consists of all important concepts for data engineers.☆11Jun 2, 2026Updated 3 weeks ago
- Sentiment analysis of Naver Shopping review data (GRU, LSTM)☆10Sep 27, 2021Updated 4 years ago
- An AWS Data Engineering End-to-End Project (Glue, Lambda, Kinesis, Redshift, QuickSight, Athena, EC2, S3)☆16Sep 20, 2023Updated 2 years ago
- ☆14Apr 28, 2022Updated 4 years ago
- An open and introductory book for the Python API of Apache Spark (pyspark) 📚📖☆12Sep 19, 2025Updated 9 months ago
- Data Augmentation with Python, published by Packt☆37Oct 28, 2024Updated last year
- For understanding and learning a little about Apache Kafka. https://www.youtube.com/channel/UC3pevgVzUWKo5CoWdhDsoHw ☆14Mar 10, 2026Updated 3 months ago
- A sphinx extension for adding pyscript to a page☆15Jun 22, 2026Updated last week
- 📦 Starting box for Vagrant. Inside box Ubuntu 20.04 LTS with Git, Docker and Docker compose.☆19May 5, 2022Updated 4 years ago
- AWS LocalStack + Spark Cluster + Zeppelin [Docker]☆10Jul 6, 2022Updated 3 years ago
- GitHub Actions to Validate DAGs, Variables and Dependencies upon Pull Request☆23Mar 5, 2026Updated 3 months ago
- ☆15Oct 19, 2023Updated 2 years ago
- (Python, PySpark)☆11Nov 15, 2020Updated 5 years ago
- Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions☆11Feb 20, 2022Updated 4 years ago
- Set of Jupyter notebooks demonstrating Learning to Rank integrated with Solr and Elasticsearch☆17Jun 19, 2022Updated 4 years ago
- ☆21Nov 4, 2023Updated 2 years ago
- ☆23Feb 5, 2024Updated 2 years ago