Generates a synthetic Spotify music-stream dataset for building dashboards. The Spotify API generates fake event data, which is emitted to Kafka. Spark consumes and processes the Kafka data and saves it to the data lake. Airflow orchestrates the pipeline. dbt moves the data to Snowflake, transforms it, and creates dashboards.
☆71Dec 17, 2023Updated 2 years ago
Alternatives and similar repositories for spotify-stream-analytics
Users who are interested in spotify-stream-analytics are comparing it to the libraries listed below
- ☆12May 27, 2024Updated last year
- Simple project using PyFlink, Kafka, and PostgreSQL, containerized using Docker☆11Aug 26, 2024Updated last year
- Building Recommender System with the Two-Tower Architecture☆17Aug 10, 2021Updated 4 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆20Aug 12, 2025Updated 6 months ago
- Here I will be exploring various tools and methods that are used in the data engineering process with Python.☆21Jan 4, 2021Updated 5 years ago
- ⚙️ Airflow data pipeline with Terraform, GCP BigQuery, dbt, Soda and Looker Studio.☆24Oct 19, 2023Updated 2 years ago
- Local development environment for python data projects, with Docker☆23Dec 14, 2022Updated 3 years ago
- Data Augmentation with Python, published by Packt☆37Oct 28, 2024Updated last year
- 📦 Starting box for Vagrant. Inside box Ubuntu 20.04 LTS with Git, Docker and Docker compose.☆19May 5, 2022Updated 3 years ago
- A data and analytics engineering platform designed for real-time sports betting analytics.☆48Mar 21, 2025Updated 11 months ago
- DevOps☆16May 17, 2021Updated 4 years ago
- ☆14Nov 10, 2025Updated 3 months ago
- Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation. In Recsys23.☆11Jul 18, 2023Updated 2 years ago
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog…☆13Jun 26, 2022Updated 3 years ago
- Deep Learning for Computer Vision Practitioner Bundle examples and exercises☆11Apr 23, 2019Updated 6 years ago
- Aspect based sentiment analysis aims to detect an aspect (i.e. features) in a given text and then perform sentiment analysis of the text …☆10Nov 15, 2021Updated 4 years ago
- Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions☆11Feb 20, 2022Updated 4 years ago
- This project is mainly for learning and practicing simple HIVE commands in real time scenarios. Here we have taken some sample coffee sho…☆11Mar 1, 2018Updated 8 years ago
- This project provides sample code to implement API GW GraphQL APIs (sample code uses the Python GraphQL library Graphene) in Lambda functi…☆10May 3, 2021Updated 4 years ago
- Hadoop Examples☆10Jul 1, 2022Updated 3 years ago
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process☆48Dec 1, 2022Updated 3 years ago
- A command line client builder that follows Canonical's Guidelines for a Command Line Interface.☆15Updated this week
- Public demos using the Cohere platform!☆11May 24, 2023Updated 2 years ago
- Ansible Playbook to create LAMP in CentOS 7 with Apache, MySQL, PHP.☆10Dec 28, 2018Updated 7 years ago
- AlvinToh Learning Repository for The Ultimate Hands-On Hadoop - Tame your Big Data!☆10May 23, 2018Updated 7 years ago
- Little wrapper to query tips from the command line☆13Jun 16, 2022Updated 3 years ago
- Data Engineer Project: An end-to-end Airflow data pipeline with BigQuery, dbt, Soda, and more!☆11Dec 14, 2023Updated 2 years ago
- Source code for the module "Advanced Statistics" 📊☆10Feb 25, 2019Updated 7 years ago
- Some experiments on transformer models☆11Feb 9, 2024Updated 2 years ago
- Transactional Machine Learning using Data Streams and AutoML☆14Oct 5, 2025Updated 5 months ago
- An end to end ML project. Using MLflow for experiment tracking and model registry. Prefect for workflow orchestration. S3 for artifacts s…☆12Sep 11, 2022Updated 3 years ago
- Dashboard showcasing Conjoint Analysis for the Electric Vehicle Lease Market (as at January 2020) in San Francisco☆15Feb 19, 2020Updated 6 years ago
- Small data engineering tutorial☆10Oct 24, 2018Updated 7 years ago
- To understand and learn a little about Apache Kafka. https://www.youtube.com/channel/UC3pevgVzUWKo5CoWdhDsoHw ☆13Jan 8, 2026Updated last month
- ☆11Feb 29, 2024Updated 2 years ago
- Discover the perfect harmony of tunes and movies!☆10Aug 17, 2023Updated 2 years ago
- 3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow☆12Aug 17, 2019Updated 6 years ago
- ☆17Sep 8, 2025Updated 5 months ago
- Word embeddings learned from 10-K documents☆11Nov 6, 2019Updated 6 years ago