A data engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data comes from Kaggle and the YouTube API.
☆23Nov 19, 2024Updated last year
Alternatives and similar repositories for Youtube-Recommend-Master-ETL-Pipeline
Users that are interested in Youtube-Recommend-Master-ETL-Pipeline are comparing it to the libraries listed below.
- This project implements an ELT (Extract - Load - Transform) data pipeline with the Goodreads dataset, using Dagster (orchestration), spar…☆43Apr 22, 2023Updated 3 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆20Aug 12, 2025Updated 8 months ago
- End-to-end ELT data engineering project☆23Dec 24, 2022Updated 3 years ago
- ☆30Feb 11, 2024Updated 2 years ago
- Scan and monitor your network effortlessly! Nmap Prometheus Exporter provides insights into network health and security with Prometheus-c…☆15Oct 2, 2023Updated 2 years ago
- A simple demo showing how to use Ably and FastAPI to route messages into Kafka for stream processing☆16Oct 12, 2021Updated 4 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project☆24Jul 14, 2022Updated 3 years ago
- Simple ETL pipeline using Python☆29May 22, 2023Updated 2 years ago
- Data Guy Story commandline☆11Dec 2, 2022Updated 3 years ago
- My Setup Development Environment as Data Engineer☆37Aug 5, 2025Updated 9 months ago
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 8 months ago
- An end-2-end project about Son Tung M-TP☆27Sep 25, 2025Updated 7 months ago
- My notes from the @makersacademy course.☆23Apr 10, 2015Updated 11 years ago
- An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apa…☆29Jun 7, 2023Updated 2 years ago
- Source code for 'Pro Power BI Desktop' by Adam Aspin☆13Mar 28, 2017Updated 9 years ago
- Cool DE Projects☆73Mar 22, 2026Updated last month
- Fivetran's Jira source dbt package☆14Oct 1, 2025Updated 7 months ago
- An example of a Dagster project with a possible folder structure to organize the assets, jobs, repositories, schedules, and ops. Also has…☆102Nov 3, 2024Updated last year
- 🚀 Complete AWS learning path for beginners - 45K+ community resource with hands-on labs, workshops, and certification guides☆18Apr 28, 2026Updated last week
- StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit☆16Mar 14, 2024Updated 2 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin…☆15Apr 29, 2021Updated 5 years ago
- In this project I have built an ETL pipeline which scrapes the trending repositories based on month, week, and day LIVE extract other related info…☆12Sep 9, 2023Updated 2 years ago
- ☆11Nov 18, 2022Updated 3 years ago
- API/Data Platform for Ingesting, Storing, and Serving Data through Postgres, and Litestar☆11Apr 25, 2026Updated 2 weeks ago
- This provider contains operators, decorators and triggers to send a Ray job from an Airflow task☆25Oct 27, 2025Updated 6 months ago
- SQL Server 2017 Integration Services Cookbook, published by Packt☆17Jan 30, 2023Updated 3 years ago
- Docktor is a Web App that deploys an easy-to-use kit of analysis and scanning tools.☆13Nov 1, 2023Updated 2 years ago
- ☆15Mar 15, 2024Updated 2 years ago
- End-to-end data pipeline to extract and analyze submissions from any subreddit using Pushshift, Python, dbt and BigQuery.☆12Jul 17, 2023Updated 2 years ago
- A Python Snowpark CLI for loading the TPC-DI dataset into Snowflake. Additional dbt models for building the data warehouse.☆11Sep 4, 2025Updated 8 months ago
- Source code for 'Pro Power BI Desktop' by Adam Aspin☆22Dec 4, 2017Updated 8 years ago
- Source code for 'Power Query for Power BI and Excel' by Christopher Webb and Crossjoin Consulting Limited☆19Aug 18, 2017Updated 8 years ago
- A dataset for movie recommendation with NebulaGraph; ETL to merge two datasets, OMDB and MovieLens, with dbt, Postgres and Nebula-Importer☆17Sep 7, 2024Updated last year
- Pipeline that extracts data from the Spotify API to build a more detailed version of Spotify Wrapped☆49Mar 13, 2026Updated last month
- Repo for learning dbt with Snowflake, featuring projects and models for data transformation and automation☆26Mar 31, 2025Updated last year
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated 11 months ago
- Fivetran's social media reporting dbt package. Combine your Facebook Pages, Instagram Business, Twitter Organic, and LinkedIn Pages socia…☆25Mar 2, 2026Updated 2 months ago
- ☆11Dec 28, 2020Updated 5 years ago
- ☆14May 1, 2024Updated 2 years ago