A data engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data comes from Kaggle and the YouTube API.
☆23 · Nov 19, 2024 · Updated last year
Alternatives and similar repositories for Youtube-Recommend-Master-ETL-Pipeline
Users that are interested in Youtube-Recommend-Master-ETL-Pipeline are comparing it to the libraries listed below.
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke… ☆20 · Aug 12, 2025 · Updated 8 months ago
- NoSQL extract, transform, load (ETL) toolkit with Python ☆15 · Apr 4, 2026 · Updated 2 weeks ago
- ☆26 · Jan 21, 2026 · Updated 2 months ago
- Source Code for 'Beginning Blockchain' by Bikramaditya Singhal, Gautam Dhameja, and Priyansu Sekhar Panda ☆10 · May 17, 2024 · Updated last year
- Scan and monitor your network effortlessly! Nmap Prometheus Exporter provides insights into network health and security with Prometheus-c… ☆15 · Oct 2, 2023 · Updated 2 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆24 · Jul 14, 2022 · Updated 3 years ago
- Simple ETL pipeline using Python ☆29 · May 22, 2023 · Updated 2 years ago
- My Setup Development Environment as Data Engineer ☆37 · Aug 5, 2025 · Updated 8 months ago
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨ ☆16 · Aug 30, 2025 · Updated 7 months ago
- My notes from the @makersacademy course. ☆23 · Apr 10, 2015 · Updated 11 years ago
- An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apa… ☆29 · Jun 7, 2023 · Updated 2 years ago
- Source code for 'Pro Power BI Desktop' by Adam Aspin ☆13 · Mar 28, 2017 · Updated 9 years ago
- Cool DE Projects ☆69 · Mar 22, 2026 · Updated 3 weeks ago
- Fivetran's Jira source dbt package ☆14 · Oct 1, 2025 · Updated 6 months ago
- An example of a Dagster project with a possible folder structure to organize the assets, jobs, repositories, schedules, and ops. Also has… ☆101 · Nov 3, 2024 · Updated last year
- StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit ☆16 · Mar 14, 2024 · Updated 2 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 · Apr 29, 2021 · Updated 4 years ago
- ELT for AEMET weather data. ☆16 · Mar 23, 2025 · Updated last year
- ☆11 · Nov 18, 2022 · Updated 3 years ago
- This extension makes vscode seamlessly work with dbt and bigquery ☆15 · Sep 27, 2022 · Updated 3 years ago
- SQL Server 2017 Integration Services Cookbook, published by Packt ☆17 · Jan 30, 2023 · Updated 3 years ago
- Docktor is a Web App that deploys an easy-to-use kit of analysis and scanning tools. ☆13 · Nov 1, 2023 · Updated 2 years ago
- ☆15 · Mar 15, 2024 · Updated 2 years ago
- Source code for 'Pro Power BI Desktop' by Adam Aspin ☆22 · Dec 4, 2017 · Updated 8 years ago
- Source code for 'Power Query for Power BI and Excel' by Christopher Webb and Crossjoin Consulting Limited ☆19 · Aug 18, 2017 · Updated 8 years ago
- Pipeline that extracts data from the Spotify API to build a more detailed version of Spotify Wrapped ☆49 · Mar 13, 2026 · Updated last month
- Repo for learning DBT with Snowflake, featuring projects and models for data transformation and automation ☆26 · Mar 31, 2025 · Updated last year
- Fivetran's social media reporting dbt package. Combine your Facebook Pages, Instagram Business, Twitter Organic, and LinkedIn Pages socia… ☆25 · Mar 2, 2026 · Updated last month
- ☆11 · Dec 28, 2020 · Updated 5 years ago
- ☆13 · May 1, 2024 · Updated last year
- Data Engineering with AWS Cookbook, published by Packt ☆24 · Updated this week
- A fully serverless, event-driven data pipeline that ingests, enriches, validates, and visualizes real-time news data using AWS services. … ☆25 · Aug 10, 2025 · Updated 8 months ago
- Analytics engineering with dbt - projects and developer environment ☆22 · Sep 27, 2024 · Updated last year
- This repository contains a Docker Compose configuration for running ScyllaDB, a highly scalable NoSQL database for learning and testing. ☆14 · Sep 19, 2024 · Updated last year
- Code to demonstrate data engineering metadata & logging best practices ☆21 · Mar 12, 2024 · Updated 2 years ago
- Gem mysql2 agent for huginn ☆14 · Jun 5, 2017 · Updated 8 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆66 · Sep 23, 2023 · Updated 2 years ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on. ☆24 · Apr 27, 2023 · Updated 2 years ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆77 · Sep 2, 2023 · Updated 2 years ago