To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed workflow and benefits of each component.
☆45Mar 7, 2024Updated 2 years ago
Alternatives and similar repositories for Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack
Users that are interested in Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack are comparing it to the libraries listed below.
- In this article, you will learn how to set up a real-time data processing and analytics environment using Docker, MySQL, Redpanda, MinIO,…☆11Jun 27, 2023Updated 2 years ago
- End-to-end ELT pipeline for 160K+ Skytrax airline reviews: Airflow orchestration, BeautifulSoup scraping, S3 staging, Snowflake wareho…☆12Mar 16, 2026Updated last week
- ☆13Oct 4, 2023Updated 2 years ago
- DevOpsDays Taipei 2025 Observability Bootcamp - Observability Platform 101☆19Jun 9, 2025Updated 9 months ago
- Testing some ideas in the Python playground.☆11Sep 23, 2022Updated 3 years ago
- ☆38Apr 25, 2024Updated last year
- Building Data Lakehouse by open source technology. Support end to end data pipeline, from source data on AWS S3 to Lakehouse, visualize a…☆39Dec 15, 2025Updated 3 months ago
- This project implements a Lakehouse Medallion Architecture using modern Data Stack tools such as Fivetran, Snowflake and dbt. The fictici…☆14Sep 30, 2024Updated last year
- A DataOps framework for building a lakehouse.☆56Updated this week
- An Open Source Based Reference Analytics Platform for a Retail Company☆24Sep 11, 2024Updated last year
- An LLM-powered chatbot with the added context of the dbt knowledge base.☆39Dec 4, 2024Updated last year
- Build Data Lake using Open Source tools☆124May 27, 2025Updated 10 months ago
- Transporter for integrating OpenLineage with OpenMetadata☆17Sep 10, 2025Updated 6 months ago
- A data generator for Apache Druid☆12Mar 26, 2025Updated last year
- A course in data warehouse☆19Sep 27, 2025Updated 6 months ago
- Build a data pipeline with Apache Airflow☆11May 7, 2021Updated 4 years ago
- Playbook to provision a Confluent Cluster☆10Oct 22, 2017Updated 8 years ago
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi…☆15Dec 2, 2020Updated 5 years ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres+ Debezium CDC, MySQL,…☆28May 19, 2025Updated 10 months ago
- This repo contains DAGs demonstrating a variety of ELT patterns using Airflow along with dbt.☆12Jan 12, 2023Updated 3 years ago
- ☆19Mar 24, 2025Updated last year
- Metabase Teradata Driver shipped as 3rd party plugin☆11Dec 1, 2025Updated 3 months ago
- Glue VSCode devcontainer setup☆14Jan 31, 2023Updated 3 years ago
- ☆10Jul 21, 2022Updated 3 years ago
- Objects and Animals detection with Wifi camera and Yolo☆16Apr 28, 2024Updated last year
- ⚠️⚠️⚠️ DEPRECATED☆14Nov 18, 2018Updated 7 years ago
- Python tool to help export Azure DevOps WIKI into a single PDF☆10May 10, 2020Updated 5 years ago
- ☆23Sep 5, 2022Updated 3 years ago
- ☆11Mar 7, 2021Updated 5 years ago
- Guideline to extract table lineage info in OpenLineage format from access history view☆14May 11, 2023Updated 2 years ago
- Terraform AWS free tier, EC2/ECR/RDS/EFS/DynamoDB/Lambda/S3. Docker running on EC2, Traefik reverse proxy, Lets Encrypt, dynamic DNS, Zer…☆38Jun 19, 2024Updated last year
- Automated basic infrastructure to install OKD4 on free ESXi☆13Aug 8, 2020Updated 5 years ago
- A web app built quickly with the Streamlit framework and the official GPT3.5 turbo model API, deployed on the Render platform.☆12Apr 3, 2023Updated 2 years ago
- Efficient Data Processing in Distributed Database (Vietnamese docs)☆31Jul 2, 2025Updated 8 months ago
- Diego's repo☆10Nov 7, 2023Updated 2 years ago
- A scribd-downloader that actually works☆23Aug 17, 2017Updated 8 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆15Jun 26, 2023Updated 2 years ago
- Examples from the book Estatística para Data Science, published by Casa do Código☆12Feb 18, 2025Updated last year
- Example Code to Supplement the Label Studio Blog☆33Jan 6, 2026Updated 2 months ago