To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed workflow and benefits of each component.
☆45 · Mar 7, 2024 · Updated 2 years ago
Alternatives and similar repositories for Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack
Users that are interested in Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack are comparing it to the libraries listed below.
- In this article, you will learn how to set up a real-time data processing and analytics environment using Docker, MySQL, Redpanda, MinIO,… ☆11 · Jun 27, 2023 · Updated 2 years ago
- FIO load testing framework for OpenShift ☆11 · Jun 8, 2023 · Updated 3 years ago
- On-premises ELT Pipeline ☆32 · Jul 10, 2025 · Updated 11 months ago
- ☆13 · Oct 4, 2023 · Updated 2 years ago
- DevOpsDays Taipei 2025 Observability Bootcamp - Observability Platform 101 ☆18 · Jun 9, 2025 · Updated last year
- Image building contents for running Spark standalone on Kubernetes ☆16 · Apr 10, 2020 · Updated 6 years ago
- ☆19 · Oct 22, 2025 · Updated 7 months ago
- Building a Data Pipeline with an Open Source Stack ☆59 · Jun 27, 2025 · Updated 11 months ago
- ☆39 · Apr 25, 2024 · Updated 2 years ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆84 · Mar 9, 2026 · Updated 3 months ago
- Building a data lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to Lakehouse, visualize a… ☆41 · Dec 15, 2025 · Updated 6 months ago
- Mutable strings in Golang via overlays (out-of-place implementation) ☆14 · Apr 13, 2023 · Updated 3 years ago
- An Open Source Based Reference Analytics Platform for a Retail Company ☆24 · Sep 11, 2024 · Updated last year
- A DataOps framework for building a lakehouse. ☆57 · Jun 5, 2026 · Updated last week
- An end-to-end open-source data stack for crawling and visualizing real estate data, facilitating insights into market trends. ☆15 · Mar 16, 2026 · Updated 3 months ago
- An LLM-powered chatbot with the added context of the dbt knowledge base. ☆39 · Dec 4, 2024 · Updated last year
- A research project on running ML inferencing on Knative ☆18 · Feb 10, 2026 · Updated 4 months ago
- Build Data Lake using Open Source tools ☆129 · Jun 8, 2026 · Updated last week
- Tooling to build a custom Confluent Platform Kafka Connect container with additional connectors from Confluent Hub. ☆15 · Oct 26, 2020 · Updated 5 years ago
- A collection of Data Engineering projects using different cloud providers. Explore real-world implementations of data pipelines, transfor… ☆16 · Apr 7, 2025 · Updated last year
- A data generator for Apache Druid ☆12 · Mar 26, 2025 · Updated last year
- A course in data warehouse ☆21 · Sep 27, 2025 · Updated 8 months ago
- Asynchronous file handlers for Python's logging ☆15 · Jul 22, 2017 · Updated 8 years ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, duckdb and Superset ☆49 · Apr 5, 2026 · Updated 2 months ago
- Build a data pipeline with Apache Airflow ☆11 · May 7, 2021 · Updated 5 years ago
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi… ☆15 · Dec 2, 2020 · Updated 5 years ago
- ☆24 · Mar 21, 2025 · Updated last year
- Reads an HBase table and writes the output as Text, Seq, Avro, or Parquet ☆28 · May 15, 2014 · Updated 12 years ago
- This repo contains DAGs demonstrating a variety of ELT patterns using Airflow along with dbt. ☆12 · Jan 12, 2023 · Updated 3 years ago
- Amazon EMR Serverless and Amazon MSK Serverless Demo ☆13 · Jul 31, 2022 · Updated 3 years ago
- Glue VSCode devcontainer setup ☆14 · Jan 31, 2023 · Updated 3 years ago
- ☆21 · Nov 21, 2023 · Updated 2 years ago
- ☆10 · Jul 21, 2022 · Updated 3 years ago
- ☆23 · Sep 5, 2022 · Updated 3 years ago
- Metabase Teradata Driver shipped as 3rd party plugin ☆14 · May 28, 2026 · Updated 3 weeks ago
- Terraform AWS free tier, EC2/ECR/RDS/EFS/DynamoDB/Lambda/S3. Docker running on EC2, Traefik reverse proxy, Let's Encrypt, dynamic DNS, Zer… ☆37 · Jun 19, 2024 · Updated last year
- Automated basic infrastructure to install OKD4 on free ESXi ☆13 · Aug 8, 2020 · Updated 5 years ago
- Diego's repo ☆10 · Nov 7, 2023 · Updated 2 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, dbt, Mage AI, and Dash. ☆15 · Jun 26, 2023 · Updated 2 years ago