minio / openlake
Build Data Lake using Open Source tools
☆100Updated last week
Alternatives and similar repositories for openlake
Users who are interested in openlake are comparing it to the repositories listed below
- Docker environment to stream data from Kafka to Iceberg tables☆29Updated last year
- Repository of helm charts for deploying DataHub on a Kubernetes cluster☆188Updated last week
- Apache Hive Metastore as a Standalone server in Docker☆76Updated 9 months ago
- Open Control Plane for Tables in Data Lakehouse☆353Updated this week
- Minimal example to run Trino, Minio, and Hive standalone metastore on docker☆52Updated 3 years ago
- Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier!☆37Updated 3 months ago
- How to use Presto (with Hive metastore) and MinIO?☆27Updated 2 years ago
- Operator for Apache Spark-on-Kubernetes for Stackable Data Platform☆63Updated this week
- Apache Flink (Pyflink) and Related Projects☆39Updated last month
- Building Data Lakehouse by open source technology. Support end to end data pipeline, from source data on AWS S3 to Lakehouse, visualize a…☆27Updated last year
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆58Updated last year
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆98Updated 2 years ago
- ☆264Updated 7 months ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆259Updated this week
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)☆237Updated this week
- New generation opensource data stack☆68Updated 3 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆149Updated this week
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…☆35Updated last year
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database☆75Updated 3 years ago
- Stackable Operator for Apache Airflow☆28Updated this week
- Open source stack lakehouse☆25Updated last year
- Helm charts for Trino and Trino Gateway☆167Updated last week
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture☆74Updated last month
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB and Superset☆230Updated 3 months ago
- ☆54Updated last week
- ☆204Updated last week
- A Microsoft Power BI Custom Connector allowing you to import Trino data into Power BI.☆68Updated 4 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…☆251Updated 4 months ago
- Collection of assets used for various articles at https://blogs.min.io☆38Updated 2 months ago
- The platform that powers Airbyte. Please file issues in https://github.com/airbytehq/airbyte☆251Updated this week