minio / openlake
Build Data Lake using Open Source tools
☆104Updated last month
Alternatives and similar repositories for openlake
Users who are interested in openlake are comparing it to the libraries listed below.
- Docker environment to stream data from Kafka to Iceberg tables☆29Updated last year
- ODD Specification is a universal open standard for collecting metadata.☆142Updated 8 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆154Updated last week
- Apache Hive Metastore as a Standalone server in Docker☆79Updated 10 months ago
- Operator for Apache Spark-on-Kubernetes for Stackable Data Platform☆64Updated this week
- Minimal example to run Trino, Minio, and Hive standalone metastore on docker☆52Updated 3 years ago
- Collection of assets used for various articles at https://blogs.min.io☆37Updated 4 months ago
- Open Control Plane for Tables in Data Lakehouse☆359Updated this week
- Repository of helm charts for deploying DataHub on a Kubernetes cluster☆191Updated this week
- Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier!☆37Updated 4 months ago
- The NiFiKop NiFi Kubernetes operator makes it easy to run Apache NiFi on Kubernetes. Apache NiFi is a free, open-source solution that sup…☆176Updated last month
- Dremio Container Tools☆162Updated 2 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆75Updated this week
- Stackable Operator for Apache Airflow☆28Updated this week
- lakefs-samples repository☆83Updated 2 weeks ago
- A Microsoft Power BI Custom Connector allowing you to import Trino data into Power BI.☆72Updated 6 months ago
- The platform that powers Airbyte. Please file issues in https://github.com/airbytehq/airbyte☆257Updated this week
- The Open-Source Enterprise Data Platform in a single Portal☆248Updated last week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆99Updated 2 years ago
- Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team …☆121Updated last week
- ☆40Updated 2 years ago
- A kubernetes operator for Apache NiFi☆37Updated last week
- Streamline Apache Kafka with Conduktor Platform. 🚀☆140Updated 2 weeks ago
- Helm charts for Trino and Trino Gateway☆171Updated last week
- ☆266Updated 8 months ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database☆75Updated 3 years ago
- Apache Flink (Pyflink) and Related Projects☆40Updated 3 months ago
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄☆344Updated last week
- A minimal docker compose setup for experimenting with cloud agnostic Lakehouse Architectures Apache Spark with Hive Metastore + Delta Lak…☆24Updated last year
- A curated list of dagster code snippets for data engineers☆56Updated last year