Build Data Lake using Open Source tools
☆129May 27, 2025Updated last year
Alternatives and similar repositories for openlake
Users who are interested in openlake are comparing it to the libraries listed below.
- Collection of assets used for various articles at https://blogs.min.io☆43Apr 9, 2026Updated 2 months ago
- ☆16Mar 9, 2026Updated 3 months ago
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…☆45Mar 7, 2024Updated 2 years ago
- How to use Presto (with Hive metastore) and MinIO?☆28Mar 8, 2023Updated 3 years ago
- A thumbnail generator example using Minio's listenBucketNotification API☆104Jun 2, 2022Updated 4 years ago
- Miscellaneous codes and writings for MLOps☆15Apr 8, 2026Updated 2 months ago
- Distributed HTTP Speed Test.☆61Oct 1, 2025Updated 8 months ago
- ☆19Oct 22, 2025Updated 7 months ago
- Helper for handling PySpark DataFrame partition size 📑🎛️☆12Mar 8, 2024Updated 2 years ago
- ☆23Jun 30, 2024Updated last year
- Drive performance measurement tool☆77Dec 29, 2025Updated 5 months ago
- Trino On K8S Via Helm & Metastore Workshop Querying Delta Tables☆12Jan 27, 2025Updated last year
- How to customize Tableau authentication using AWS Athena's JDBC Credentials Provider capabilities.☆14Jun 8, 2020Updated 6 years ago
- Docker environment to stream data from Kafka to Iceberg tables☆30Feb 27, 2024Updated 2 years ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆319Jun 3, 2026Updated last week
- Trino monitoring with JMX metrics through Prometheus and Grafana☆17Aug 14, 2024Updated last year
- Apache Polaris Tools, additional tooling for Apache Polaris☆28Updated this week
- A wrapper script to make running Molecule easier☆13Apr 30, 2022Updated 4 years ago
- Terraform module to manage Compute Instance resources within the Yandex.Cloud.☆13Jun 2, 2026Updated last week
- Delta Lake Examples☆11Apr 24, 2020Updated 6 years ago
- Ansible Role - Containers☆14Jun 15, 2022Updated 3 years ago
- A collection of Data Engineering projects using different cloud providers. Explore real-world implementations of data pipelines, transfor…☆16Apr 7, 2025Updated last year
- Simple code for running and visualizing replicator dynamics☆11Jan 31, 2024Updated 2 years ago
- Gadsme Helm chart repository☆57Nov 14, 2025Updated 6 months ago
- Determining the important factors that influence customer or passenger satisfaction of an airline using CRISP-DM methodology in Pyt…☆27Sep 1, 2023Updated 2 years ago
- Ansible Collection for GitLab☆18Updated this week
- Set of Go tools to check different elements of your stack (SSL, SMTP, Permissions...)☆25May 27, 2026Updated 2 weeks ago
- ☆14May 17, 2025Updated last year
- Terraform examples with terraform-provider-libvirt☆16Sep 14, 2020Updated 5 years ago
- Multi-tenant VS Code installation built with Zero-to-Jupyterhub☆46Nov 28, 2020Updated 5 years ago
- Metabase Teradata Driver shipped as 3rd party plugin☆14May 28, 2026Updated 2 weeks ago
- ☆16Jan 20, 2019Updated 7 years ago
- Semantic versioning tool for git based on conventional commits☆23May 25, 2026Updated 2 weeks ago
- Kubernetes Operator for Apache HBase built by Stackable for the Stackable Data Platform☆20Updated this week
- Kubeflow Pipelines Event Handler☆12Sep 16, 2021Updated 4 years ago
- Apache Pulsar - distributed pub-sub messaging system☆13Updated this week
- Kubernetes/OpenShift operator for Debezium Server. Please log issues at https://github.com/debezium/dbz/issues.☆84Jun 2, 2026Updated last week
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, duckdb and Superset☆49Apr 5, 2026Updated 2 months ago
- InCoder is a powerful plugin designed for JetBrains IDEs, including IntelliJ IDEA, PyCharm, and others in the JetBrains ecosystem. It sea…☆16Jan 7, 2026Updated 5 months ago