manuzhang / awesome-lakehouse
A curated list of awesome lakehouse frameworks, applications, etc.
☆40Feb 9, 2026Updated last week
Alternatives and similar repositories for awesome-lakehouse
Users who are interested in awesome-lakehouse are comparing it to the libraries listed below.
- Monitoring and insights on your data lakehouse tables☆32Jan 28, 2026Updated 2 weeks ago
- ☆11Nov 26, 2024Updated last year
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆90Oct 30, 2025Updated 3 months ago
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…☆267Jan 28, 2026Updated 3 weeks ago
- Open Control Plane for Tables in Data Lakehouse☆380Feb 10, 2026Updated last week
- Use pyarrow with Azure Data Lake gen2☆28Jun 27, 2024Updated last year
- TSG Client is a Python library for interacting with the TNO Security Gateway (TSG) Core Container☆18Mar 28, 2025Updated 10 months ago
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆83Apr 12, 2025Updated 10 months ago
- Lakevision is a tool which provides insights into your Apache Iceberg based Data Lakehouse.☆48Jan 15, 2026Updated last month
- Python Package to Share/Edit Pandas/Polars DF with web interface!☆11Jun 10, 2025Updated 8 months ago
- The observability platform for Iceberg lakehouses.☆436Jan 12, 2026Updated last month
- A Spark Connector that reads data from / writes data to Arrow-Flight end-points with Arrow-Flight and Flight-SQL☆46Dec 14, 2025Updated 2 months ago
- How to customize Tableau authentication using AWS Athena's JDBC Credentials Provider capabilities.☆14Jun 8, 2020Updated 5 years ago
- Crossplane upjet provider for Confluent Cloud: https://registry.terraform.io/providers/confluentinc/confluent/latest/docs☆13Jan 20, 2026Updated 3 weeks ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou…☆10Jul 31, 2023Updated 2 years ago
- Hadoop/Hive/Spark container to perform CI tests☆10Dec 26, 2020Updated 5 years ago
- This solution helps you deploy ETL processes and data storage resources to create an Insurance Lake using Amazon S3 buckets for storage, …☆16Feb 5, 2026Updated last week
- ☆48Feb 4, 2026Updated last week
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.☆1,193Updated this week
- Latest: 7.0.0 - Lightweight and ready-to-use services to easily connect an IDS-Connector to different IDS-Infrastructure-Components.☆14Mar 4, 2024Updated last year
- for Programming Principles book☆12Jul 21, 2025Updated 6 months ago
- The home of Floecat: A catalog of catalogs for open table formats☆41Updated this week
- A Fully HiveServer2-like Multi-tenancy Spark Thrift Server Supporting Impersonation and Multi-SparkContext with Ranger Authorization (GO …☆10Jul 7, 2022Updated 3 years ago
- A cloud native data mesh implementation☆12Jan 15, 2021Updated 5 years ago
- Run an open-source data LakeHouse locally using Docker Compose☆12May 31, 2024Updated last year
- similarity between graph nodes based on local information with PySpark☆10Sep 30, 2022Updated 3 years ago
- Proof of Concept - Open Patient Pathway Generator using an agent-based approach☆11Apr 4, 2023Updated 2 years ago
- Policy Administration point to handle ODRL policies and provide their Rego-equivalent to the Open Policy Agent☆11Feb 5, 2026Updated last week
- Infra stuff to run Kubernetes on travisci☆10Mar 7, 2023Updated 2 years ago
- ADT support for Flink with Shapeless☆12Jan 11, 2020Updated 6 years ago
- Bin Packing Algorithms implemented in Python☆11Feb 16, 2014Updated 12 years ago
- A benchmark tool for lakehouses.☆14Mar 12, 2023Updated 2 years ago
- FMI for Power System☆10Sep 6, 2019Updated 6 years ago
- ☆16Updated this week
- Package to add business days to a reference date or check whether a given date is a business day; it also lets you retrieve a list…☆10Dec 8, 2025Updated 2 months ago
- Demo application on how to create a serverless realtime analytics application using Kinesis Data Streams, Kinesis Firehose, DynamoDB and …☆14Dec 4, 2020Updated 5 years ago
- reclaim your stuff from social media silos☆47Jan 13, 2015Updated 11 years ago
- Atlassian Bamboo and Bitbucket images for GKE clusters☆10Mar 24, 2022Updated 3 years ago
- Provides time series data and metadata as Apache Arrow.☆16Updated this week