a curated list of awesome lakehouse frameworks, applications, etc
☆45Mar 9, 2026Updated 2 months ago
Alternatives and similar repositories for awesome-lakehouse
Users who are interested in awesome-lakehouse are comparing it to the libraries listed below.
- Monitoring and insights on your data lakehouse tables☆32Apr 28, 2026Updated last week
- ☆30Dec 4, 2024Updated last year
- ☆13Jun 10, 2024Updated last year
- The Data Landing Zone is a CDK Construct designed to create a landing zone tailored for supporting and enabling AI, data-driven, data mes…☆23Updated this week
- Apache Hive Metastore in Standalone Mode With Docker☆14Jul 22, 2024Updated last year
- Olympia is a storage-only open catalog format for big data analytics, ML & AI.☆16May 5, 2025Updated last year
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…☆276Apr 17, 2026Updated 3 weeks ago
- "Nature's economy shall be the base for our own, for it is immutable, but ours is secondary. An economist without knowledge of nature is …☆20May 31, 2021Updated 4 years ago
- ☆13Oct 12, 2024Updated last year
- A lightweight UI for Lakekeeper☆16Updated this week
- Iceberg Playground in a Box☆67Apr 8, 2026Updated last month
- Open Control Plane for Tables in Data Lakehouse☆387May 2, 2026Updated last week
- Start debugger listener on a running Node.js process☆12Oct 24, 2019Updated 6 years ago
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker…☆18Dec 9, 2023Updated 2 years ago
- The home of Floecat: A catalog of catalogs for open table formats☆78May 2, 2026Updated last week
- Lakevision is a tool which provides insights into your Apache Iceberg based Data Lakehouse.☆51Apr 11, 2026Updated 3 weeks ago
- A playground to experience Gravitino☆77Mar 16, 2026Updated last month
- A cli for spinning up and managing Ray clusters for the Daft Query Engine.☆15Feb 15, 2025Updated last year
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆85Apr 12, 2025Updated last year
- Open, Multi-modal Catalog for Data & AI, written in Rust☆84Sep 30, 2024Updated last year
- Batteries included CLI, TUI, and server implementations for DataFusion.☆194Apr 14, 2026Updated 3 weeks ago
- The observability platform for Iceberg lakehouses.☆459Jan 12, 2026Updated 3 months ago
- Python Package for ducklake☆20Jun 5, 2025Updated 11 months ago
- ☆51Apr 28, 2026Updated last week
- Point-in-Time optimizations for Apache Spark☆30Jan 18, 2024Updated 2 years ago
- An open-source framework that simplifies implementation of data solutions.☆146Dec 2, 2025Updated 5 months ago
- A cloud native data mesh implementation☆12Jan 15, 2021Updated 5 years ago
- The Go library for pulsar admin operations, providing a unified Go API for managing pulsar resources such as tenants, namespaces and top…☆14Aug 23, 2023Updated 2 years ago
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.☆1,290Updated this week
- ☆68May 9, 2025Updated last year
- Command line debugging console for Cats Effect☆19Apr 2, 2024Updated 2 years ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR☆39Feb 17, 2025Updated last year
- Unity Catalog Explorer is a TypeScript + Next.js based Web UI for the Unity Catalog OSS.☆13Jun 29, 2024Updated last year
- MCP server for Apache Iceberg☆34Nov 17, 2025Updated 5 months ago
- collection of read materials☆18May 18, 2020Updated 5 years ago
- ☆19Aug 31, 2022Updated 3 years ago
- Use pyarrow with Azure Data Lake gen2☆28Jun 27, 2024Updated last year
- This project demonstrates Real-Time streaming of CDC data from MySql to Apache Iceberg using Flink SQL Client for faster data analytics a…☆24Jan 16, 2024Updated 2 years ago
- Lakehouse storage system benchmark☆80Feb 22, 2023Updated 3 years ago