A curated list of awesome lakehouse frameworks, applications, etc.
☆45Mar 9, 2026Updated 2 months ago
Alternatives and similar repositories for awesome-lakehouse
Users who are interested in awesome-lakehouse are comparing it to the libraries listed below.
- Monitoring and insights on your data lakehouse tables☆32May 22, 2026Updated last week
- ☆11Nov 26, 2024Updated last year
- ☆30Dec 4, 2024Updated last year
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆91Apr 10, 2026Updated last month
- Apache Hive Metastore in Standalone Mode With Docker☆14Jul 22, 2024Updated last year
- Olympia is a storage-only open catalog format for big data analytics, ML & AI.☆16May 5, 2025Updated last year
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…☆276Apr 17, 2026Updated last month
- "Nature's economy shall be the base for our own, for it is immutable, but ours is secondary. An economist without knowledge of nature is …☆20May 31, 2021Updated 4 years ago
- A lightweight UI for Lakekeeper☆17Updated this week
- Iceberg Playground in a Box☆69Apr 8, 2026Updated last month
- ☆20Jun 16, 2020Updated 5 years ago
- A Table format agnostic data sharing framework☆42Feb 4, 2024Updated 2 years ago
- ☆14May 28, 2020Updated 6 years ago
- Altinity Datasets for ClickHouse☆19Feb 20, 2025Updated last year
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker…☆18Dec 9, 2023Updated 2 years ago
- Fast, zero-copy HTML Parser written in Rust☆28Dec 6, 2025Updated 5 months ago
- A simple screenshot, screencast, and file upload tool with S3 support written in rust.☆19Nov 30, 2021Updated 4 years ago
- Rust Client library for Apache Pulsar☆14May 27, 2022Updated 4 years ago
- The home of Floecat: A catalog of catalogs for open table formats☆81Updated this week
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆85Apr 12, 2025Updated last year
- Open, Multi-modal Catalog for Data & AI, written in Rust☆84Sep 30, 2024Updated last year
- The observability platform for Iceberg lakehouses.☆461Jan 12, 2026Updated 4 months ago
- A complete data engineering project demonstrating modern data stack practices with Apache Flink, Iceberg, Trino and Superset☆25Sep 29, 2025Updated 8 months ago
- Python Package for ducklake☆20Jun 5, 2025Updated 11 months ago
- Point-in-Time optimizations for Apache Spark☆30Jan 18, 2024Updated 2 years ago
- A DSL for scalacOptions☆17May 13, 2026Updated 2 weeks ago
- A cloud native data mesh implementation☆12Jan 15, 2021Updated 5 years ago
- The Go library for pulsar admin operations, providing a unified Go API for managing pulsar resources such as tenants, namespaces and top…☆14Aug 23, 2023Updated 2 years ago
- Demo of fine-tuning QA models for answering FAQ of cloud providers documentation☆11Mar 7, 2023Updated 3 years ago
- DuckDB Pyroscope Extension for Continuous Profiling☆21Feb 18, 2026Updated 3 months ago
- ☆23Sep 7, 2023Updated 2 years ago
- MCP server for Apache Iceberg☆34Nov 17, 2025Updated 6 months ago
- ☆19Aug 31, 2022Updated 3 years ago
- Use pyarrow with Azure Data Lake gen2☆28Jun 27, 2024Updated last year
- Lakehouse storage system benchmark☆81Feb 22, 2023Updated 3 years ago
- ZIO — A principled, powerful, standalone effect data type for any Scala project.☆13Mar 28, 2025Updated last year
- This project is a template for ingesting real-time event streams from Wikipedia to be queried in Apache Pinot☆23May 20, 2022Updated 4 years ago
- PyIceberg☆1,063Updated this week
- Scalafix rules for Typelevel projects☆27May 20, 2026Updated last week