A curated list of awesome lakehouse frameworks, applications, etc.
☆43Mar 9, 2026Updated last month
Alternatives and similar repositories for awesome-lakehouse
Users who are interested in awesome-lakehouse are comparing it to the libraries listed below.
- Monitoring and insights on your data lakehouse tables☆32Updated this week
- ☆11Nov 26, 2024Updated last year
- ☆30Dec 4, 2024Updated last year
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆91Apr 10, 2026Updated last week
- ☆13Jun 10, 2024Updated last year
- Olympia is a storage-only open catalog format for big data analytics, ML & AI.☆16May 5, 2025Updated 11 months ago
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…☆274Updated this week
- Create, manage and edit your audio book library from the command line.☆10Oct 20, 2024Updated last year
- ☆13Oct 12, 2024Updated last year
- A lightweight UI for Lakekeeper☆16Updated this week
- Iceberg Playground in a Box☆67Apr 8, 2026Updated last week
- Open Control Plane for Tables in Data Lakehouse☆385Updated this week
- A minimalist / functional / dataflow programming language☆13May 1, 2024Updated last year
- ☆20Jun 16, 2020Updated 5 years ago
- A simple screenshot, screencast, and file upload tool with S3 support, written in Rust.☆19Nov 30, 2021Updated 4 years ago
- Fast, zero-copy HTML Parser written in Rust☆27Dec 6, 2025Updated 4 months ago
- The home of Floecat: A catalog of catalogs for open table formats☆60Apr 9, 2026Updated last week
- Lakevision is a tool which provides insights into your Apache Iceberg based Data Lakehouse.☆50Apr 11, 2026Updated last week
- A playground to experience Gravitino☆75Mar 16, 2026Updated last month
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆85Apr 12, 2025Updated last year
- Open, Multi-modal Catalog for Data & AI, written in Rust☆85Sep 30, 2024Updated last year
- Batteries included CLI, TUI, and server implementations for DataFusion.☆194Updated this week
- The observability platform for Iceberg lakehouses.☆452Jan 12, 2026Updated 3 months ago
- ☆16May 21, 2021Updated 4 years ago
- A complete data engineering project demonstrating modern data stack practices with Apache Flink, Iceberg, Trino and Superset☆23Sep 29, 2025Updated 6 months ago
- ☆51Updated this week
- Python Package for ducklake☆20Jun 5, 2025Updated 10 months ago
- Point-in-Time optimizations for Apache Spark☆30Jan 18, 2024Updated 2 years ago
- Computer science fundamentals.☆21Aug 18, 2025Updated 8 months ago
- An open-source framework that simplifies implementation of data solutions.☆146Dec 2, 2025Updated 4 months ago
- A cloud native data mesh implementation☆12Jan 15, 2021Updated 5 years ago
- The Go library for pulsar admin operations, providing a unified Go API for managing pulsar resources such as tenants, namespaces and top…☆14Aug 23, 2023Updated 2 years ago
- Demo of fine-tuning QA models for answering FAQ of cloud providers documentation☆11Mar 7, 2023Updated 3 years ago
- Command line debugging console for Cats Effect☆19Apr 2, 2024Updated 2 years ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR☆39Feb 17, 2025Updated last year
- DuckDB Pyroscope Extension for Continuous Profiling☆21Feb 18, 2026Updated 2 months ago
- Collection of reading materials☆18May 18, 2020Updated 5 years ago
- ☆19Aug 31, 2022Updated 3 years ago
- Use pyarrow with Azure Data Lake gen2☆28Jun 27, 2024Updated last year