Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture
☆146Jan 21, 2026Updated last month
Alternatives and similar repositories for awesome-lakehouse-guide
Users that are interested in awesome-lakehouse-guide are comparing it to the libraries listed below
- A repository of blogs/videos that presents how Apache Iceberg is being used in Production by various orgs☆18Jul 31, 2023Updated 2 years ago
- "Nature's economy shall be the base for our own, for it is immutable, but ours is secondary. An economist without knowledge of nature is …☆20May 31, 2021Updated 4 years ago
- SIEM, Visibility, and Event-Driven Architecture Curated Solutions. Build a cost-effective threat detection and log management system.☆18Jan 17, 2024Updated 2 years ago
- Iceberg Playground in a Box☆67Jun 27, 2025Updated 8 months ago
- Docker environment to stream data from Kafka to Iceberg tables☆30Feb 27, 2024Updated 2 years ago
- Monitoring and insights on your data lakehouse tables☆32Updated this week
- Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processin…☆1,164Feb 23, 2026Updated 2 weeks ago
- Deploy a complete data stack in just a couple of minutes.☆15Mar 6, 2024Updated 2 years ago
- ☆14Oct 10, 2025Updated 5 months ago
- Local Environment to Practice Data Engineering☆144Dec 30, 2024Updated last year
- ☆15Mar 27, 2023Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated this week
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated last year
- Mock streaming data generator☆17May 31, 2024Updated last year
- ☆41Jul 4, 2022Updated 3 years ago
- ☆22Feb 5, 2024Updated 2 years ago
- The observability platform for Iceberg lakehouses.☆443Jan 12, 2026Updated last month
- Practical Data Engineering: A Hands-On Real-Estate Project Guide☆783Sep 3, 2024Updated last year
- Utility for creating log files, designed to help test Fluentd configuration files☆20Oct 12, 2023Updated 2 years ago
- Some recipes for data engineering with Python☆25Mar 23, 2021Updated 4 years ago
- Cost Efficient Data Pipelines with DuckDB☆63May 14, 2025Updated 9 months ago
- Delta Lake examples☆240Oct 8, 2024Updated last year
- Apache DataFusion Comet Spark Accelerator☆1,150Updated this week
- Open source stack lakehouse☆25Mar 2, 2024Updated 2 years ago
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them☆25Mar 3, 2024Updated 2 years ago
- Don't Panic. This guide will help you when it feels like the end of the world.☆30Feb 7, 2026Updated last month
- Operator for Apache Spark-on-Kubernetes for Stackable Data Platform☆69Updated this week
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats☆31Apr 13, 2023Updated 2 years ago
- In-browser data analysis using SQL | Powered by duckdb-wasm☆26Dec 21, 2025Updated 2 months ago
- AI agent debugging, collaboration, and trace observability. Built for teams using CrewAI, OpenAI, and more.☆14Updated this week
- ☆11Jun 12, 2023Updated 2 years ago
- Open, Multi-modal Catalog for Data & AI☆3,327Updated this week
- ☆37Mar 2, 2026Updated last week
- The smallest DuckDB SQL orchestrator on Earth.☆337Nov 22, 2025Updated 3 months ago
- Data Mesh Architecture☆84Oct 15, 2025Updated 4 months ago
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg☆1,866Updated this week
- Beyond Vibe Coding. Code, Planning, Documentation and Product Management agents.☆70Feb 20, 2026Updated 2 weeks ago
- Notes on snort, suricata, SIEM, OSSEC ...☆11Dec 4, 2018Updated 7 years ago
- ☆10Jun 7, 2025Updated 9 months ago