fraibacas / lakehouse-poc
Run an open-source data LakeHouse locally using Docker Compose
☆11 Updated last year
Alternatives and similar repositories for lakehouse-poc
Users who are interested in lakehouse-poc are comparing it to the repositories listed below.
- ☆18 Updated 10 months ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 Updated last year
- dlt-dagster-demo ☆11 Updated last year
- Demonstrating the capabilities of DuckDB as a transformation engine for data lakes ☆28 Updated 8 months ago
- Automate and streamline the alerting & notification process for dbt test results🐞🚀 ☆17 Updated last month
- Cost Efficient Data Pipelines with DuckDB ☆53 Updated 3 weeks ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆56 Updated last week
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆71 Updated last year
- ☆16 Updated last year
- Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ☆34 Updated last year
- duckdb-etl-framework ☆11 Updated 5 months ago
- Building Data Lakehouse by open source technology. Support end to end data pipeline, from source data on AWS S3 to Lakehouse, visualize a… ☆27 Updated last year
- ☆40 Updated last week
- Repo for CDC with debezium blog post ☆28 Updated 8 months ago
- Full stack data engineering tools and infrastructure set-up ☆53 Updated 4 years ago
- ☆18 Updated last year
- Utility functions for dbt projects running on Spark ☆34 Updated 3 months ago
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆21 Updated 2 years ago
- ☆16 Updated last year
- Repo for orienting dbt users to the Dagster asset framework ☆54 Updated 2 years ago
- Building a poor man's data lake: Exploring the Power of Polars and Delta Lake ☆10 Updated 2 weeks ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- A minimal docker compose setup for experimenting with cloud agnostic Lakehouse Architectures Apache Spark with Hive Metastore + Delta Lak… ☆23 Updated last year
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 9 months ago
- ☆10 Updated 3 years ago
- ☆34 Updated 3 weeks ago
- API for distributing Data Lake Data ☆11 Updated 2 months ago
- Building 3D Trusted Data Pipelines With Dagster, Dbt, and Duckdb ☆20 Updated last year
- A simple Data Engineering solution for testing or education purposes. You only need to know SQL and Python to understand this project. Da… ☆25 Updated 2 years ago
- A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino ☆20 Updated 3 years ago