linkedin / openhouse
Open Control Plane for Tables in Data Lakehouse
☆345 Updated this week
Alternatives and similar repositories for openhouse:
Users who are interested in openhouse are comparing it to the libraries listed below.
- ☆261 Updated this week
- Turning PySpark Into a Universal DataFrame API ☆390 Updated this week
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust. ☆608 Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,192 Updated this week
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆251 Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆328 Updated 2 years ago
- CLI tool to bulk migrate tables from one catalog to another without a data copy ☆77 Updated 3 weeks ago
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆397 Updated 2 weeks ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆252 Updated last year
- Apache PyIceberg ☆706 Updated this week
- Apache DataFusion Comet Spark Accelerator ☆939 Updated this week
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆233 Updated last month
- Work with your web service, database, and streaming schemas in a single format. ☆344 Updated this week
- Multi-hop declarative data pipelines ☆115 Updated this week
- A Python Library to support running data quality rules while the spark job is running⚡ ☆186 Updated this week
- Performance Observability for Apache Spark ☆248 Updated last month
- An open protocol for secure data sharing ☆831 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆215 Updated last week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,468 Updated this week
- New file format for storage of large columnar datasets. ☆538 Updated this week
- QTag: Turbocharge Your SQL Comments ☆13 Updated 3 months ago
- 🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 ☆731 Updated this week
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆206 Updated last week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆161 Updated 5 months ago
- The Open-Source Enterprise Data Platform in a single Portal ☆238 Updated last week
- The bridge to effortless multi-engine data applications, currently supports Snowflake ❄️ and DuckDB 🦆 ☆179 Updated last week
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆114 Updated 2 months ago
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆837 Updated 2 months ago
- ☆193 Updated last week