linkedin / openhouse
Open Control Plane for Tables in Data Lakehouse
☆306 Updated this week
Related projects
Alternatives and complementary repositories for openhouse
- Turning PySpark Into a Universal DataFrame API ☆317 Updated this week
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆213 Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆247 Updated 11 months ago
- Apache PyIceberg ☆461 Updated this week
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆366 Updated last week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,129 Updated this week
- Performance Observability for Apache Spark ☆192 Updated this week
- Apache DataFusion Comet Spark Accelerator ☆816 Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,031 Updated this week
- An open protocol for secure data sharing ☆769 Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆303 Updated last year
- A Python Library to support running data quality rules while the spark job is running⚡ ☆162 Updated this week
- Work with your web service, database, and streaming schemas in a single format. ☆330 Updated 7 months ago
- Pythonic Iceberg REST Catalog ☆65 Updated last month
- Quick Guides from Dremio on Several topics ☆63 Updated last week
- ☆150 Updated 3 weeks ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 Updated last year
- Unity Catalog UI ☆39 Updated 2 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆187 Updated last week
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆347 Updated this week
- The metrics layer for your data. Join us at https://metriql.com/slack ☆298 Updated last year
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆144 Updated this week
- The Open-Source Enterprise Data Platform in a single Portal ☆213 Updated this week
- Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage) ☆191 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated 9 months ago
- Multi-hop declarative data pipelines ☆91 Updated this week
- A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture t… ☆159 Updated this week