unitycatalog / unitycatalog
Open, Multi-modal Catalog for Data & AI
☆2,920 Updated last week
Alternatives and similar repositories for unitycatalog
Users who are interested in unitycatalog compare it with the repositories listed below.
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,536 Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,233 Updated this week
- An Open Standard for lineage metadata collection ☆1,965 Updated this week
- An open protocol for secure data sharing ☆842 Updated last week
- Apache PyIceberg ☆776 Updated this week
- Apache DataFusion Comet Spark Accelerator ☆964 Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,075 Updated this week
- Scalable and efficient data transformation framework - backwards compatible with dbt. ☆2,404 Updated this week
- A native Rust library for Delta Lake, with bindings into Python ☆2,827 Updated this week
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust. ☆727 Updated this week
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆1,091 Updated last week
- World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake. ☆1,614 Updated this week
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. ☆1,366 Updated this week
- Collect, aggregate, and visualize a data ecosystem's metadata ☆1,930 Updated this week
- Apache Iceberg ☆7,584 Updated this week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,431 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,332 Updated last week
- Apache DataFusion Ballista Distributed Query Engine ☆1,775 Updated 2 weeks ago
- Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core. ☆1,479 Updated this week
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆762 Updated last week
- Open Control Plane for Tables in Data Lakehouse ☆354 Updated this week
- Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code ☆967 Updated this week
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses. ☆2,201 Updated this week
- DuckLake is an integrated data lake and catalog format ☆1,560 Updated this week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆2,118 Updated this week
- Python API for Deequ ☆773 Updated 2 months ago
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆840 Updated 3 weeks ago
- Turning PySpark Into a Universal DataFrame API ☆405 Updated this week
- Home of the Open Data Contract Standard (ODCS). ☆505 Updated 3 weeks ago
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆431 Updated 4 months ago