projectnessie/nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

☆1,481

Alternatives and similar repositories for nessie

Users who are interested in nessie are comparing it to the repositories listed below.

apache / polaris
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
☆2,021 · Updated this week
apache / iceberg
Apache Iceberg
☆9,070 · Updated this week
lakekeeper / lakekeeper
Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.
☆1,395 · Updated this week
dremio / dremio-oss
Dremio - the missing link in modern data
☆1,488 · Sep 26, 2025 · Updated 9 months ago
substrait-io / substrait
A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
☆1,535 · Updated this week
apache / iceberg-python
PyIceberg
☆1,097 · Updated this week
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,924 · Updated this week
trinodb / trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
☆13,061 · Updated this week
OpenLineage / OpenLineage
An Open Standard for lineage metadata collection
☆2,557 · Updated this week
apache / iceberg-rust
Apache Iceberg
☆1,349 · Updated this week
treeverse / lakeFS
lakeFS - Data version control for your data lake | Git for data
☆5,460 · Updated this week
unitycatalog / unitycatalog
Open, Multi-modal Catalog for Data & AI
☆3,464 · Updated this week
apache / datafusion
Apache DataFusion SQL Query Engine
☆9,005 · Updated this week
apache / incubator-xtable
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processin…
☆1,195 · Updated this week
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
apache / datafusion-comet
Apache DataFusion Comet Spark Accelerator
☆1,230 · Updated this week
SQLMesh / sqlmesh
Scalable and efficient data transformation framework - backwards compatible with dbt.
☆3,213 · Updated this week
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,353 · Updated this week
databricks / docker-spark-iceberg
☆383 · Feb 15, 2026 · Updated 5 months ago
apache / datafusion-ballista
Apache DataFusion Ballista Distributed Query Engine
☆2,094 · Updated this week
projectnessie / iceberg-catalog-migrator
CLI tool to bulk migrate the tables from one catalog to another without a data copy
☆85 · Apr 12, 2025 · Updated last year
duckdb / duckdb-iceberg
☆420 · Updated this week
MarquezProject / marquez
Collect, aggregate, and visualize a data ecosystem's metadata
☆2,245 · Updated this week
apache / gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
☆3,114 · Updated this week
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,192 · Updated this week
Eventual-Inc / Daft
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
☆5,651 · Updated this week
databricks / iceberg-kafka-connect
☆285 · Jul 3, 2025 · Updated last year
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,151 · Updated this week
delta-io / delta-rs
A native Rust library for Delta Lake, with bindings into Python
☆3,267 · Updated this week
facebookincubator / velox
A composable and fully extensible C++ execution engine library for data management systems.
☆4,176 · Updated this week
JanKaul / iceberg-rust
Unofficial Rust implementation of Apache Iceberg with integration for DataFusion
☆241 · Updated this week
projectnessie / nessie-demos
Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.
☆32 · Updated this week
sodadata / soda-core
Data Contracts engine for the modern data stack. https://www.soda.io
☆2,396 · Updated this week
dbt-labs / dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application…
☆13,495 · Updated this week
linkedin / openhouse
Open Control Plane for Tables in Data Lakehouse
☆392 · Updated this week
amundsen-io / amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting…
☆4,782 · Jul 1, 2026 · Updated 3 weeks ago
datahub-project / datahub
The Context Platform for your Data and AI Stack
☆12,320 · Updated this week
apache / paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch …
☆3,346 · Updated this week