ray-project / deltacat
A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your ML and analytics workloads.
☆202 Updated last week
Alternatives and similar repositories for deltacat:
Users who are interested in deltacat are comparing it to the libraries listed below.
- Ibis Substrait Compiler ☆102 Updated this week
- ☆250 Updated this week
- Distributed SQL Query Engine in Python using Ray ☆243 Updated 6 months ago
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends. ☆243 Updated 6 months ago
- Apache DataFusion Ray ☆183 Updated last week
- Ray-based Apache Beam runner ☆43 Updated last year
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆330 Updated this week
- ☆38 Updated this week
- Apache DataFusion Python Bindings ☆439 Updated this week
- Distributed SQL Engine in Python using Dask ☆401 Updated 7 months ago
- A native Delta implementation for integration with any query engine ☆218 Updated this week
- DuckDB extension for Delta Lake ☆176 Updated last week
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper) ☆238 Updated last week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆327 Updated 2 years ago
- Template for DuckDB extensions to help you develop, test and deploy a custom extension ☆183 Updated last month
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆78 Updated 6 months ago
- Deferred computational framework for multi-engine pipelines ☆220 Updated this week
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆110 Updated last month
- New file format for storage of large columnar datasets. ☆505 Updated last week
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- A purely experimental DuckDB Deltalake extension ☆95 Updated this week
- Database connectivity API standard and libraries for Apache Arrow ☆428 Updated this week
- Apache DataFusion Benchmarks ☆18 Updated last week
- Point-in-Time optimizations for Apache Spark ☆29 Updated last year
- The native Rust implementation for Apache Hudi, with Python API bindings. ☆207 Updated this week
- Apache Iceberg C++ ☆58 Updated this week
- Rust implementation of Apache Iceberg with integration for Datafusion ☆162 Updated this week
- Proof-of-concept extension combining the delta extension with Unity Catalog ☆81 Updated last month
- Apache Arrow Flight SQL adapter for PostgreSQL ☆82 Updated 3 weeks ago
- ☆68 Updated 3 months ago