apache / hudi-rs
The native Rust implementation for Apache Hudi, with Python API bindings.
☆194Updated last week
Alternatives and similar repositories for hudi-rs:
Users who are interested in hudi-rs are comparing it to the libraries listed below:
- Apache DataFusion Ray☆158Updated this week
- A collection of RBIR projects and posts for anyone interested in joining this journey.☆223Updated this week
- Pure Rust Iceberg Implementation☆164Updated 6 months ago
- A native Delta implementation for integration with any query engine☆188Updated this week
- Rust implementation of Apache Iceberg with integration for Datafusion☆145Updated this week
- Apache Iceberg☆821Updated this week
- Apache Paimon Rust: the Rust implementation of Apache Paimon.☆110Updated 4 months ago
- Lakekeeper: A Rust native Iceberg REST Catalog☆448Updated this week
- Open, Multi-modal Catalog for Data & AI, written in Rust☆76Updated 4 months ago
- Apache Spark Connect Client for Rust☆101Updated 2 weeks ago
- ☆41Updated this week
- Apache DataFusion Comet Spark Accelerator☆890Updated this week
- Apache DataFusion Python Bindings☆414Updated this week
- Allow DataFusion to resolve queries across remote query engines while pushing down as much compute as possible.☆96Updated this week
- DataFusion TableProviders for reading data from other systems☆81Updated this week
- ☆215Updated this week
- Distributed SQL Query Engine in Python using Ray☆243Updated 4 months ago
- A highly efficient daemon for streaming data from Kafka into Delta Lake☆390Updated 3 weeks ago
- A User-Defined Function Framework for Apache Arrow.☆86Updated last week
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆75Updated this week
- Pythonic Iceberg REST Catalog☆72Updated 5 months ago
- Boring Data Tool☆213Updated 10 months ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆227Updated this week
- A collection of demonstrations showcasing how stream processing can be used to solve real-world problems.☆100Updated last week
- A Spark Connector that reads data from and writes data to Arrow Flight endpoints using Arrow Flight and Flight SQL☆39Updated 4 months ago
- Database connectivity API standard and libraries for Apache Arrow☆407Updated this week
- High-performance Stream Processing Framework. An alternative to Apache Flink.☆444Updated last year
- Implements a gateway that speaks the SparkConnect protocol and drives a backend using Substrait (over ADBC Flight SQL).☆16Updated last week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg☆318Updated last year
- Gluten: Plugin to Boost Trino's Performance☆70Updated last year