jorgecarleitao / datafusion-python
A Python library to run analytics workloads with the performance of Rust, the flexibility of Python, and O(1) cost in moving data between the two. It uses the Apache Arrow in-memory format and Arrow's query engine, DataFusion.
☆61 · Updated 4 years ago
Alternatives and similar repositories for datafusion-python
Users who are interested in datafusion-python are comparing it to the libraries listed below.
- Experimental support for serializing DataFusion plans using substrait ☆45 · Updated 2 years ago
- Python binding for DataFusion ☆59 · Updated 2 years ago
- ☆55 · Updated last year
- Arrow, pydantic style ☆83 · Updated 2 years ago
- Batteries included CLI, TUI, and server implementations for DataFusion. ☆156 · Updated this week
- Query Plan Markup Language ☆45 · Updated last year
- S3 as an ObjectStore for DataFusion ☆62 · Updated 2 years ago
- JSON support for DataFusion (unofficial) ☆42 · Updated last week
- TPC-H benchmark data generation in pure Rust ☆91 · Updated 2 weeks ago
- Fill Apache Arrow record batches from an ODBC data source in Rust. ☆72 · Updated this week
- Derive for arrow2 ☆66 · Updated last year
- Generated Rust of Apache Arrow spec ☆17 · Updated 2 years ago
- Rust crate for Substrait: Cross-Language Serialization for Relational Algebra ☆71 · Updated this week
- DataFusion TableProviders for reading data from other systems ☆125 · Updated last week
- Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow ☆361 · Updated 10 months ago
- HDFS based on Java implementation as a remote ObjectStore for DataFusion ☆10 · Updated last year
- A purely experimental DuckDB Deltalake extension ☆95 · Updated last week
- Postgres protocol frontend for DataFusion ☆63 · Updated this week
- Optimizer for DataFusion based on the egg framework ☆14 · Updated 3 years ago
- Embeddable Aggregate Management System for Streams and Queries. ☆92 · Updated 2 months ago
- A reader that buffers ranged calls ☆12 · Updated 3 years ago
- Boring Data Tool ☆223 · Updated last year
- Results cache for Apache DataFusion ☆29 · Updated 7 months ago
- Allow DataFusion to resolve queries across remote query engines while pushing down as much compute as possible. ☆135 · Updated this week
- Serverless query engine ☆140 · Updated 2 years ago
- A set of tools for writing servers that speak PostgreSQL's wire protocol ☆93 · Updated this week
- ☆21 · Updated last year
- Connecting DataFusion to HDFS based on libhdfs3 ☆13 · Updated 3 years ago
- Robust data transformation tool using SQL ☆21 · Updated 2 years ago
- Fast S3 in Python using Rust ☆45 · Updated 2 years ago