apache/datafusion-python
Apache DataFusion Python Bindings
☆400 Updated this week
Alternatives and similar repositories for datafusion-python:
Users who are interested in datafusion-python are comparing it to the libraries listed below.
- Database connectivity API standard and libraries for Apache Arrow ☆393 Updated this week
- A native Delta implementation for integration with any query engine ☆174 Updated this week
- Apache DataFusion Ray ☆139 Updated 3 weeks ago
- Apache Iceberg ☆778 Updated this week
- Lakekeeper: A Rust native Iceberg REST Catalog ☆377 Updated this week
- Distributed SQL Engine in Python using Dask ☆396 Updated 4 months ago
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends. ☆224 Updated 3 months ago
- The native Rust implementation for Apache Hudi, with Python API bindings. ☆185 Updated this week
- Rust implementation of Apache Iceberg with integration for Datafusion ☆133 Updated this week
- Apache PyIceberg ☆551 Updated this week
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆76 Updated 3 months ago
- ☆191 Updated last week
- Quickly view your data ☆292 Updated last week
- Distributed SQL Query Engine in Python using Ray ☆240 Updated 3 months ago
- Boring Data Tool ☆213 Updated 9 months ago
- DuckDB extension for Delta Lake ☆152 Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,618 Updated this week
- LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads. ☆601 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,238 Updated this week
- Python bindings for sqlparser-rs ☆173 Updated 2 months ago
- Apache DataFusion Comet Spark Accelerator ☆866 Updated this week
- Turning PySpark Into a Universal DataFrame API ☆349 Updated this week
- Read Apache Arrow batches from ODBC data sources in Python ☆61 Updated this week
- A purely experimental DuckDB Deltalake extension ☆94 Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆315 Updated last year
- Template for DuckDB extensions to help you develop, test and deploy a custom extension ☆160 Updated last week
- New file format for storage of large columnar datasets. ☆464 Updated this week
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆172 Updated this week
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆379 Updated this week
- A command line tool to query an ODBC data source and write the result into a parquet file. ☆230 Updated this week