delta-incubator / deltaray
Delta reader for the Ray open-source toolkit for building ML applications
☆45 Updated last year
Alternatives and similar repositories for deltaray:
Users who are interested in deltaray are comparing it to the libraries listed below.
- Unity Catalog UI ☆39 Updated 6 months ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆88 Updated last week
- A Table format agnostic data sharing framework ☆38 Updated last year
- ☆55 Updated last year
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 7 months ago
- Delta Acceptance Testing ☆20 Updated 7 months ago
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ☆21 Updated 9 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated 2 weeks ago
- DB API 2 interface for Flight SQL with SQLAlchemy extras. ☆37 Updated 5 months ago
- A Delta Lake reader for Dask ☆49 Updated 5 months ago
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆77 Updated 5 months ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 Updated last year
- Dask integration for Snowflake ☆30 Updated 3 months ago
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 6 months ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆23 Updated last year
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆31 Updated 2 years ago
- A dbt adapter for Decodable ☆12 Updated 3 weeks ago
- PySpark schema generator ☆42 Updated 2 years ago
- Read Delta tables without any Spark ☆47 Updated last year
- The Internals of Spark on Kubernetes ☆70 Updated 2 years ago
- Demos of Materialize, the operational data warehouse. ☆51 Updated last week
- Docker environment to stream data from Kafka to Iceberg tables ☆25 Updated last year
- do-anything, run-anywhere pandas-style pipelines ☆94 Updated this week
- A Python Library to support running data quality rules while the spark job is running⚡ ☆176 Updated this week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆94 Updated last week
- ☆16 Updated 8 months ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 Updated 4 years ago
- Trino (f.k.a PrestoSQL) dialect for SQLAlchemy. ☆25 Updated 2 years ago
- Data Catalog for Databases and Data Warehouses ☆33 Updated last year