cylondata / cylon
Cylon is a fast, scalable, distributed-memory parallel runtime with a Pandas-like DataFrame.
☆298 Updated 8 months ago
Alternatives and similar repositories for cylon:
Users who are interested in cylon are comparing it to the libraries listed below.
- RAPIDS GPU-BDB ☆108 Updated 11 months ago
- Distributed SQL Engine in Python using Dask ☆400 Updated 6 months ago
- Vectorized processing for Apache Arrow ☆484 Updated 3 years ago
- Ibis Substrait Compiler ☆99 Updated this week
- Python bindings for UCX ☆126 Updated this week
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆194 Updated this week
- Pandas ExtensionDType/Array backed by Apache Arrow ☆229 Updated 2 years ago
- Distributed SQL Query Engine in Python using Ray ☆244 Updated 5 months ago
- Flow with FlorDB 🌻 ☆154 Updated 3 weeks ago
- Unified Distributed Execution ☆51 Updated 4 months ago
- Core C++ Sketch Library ☆229 Updated last week
- ☆105 Updated last year
- Apache DataFusion Python Bindings ☆418 Updated this week
- Utilities for Dask and CUDA interactions ☆299 Updated this week
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆326 Updated this week
- Apache Parquet ☆443 Updated 9 months ago
- Deploy dask on YARN clusters ☆69 Updated 6 months ago
- Ray provider for Apache Airflow ☆47 Updated last year
- [ARCHIVED] C GPU DataFrame Library ☆138 Updated 6 years ago
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends. ☆233 Updated 5 months ago
- Ray-based Apache Beam runner ☆43 Updated last year
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆424 Updated 3 years ago
- Distributed XGBoost on Ray ☆147 Updated 8 months ago
- Apache Arrow Cookbook ☆101 Updated last week
- In-Memory Analytics with Apache Arrow, published by Packt ☆96 Updated last year
- Point-in-Time optimizations for Apache Spark ☆29 Updated last year
- [ARCHIVED] Dask support for distributed GDF object --> Moved to cudf ☆136 Updated 5 years ago
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure and more... ☆138 Updated last month
- A distributed block-based data storage and compute engine ☆154 Updated 2 weeks ago
- Reproducible benchmark of database-like ops ☆331 Updated last year