cylondata / cylon
Cylon is a fast, scalable, distributed-memory parallel runtime with a Pandas-like DataFrame.
☆299 Updated 9 months ago
Alternatives and similar repositories for cylon:
Users who are interested in cylon are comparing it to the libraries listed below:
- RAPIDS GPU-BDB ☆108 Updated last year
- Distributed SQL Engine in Python using Dask ☆403 Updated 7 months ago
- Ibis Substrait Compiler ☆100 Updated this week
- Vectorized processing for Apache Arrow ☆484 Updated 3 years ago
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆198 Updated this week
- Core C++ Sketch Library ☆230 Updated last month
- ☆105 Updated last year
- Python bindings for UCX ☆126 Updated last week
- Distributed SQL Query Engine in Python using Ray ☆243 Updated 6 months ago
- Pandas ExtensionDType/Array backed by Apache Arrow ☆229 Updated 2 years ago
- A distributed block-based data storage and compute engine ☆154 Updated last month
- Flow with FlorDB 🌻 ☆155 Updated last month
- Unified Distributed Execution ☆51 Updated 5 months ago
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆425 Updated 3 years ago
- Utilities for Dask and CUDA interactions ☆302 Updated last week
- Python binding for DataFusion ☆59 Updated 2 years ago
- A composable framework for fast and scalable data analytics ☆57 Updated 2 years ago
- A repo for all spark examples using Rapids Accelerator including ETL, ML/DL, etc. ☆146 Updated last week
- Apache DataFusion Python Bindings ☆429 Updated last week
- A library that provides useful extensions to Apache Spark and PySpark. ☆221 Updated last week
- An Aspiring Drop-In Replacement for Pandas at Scale ☆75 Updated 3 years ago
- Distributed XGBoost on Ray ☆147 Updated 9 months ago
- Apache Arrow Cookbook ☆101 Updated last month
- Apache Parquet ☆443 Updated 10 months ago
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper) ☆238 Updated 10 months ago
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆328 Updated 2 weeks ago
- User tools for Spark RAPIDS ☆61 Updated last week
- Hops Hadoop is a distribution of Apache Hadoop with distributed metadata. ☆311 Updated 2 months ago
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends. ☆241 Updated 6 months ago
- Ray provider for Apache Airflow ☆47 Updated last year