cylondata / cylon
Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame.
☆301, updated last year
Alternatives and similar repositories for cylon
Users who are interested in cylon are comparing it to the libraries listed below.
- Distributed SQL Engine in Python using Dask ☆406, updated 10 months ago
- RAPIDS GPU-BDB ☆108, updated last year
- Vectorized processing for Apache Arrow ☆485, updated 3 years ago
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆230, updated last week
- Ibis Substrait Compiler ☆103, updated this week
- Core C++ Sketch Library ☆234, updated last month
- Distributed SQL Query Engine in Python using Ray ☆243, updated 9 months ago
- Flow with FlorDB 🌻 ☆154, updated last month
- ☆106, updated 2 years ago
- Pandas ExtensionDType/Array backed by Apache Arrow ☆230, updated 2 years ago
- Apache Parquet ☆444, updated last year
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆341, updated last week
- Python binding for DataFusion ☆59, updated 2 years ago
- Distributed XGBoost on Ray ☆149, updated last year
- A distributed block-based data storage and compute engine ☆154, updated 5 months ago
- Ray provider for Apache Airflow ☆48, updated last year
- Hops Hadoop is a distribution of Apache Hadoop with distributed metadata. ☆316, updated 2 months ago
- Utilities for Dask and CUDA interactions ☆311, updated this week
- Ray-based Apache Beam runner ☆42, updated last year
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48, updated 2 years ago
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl… ☆809, updated 3 months ago
- [ARCHIVED] C GPU DataFrame Library ☆139, updated 6 years ago
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆638, updated 2 weeks ago
- reproducible benchmark of database-like ops ☆339, updated 2 years ago
- [ARCHIVED] Dask support for distributed GDF object --> Moved to cudf ☆136, updated 6 years ago
- An Aspiring Drop-In Replacement for Pandas at Scale ☆74, updated 3 years ago
- Unified Distributed Execution ☆54, updated 8 months ago
- Jupyter extensions for SWAN ☆58, updated last week
- Apache Arrow Cookbook ☆104, updated 2 months ago
- A repo for all spark examples using Rapids Accelerator including ETL, ML/DL, etc. ☆159, updated 2 weeks ago