BlazingDB / blazingsql
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
☆1,987 Updated 3 years ago
Alternatives and similar repositories for blazingsql
Users who are interested in blazingsql are comparing it to the libraries listed below.
- A GPU-powered real-time analytics storage and query engine. ☆3,065 Updated last year
- HeavyDB (formerly MapD/OmniSciDB) ☆3,032 Updated this week
- Dremio - the missing link in modern data ☆1,445 Updated 3 weeks ago
- The Universal Storage Engine ☆1,991 Updated this week
- ZetaSQL - Analyzer Framework for SQL ☆2,426 Updated 3 weeks ago
- Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f… ☆1,861 Updated last month
- Distributed Computing for AI Made Simple ☆1,048 Updated 2 years ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,258 Updated 8 months ago
- Vectorized processing for Apache Arrow ☆485 Updated 3 years ago
- Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions. ☆2,746 Updated last year
- Parsing and analysis of Vertica, Hive, and Presto SQL. ☆1,076 Updated 3 years ago
- A composable and fully extensible C++ execution engine library for data management systems. ☆3,918 Updated this week
- Spark RAPIDS plugin - accelerate Apache Spark with GPUs ☆939 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,399 Updated last week
- A Redis module for serving tensors and executing deep learning graphs ☆839 Updated 2 months ago
- vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage) ☆927 Updated 2 months ago
- Apache Drill is a distributed MPP query layer for self describing data ☆1,994 Updated last month
- Mirror of Apache MADlib ☆467 Updated last year
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆301 Updated last year
- Distributed SQL Engine in Python using Dask ☆408 Updated last year
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl… ☆815 Updated 2 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,116 Updated 6 months ago
- ☆1,676 Updated this week
- Apache Parquet Format ☆2,070 Updated last week
- Apache Impala ☆1,244 Updated last week
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆5,514 Updated this week
- TonY is a framework to natively run deep learning frameworks on Apache Hadoop.