xorbitsai / xorbits
Scalable Python DS & ML, in an API compatible & lightning fast way.
★ 1,125 · Updated this week
Related projects
Alternatives and complementary repositories for xorbits
- Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions. ★ 2,703 · Updated 10 months ago
- Reproducible development environment ★ 2,024 · Updated last month
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ★ 2,005 · Updated last month
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ★ 3,929 · Updated this week
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ★ 313 · Updated 3 months ago
- An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more ★ 717 · Updated this week
- A JupyterLab extension for displaying cell timings ★ 370 · Updated last month
- Build and share data reports in 100% Python ★ 1,381 · Updated last year
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ★ 1,933 · Updated 2 years ago
- A toolkit to run Ray applications on Kubernetes ★ 1,245 · Updated this week
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl… ★ 810 · Updated 7 months ago
- Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f… ★ 1,798 · Updated 11 months ago
- Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application ★ 1,224 · Updated this week
- Fastest library to load data from DB to DataFrames in Rust and Python ★ 1,995 · Updated this week
- Extended pickling support for Python objects ★ 1,656 · Updated 3 weeks ago
- A Python package for easy multiprocessing, but faster than multiprocessing ★ 2,015 · Updated 3 months ago
- Clean APIs for data cleaning. Python implementation of R package Janitor ★ 1,357 · Updated this week
- Lightweight and extensible compatibility layer between dataframe libraries! ★ 556 · Updated this week
- Distributed XGBoost on Ray ★ 143 · Updated 4 months ago
- Pandas DataFrames as Interactive DataTables ★ 795 · Updated last week
- A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews ★ 1,130 · Updated this week
- Real-time stream processing for python ★ 1,240 · Updated 4 months ago
- Fast NumPy array functions written in C ★ 1,072 · Updated 3 weeks ago
- Flexible Python configuration system. The last one you will ever need. ★ 1,974 · Updated 5 months ago
- Temporian is an open-source Python library for preprocessing ⚡ and feature engineering temporal data for machine learning applicati… ★ 674 · Updated 3 months ago
- EvalML is an AutoML library written in python. ★ 774 · Updated this week
- Intake is a lightweight package for finding, investigating, loading and disseminating data. ★ 1,011 · Updated last month
- Making data lake work for time series ★ 1,136 · Updated 2 months ago
- A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner ★ 2,534 · Updated 7 months ago
- Python package for statistical data animations ★ 347 · Updated last year