ray-project / enhancements

Tracking Ray Enhancement Proposals

☆53

Alternatives and similar repositories for enhancements

Users who are interested in enhancements are comparing it to the repositories listed below.

ray-project / mobius
Mobius is an AI infrastructure platform for distributed online learning, including online sample processing, training and serving.
☆98 Updated last year
oap-project / raydp
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
☆343 Updated 3 weeks ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆447 Updated 2 years ago
feature-store / ralf
☆30 Updated 2 years ago
JiahaoYao / awesome-ray
Ray - A curated list of resources: https://github.com/ray-project/ray
☆66 Updated last month
pytorch / torchx
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…
☆378 Updated last week
ray-project / ray_beam_runner
Ray-based Apache Beam runner
☆41 Updated last year
ray-project / plasma
A minimal shared memory object store design
☆53 Updated 8 years ago
ray-project / xgboost_ray
Distributed XGBoost on Ray
☆149 Updated last year
ray-project / pygloo
Pygloo provides Python bindings for Gloo.
☆21 Updated 3 weeks ago
intel / llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆128 Updated 3 weeks ago
antgroup / ant-ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. AntRay i…
☆135 Updated this week
ray-project / distml
Distributed ML Optimizer
☆32 Updated 4 years ago
rapidsai / ucx-py
Python bindings for UCX
☆137 Updated last week
pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆158 Updated last month
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆502 Updated this week
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆259 Updated 2 years ago
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆113 Updated 2 years ago
ray-project / ray-educational-materials
This is a suite of hands-on training materials that shows how to scale CV, NLP, and time-series forecasting workloads with Ray.
☆422 Updated last year
google / nccl-fastsocket
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.
☆119 Updated last year
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆133 Updated last month
NVIDIA / cuda-checkpoint
CUDA checkpoint and restore utility
☆353 Updated 6 months ago
maxpumperla / learning_ray
Notebooks for the O'Reilly book "Learning Ray"
☆313 Updated last year
facebookresearch / param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…
☆147 Updated last week
exoshuffle / cloudsort
Exoshuffle-CloudSort
☆26 Updated 2 years ago
AdrianBZG / LLM-distributed-finetune
Efficiently tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate the …
☆59 Updated 2 years ago
NVIDIA-Merlin / distributed-embeddings
distributed-embeddings is a library for building large embedding-based models in TensorFlow 2.
☆44 Updated last year
NVIDIA / nvidia-resiliency-ext
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …
☆196 Updated last week
NVIDIA-Merlin / HierarchicalKV
HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of…
☆163 Updated this week
ucbrise / hypersched
Deadline-based hyperparameter tuning on Ray Tune.
☆31 Updated 5 years ago