antgroup / ant-ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads. AntRay is a fork of Ray that adds incremental new features on top of the community version.

☆133

Alternatives and similar repositories for ant-ray

Users who are interested in ant-ray are comparing it to the libraries listed below.

bytedance / primus
☆216 · Updated 2 years ago
ray-project / mobius
Mobius is an AI infrastructure platform for distributed online learning, including online sample processing, training and serving.
☆98 · Updated last year
bytedance / InfiniStore
KV cache store for distributed LLM inference
☆288 · Updated last month
antgroup / glake
GLake: optimizing GPU memory management and IO transmission.
☆470 · Updated 3 months ago
AlibabaPAI / llumnix
Efficient and easy multi-instance LLM serving
☆448 · Updated this week
alibaba / EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
☆267 · Updated 2 years ago
alibaba / TePDist
TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models.
☆94 · Updated 2 years ago
aliyun / alibabacloud-jindodata
alibabacloud-jindodata
☆196 · Updated this week
sgl-project / ome
OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
☆156 · Updated this week
ray-project / enhancements
Tracking Ray Enhancement Proposals
☆53 · Updated 3 months ago
kubedl-io / kubedl
Run your deep learning workloads on Kubernetes more easily and efficiently.
☆525 · Updated last year
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆459 · Updated this week
kubedl-io / morphling
Automatic tuning for ML model deployment on Kubernetes
☆80 · Updated 8 months ago
4paradigm / OpenEmbedding
OpenEmbedding is an open source framework for Tensorflow distributed training acceleration.
☆32 · Updated 2 years ago
elasticdeeplearning / edl
Elastic Deep Learning for deep learning framework on Kubernetes
☆173 · Updated 2 years ago
antgroup / vsag
vsag is a vector indexing library used for similarity search.
☆324 · Updated this week
HFAiLab / ffrecord
FireFlyer Record file format, writer and reader for DL training samples.
☆227 · Updated 2 years ago
oap-project / raydp
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
☆341 · Updated last week
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆809 · Updated last month
DeepRec-AI / HybridBackend
A high-performance framework for training wide-and-deep recommender systems on heterogeneous cluster
☆158 · Updated last year
PaddlePaddle / PaddleFlow
☆123 · Updated 4 months ago
baidu / puck
Puck is a high-performance ANN search engine
☆360 · Updated last month
microsoft / hivedscheduler
Kubernetes Scheduler for Deep Learning
☆263 · Updated 3 years ago
zw0610 / zw0610.github.io
☆58 · Updated 4 years ago
AlibabaPAI / torchacc
PyTorch distributed training acceleration framework
☆51 · Updated 5 months ago
Tencent / KsanaLLM
☆455 · Updated last week
volcengine / veScale
A PyTorch Native LLM Training Framework
☆829 · Updated 6 months ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆208 · Updated 4 years ago
Qihoo360 / dgl-operator
The DGL Operator makes it easy to run Deep Graph Library (DGL) graph neural network training on Kubernetes
☆44 · Updated 3 years ago
kleveross / ftlib
Fault-tolerant for DL frameworks
☆70 · Updated 2 years ago