ray-project / Ray-Connect
Material for the Ray Connect 2024 conference
☆12 · Updated last year
Alternatives and similar repositories for Ray-Connect
Users interested in Ray-Connect are comparing it to the libraries listed below.
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads. AntRay i… ☆165 · Updated this week
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray ☆131 · Updated 4 months ago
- TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models. ☆99 · Updated 2 years ago
- The driver for LMCache core to run in vLLM ☆60 · Updated last year
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆263 · Updated this week
- Modular and structured prompt caching for low-latency LLM inference ☆110 · Updated last year
- Mobius is an AI infrastructure platform for distributed online learning, including online sample processing, training, and serving. ☆100 · Updated last year
- GLake: optimizing GPU memory management and IO transmission. ☆497 · Updated 10 months ago
- Resources from the Ray Forward Meetup ☆30 · Updated last month
- ☆56 · Updated last year
- A PyTorch distributed training acceleration framework ☆55 · Updated 5 months ago
- A high-performance serving system for DeepRec based on TensorFlow Serving. ☆19 · Updated 2 years ago
- ☆523 · Updated 2 weeks ago
- A toolchain built around Megatron-LM for distributed training ☆84 · Updated 2 months ago
- Efficient and easy multi-instance LLM serving ☆524 · Updated 5 months ago
- OpenEmbedding is an open-source framework for TensorFlow distributed training acceleration. ☆33 · Updated 2 years ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆120 · Updated last year
- ☆20 · Updated 8 months ago
- Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training. ☆271 · Updated 2 years ago
- ☆27 · Updated 9 months ago
- KV cache store for distributed LLM inference ☆390 · Updated 2 months ago
- Stateful LLM serving ☆95 · Updated 10 months ago
- LLM serving performance evaluation harness ☆83 · Updated 11 months ago
- ☆61 · Updated last year
- A low-latency and high-throughput serving engine for LLMs ☆470 · Updated last month
- SpotServe: serving generative large language models on preemptible instances ☆135 · Updated last year
- A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from trainin… ☆131 · Updated last month
- ☆48 · Updated last year
- Virtualized elastic KV cache for dynamic GPU sharing and beyond ☆773 · Updated this week
- [OSDI'24] Serving LLM-based applications efficiently with Semantic Variable ☆209 · Updated last year