ray-project / Ray-Connect
Material for Ray Connect 2024 Conference
☆12 · Updated last year
Alternatives and similar repositories for Ray-Connect
Users interested in Ray-Connect are comparing it to the libraries listed below.
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated last month
- The driver for LMCache core to run in vLLM ☆58 · Updated 9 months ago
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. AntRay i… ☆157 · Updated this week
- Modular and structured prompt caching for low-latency LLM inference ☆102 · Updated last year
- LLM Serving Performance Evaluation Harness ☆80 · Updated 8 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆230 · Updated this week
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆119 · Updated last year
- ☆97 · Updated 7 months ago
- ☆56 · Updated last year
- ☆57 · Updated last year
- LLM Inference benchmark ☆429 · Updated last year
- Efficient and easy multi-instance LLM serving ☆510 · Updated 2 months ago
- Omni_Infer is a suite of inference accelerators designed for the Ascend NPU platform, offering native support and an expanding feature se… ☆86 · Updated this week
- TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models. ☆97 · Updated 2 years ago
- Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training. ☆270 · Updated 2 years ago
- ☆130 · Updated 10 months ago
- ☆48 · Updated last year
- ☆31 · Updated 7 months ago
- A low-latency & high-throughput serving engine for LLMs ☆445 · Updated last month
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆268 · Updated 3 months ago
- PyTorch distributed training acceleration framework ☆53 · Updated 3 months ago
- ☆27 · Updated 7 months ago
- GLake: optimizing GPU memory management and IO transmission. ☆489 · Updated 7 months ago
- Stateful LLM Serving ☆88 · Updated 8 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving (see the sketch after this list) ☆79 · Updated last year
- ☆431 · Updated 2 months ago
- Mobius is an AI infrastructure platform for distributed online learning, including online sample processing, training and serving. ☆100 · Updated last year
- Common recipes to run vLLM ☆236 · Updated last week
- KV cache store for distributed LLM inference ☆361 · Updated last week
- ☆512 · Updated 2 months ago
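
For readers unfamiliar with the vLLM-on-Ray-Serve pattern mentioned in the list above, here is a minimal sketch of what such an integration typically looks like. It uses vLLM's offline `LLM` API inside a Ray Serve deployment; the deployment class, model name, and route are illustrative assumptions, not code from any of the listed repositories.

```python
# Minimal sketch: serving a vLLM model behind Ray Serve (illustrative only).
# Assumes `pip install "ray[serve]" vllm`; the model name and route are placeholders.
from starlette.requests import Request

from ray import serve
from vllm import LLM, SamplingParams


@serve.deployment(num_replicas=1, ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self, model_name: str = "facebook/opt-125m"):
        # Load the model once per replica; vLLM manages KV-cache memory internally.
        self.llm = LLM(model=model_name)
        self.sampling_params = SamplingParams(temperature=0.8, max_tokens=128)

    async def __call__(self, request: Request) -> dict:
        # Expect a JSON body like {"prompt": "..."} and return the generated text.
        body = await request.json()
        outputs = self.llm.generate([body["prompt"]], self.sampling_params)
        return {"text": outputs[0].outputs[0].text}


# Deploy locally; the HTTP endpoint becomes available at http://127.0.0.1:8000/.
app = VLLMDeployment.bind()
serve.run(app)
```

Ray Serve handles replication and request routing, while each replica owns its own vLLM engine; production setups usually swap the blocking `generate` call for vLLM's async engine and add batching, but that is beyond this sketch.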