jd-opensource / xllm-service
A flexible serving framework that delivers efficient and fault-tolerant LLM inference for clustered deployments.
☆51Updated this week
Alternatives and similar repositories for xllm-service
Users who are interested in xllm-service are comparing it to the libraries listed below.
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.☆1,316Updated last week
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models☆1,862Updated this week
- A high-performance distributed deep learning system targeting large-scale and automated distributed training.☆320Updated last month
- Best practice for training LLaMA models in Megatron-LM☆660Updated last year
- A high-performance inference engine for LLMs, optimized for diverse AI accelerators.☆199Updated this week
- Disaggregated serving system for Large Language Models (LLMs).☆683Updated 4 months ago
- Must-read papers on improving efficiency for LLM serving clusters☆31Updated 3 months ago
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆844Updated last month
- DLRover: An Automatic Distributed Deep Learning System☆1,533Updated last week
- FlagScale is a large model toolkit based on open-sourced projects.☆349Updated this week
- Community maintained hardware plugin for vLLM on Ascend☆1,067Updated this week
- A flexible and efficient training framework for large-scale alignment tasks☆416Updated this week
- Fast inference from large language models via speculative decoding☆811Updated last year
- ☆25Updated 5 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆906Updated this week
- Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you hav…☆22Updated 5 months ago
- An Efficient "Factory" to Build Multiple LoRA Adapters☆337Updated 6 months ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…☆192Updated last month
- LongBench v2 and LongBench (ACL 25'&24')☆953Updated 7 months ago
- Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.☆356Updated 6 months ago
- Examples for Recommenders - easy to train and deploy on accelerated infrastructure.☆124Updated this week
- My learning notes/codes for ML SYS.☆3,515Updated this week
- A PyTorch Native LLM Training Framework☆861Updated last month
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗).☆525Updated last month
- [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.☆471Updated last year
- HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling☆475Updated last week
- ☆608Updated 3 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)☆309Updated 4 months ago
- Adaptive Topology Reconstruction for Robust Graph Representation Learning [Efficient ML Model]☆10Updated 6 months ago
- A large-scale simulation framework for LLM inference☆428Updated last month