UbiquitousLearning / Paper-list-resource-efficient-large-language-model

☆100

Alternatives and similar repositories for Paper-list-resource-efficient-large-language-model

Users who are interested in Paper-list-resource-efficient-large-language-model are comparing it to the repositories listed below.

Kyrie-Zhao / awesome-real-time-AI
This is a list of awesome edge AI inference-related papers.
☆97 Updated last year
UbiquitousLearning / Efficient_Foundation_Model_Survey
Survey Paper List - Efficient LLM and Foundation Models
☆253 Updated 10 months ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆90 Updated 2 years ago
chenhongyu2048 / LLM-inference-optimization-paper
Summary of some awesome work for optimizing LLM inference
☆92 Updated 2 months ago
tonyzhao-jt / LLM-PQ
Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization"
☆34 Updated last month
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆27 Updated last year
MoE-Inf / awesome-moe-inference
Curated collection of papers in MoE model inference
☆220 Updated this week
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆35 Updated 2 years ago
tiingweii-shii / Awesome-Resource-Efficient-LLM-Papers
a curated list of high-quality papers on resource-efficient LLMs 🌱
☆132 Updated 4 months ago
Hsword / Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …
☆117 Updated last year
galeselee / Awesome_LLM_System-PaperList
Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap…
☆264 Updated 5 months ago
DicardoX / Research-Space
This repository is established to store personal notes and annotated papers during daily research.
☆138 Updated last week
TreeAI-Lab / Awesome-KV-Cache-Management
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…
☆172 Updated last week
qipengwang / Melon
MobiSys#114
☆21 Updated last year
SymbioticLab / Oobleck
A resilient distributed training framework
☆95 Updated last year
thu-pacman / FasterMoE
☆86 Updated 3 years ago
falcon-xu / early-exit-papers
A curated list of early exiting (LLM, CV, NLP, etc)
☆58 Updated 11 months ago
lambda7xx / awesome-AI-system
paper and its code for AI System
☆318 Updated 3 months ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆83 Updated 2 years ago
smart-lty / ParallelSpeculativeDecoding
[ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length
☆102 Updated 3 months ago
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆125 Updated last year
xumengwei / Edge-AI-Paper-List
☆205 Updated last year
LiuXiaoxuanPKU / OSD
☆54 Updated 8 months ago
HPMLL / BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆190 Updated 2 weeks ago
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆53 Updated 11 months ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆108 Updated last year
Zefan-Cai / Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
☆343 Updated 5 months ago
Hsword / Awesome-Machine-Learning-System-Papers
☆74 Updated 3 years ago
UChi-JCL / CacheGen
☆115 Updated 9 months ago
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆51 Updated 9 months ago