Summary of some awesome work for optimizing LLM inference
☆208 · Feb 14, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for LLM-inference-optimization-paper
Users who are interested in LLM-inference-optimization-paper are comparing it to the repositories listed below
- Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of pap… ☆283 · Mar 6, 2025 · Updated last year
- 📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉 ☆5,040 · Feb 27, 2026 · Updated last week
- ☆30 · May 28, 2024 · Updated last year
- Curated collection of papers in machine learning systems ☆515 · Feb 7, 2026 · Updated last month
- A throughput-oriented high-performance serving framework for LLMs ☆947 · Oct 29, 2025 · Updated 4 months ago
- Curated collection of papers in MoE model inference ☆345 · Oct 20, 2025 · Updated 4 months ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…