goliaro / specinfer-ae
☆26Updated last year
Alternatives and similar repositories for specinfer-ae
Users that are interested in specinfer-ae are comparing it to the libraries listed below
- ☆214Updated 2 months ago
- ☆58Updated last year
- ☆141Updated last week
- Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25)☆70Updated 8 months ago
- Curated collection of papers in MoE model inference☆320Updated 2 months ago
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆167Updated last year
- Summary of some awesome work for optimizing LLM inference☆151Updated 3 weeks ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)☆50Updated last year
- LLM Inference analyzer for different hardware platforms☆97Updated 3 weeks ago
- ☆29Updated 8 months ago
- ☆113Updated 2 years ago
- LLM serving cluster simulator☆127Updated last year
- [HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache.☆71Updated last week
- Large Language Model (LLM) Serving Paper and Resource List☆24Updated 7 months ago
- Explore Inter-layer Expert Affinity in MoE Model Inference☆16Updated last year
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale☆164Updated 5 months ago
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24)☆24Updated last year
- ☆163Updated last year
- ☆58Updated last year
- ☆126Updated last year
- ☆15Updated last year
- NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing☆105Updated last year
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"☆74Updated 2 months ago
- ☆82Updated last year
- ☆35Updated last year
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…☆259Updated 3 weeks ago
- ☆58Updated 6 months ago
- ☆12Updated last year
- ☆46Updated last year
- ☆21Updated last week