AutonomicPerfectionist / PipeInfer
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
☆28 Updated 6 months ago
Alternatives and similar repositories for PipeInfer
Users who are interested in PipeInfer are comparing it to the libraries listed below.
- ☆45 Updated 10 months ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization"