tonyzhao-jt / LLM-PQ

Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and Adaptive Quantization"
34 · Updated 2 weeks ago

Alternatives and similar repositories for LLM-PQ

Users who are interested in LLM-PQ are comparing it to the libraries listed below.
