dvlab-research / Q-LLM

This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
45 · Updated 6 months ago

Alternatives and similar repositories for Q-LLM:

Users who are interested in Q-LLM are comparing it to the libraries listed below.