zyxxmu / cam

PyTorch implementation of our ICML 2024 paper, "CaM: Cache Merging for Memory-efficient LLMs Inference"

Related projects: