spcl / QuaRot

Code for the NeurIPS 2024 paper "QuaRot", a method for end-to-end 4-bit inference of large language models.
284 · Updated 3 months ago

Related projects

Alternatives and complementary repositories for QuaRot