facebookresearch / any4
Quantize transformers to any learned arbitrary 4-bit numeric format
☆49 · Updated 4 months ago
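To illustrate the idea behind "any learned arbitrary 4-bit numeric format": rather than snapping weights to a fixed grid (int4, fp4, nf4), one can learn a 16-entry codebook per tensor and store each weight as a 4-bit index into it. The sketch below is a hypothetical, minimal illustration using 1-D k-means; the function names are illustrative and are not the any4 API.

```python
# Minimal sketch of learned 4-bit (codebook) quantization.
# Assumption: a per-tensor 16-value codebook learned with 1-D k-means;
# this is NOT the actual any4 implementation, just the general technique.
import numpy as np

def learn_codebook(weights, iters=20):
    """Learn a 16-value codebook for `weights` via Lloyd's algorithm."""
    flat = weights.ravel()
    # Initialize centroids from quantiles so every code starts populated.
    codebook = np.quantile(flat, np.linspace(0.0, 1.0, 16))
    for _ in range(iters):
        # Assign each weight to its nearest centroid, then recenter.
        idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(16):
            members = flat[idx == k]
            if members.size:
                codebook[k] = members.mean()
    return codebook

def quantize(weights, codebook):
    """Encode each weight as the index (0..15) of its nearest codebook entry."""
    idx = np.abs(weights.ravel()[:, None] - codebook[None, :]).argmin(axis=1)
    return idx.astype(np.uint8).reshape(weights.shape)

def dequantize(indices, codebook):
    """Look the 4-bit indices back up in the codebook."""
    return codebook[indices]

# Round-trip a random weight matrix through the 4-bit codebook.
w = np.random.default_rng(1).normal(size=(64, 64))
cb = learn_codebook(w)
w_hat = dequantize(quantize(w, cb), cb)
```

In a real kernel the `uint8` indices would be packed two per byte and the 16-entry table dequantized on the fly; the point of a *learned* format is that the codebook adapts to each tensor's weight distribution instead of assuming one fixed set of representable values.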
Alternatives and similar repositories for any4
Users interested in any4 are comparing it to the repositories listed below.
- Framework to reduce autotune overhead to zero for well-known deployments ☆85 · Updated 2 months ago
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 · Updated last year
- ☆109 · Updated 6 months ago
- ☆71 · Updated 7 months ago
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024) ☆25 · Updated 4 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 · Updated 4 months ago
- ☆83 · Updated 9 months ago
- Transformer components, but in Triton ☆34 · Updated 6 months ago
- Autonomous GPU kernel generation via deep agents ☆137 · Updated this week