lyogavin / airllm
AirLLM 70B inference with a single 4GB GPU
☆ 6,450 · Updated 3 months ago
Alternatives and similar repositories for airllm
Users interested in airllm are comparing it to the libraries listed below.
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆ 7,357 · Updated this week
- Tools for merging pretrained large language models. ☆ 6,533 · Updated last week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆ 5,001 · Updated 7 months ago
- Go ahead and axolotl questions ☆ 10,911 · Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs. ☆ 4,378 · Updated 3 months ago
- Multi-LoRA inference server that scales to thousands of fine-tuned LLMs. ☆ 3,552 · Updated 6 months ago
- Accessible large language models via k-bit quantization for PyTorch. ☆ 7,801 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. ☆ 13,000 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆ 10,778 · Updated last year
- A more memory-efficient rewrite of the Hugging Face Transformers implementation of Llama, for use with quantized weights.