lyogavin / airllm
AirLLM 70B inference with single 4GB GPU
☆5,777Updated 3 weeks ago
Alternatives and similar repositories for airllm
Users that are interested in airllm are comparing it to the libraries listed below
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs.☆6,427Updated this week
- Python bindings for llama.cpp☆9,168Updated 3 weeks ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs☆4,187Updated 2 weeks ago
- Tools for merging pretrained large language models.☆5,754Updated last week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.☆4,856Updated last month
- SGLang is a fast serving framework for large language models and vision language models.☆14,667Updated this week
- A state-of-the-art-level open visual language model | multimodal pretrained model☆6,562Updated last year
- Large Language Model Text Generation Inference☆10,155Updated this week
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach…☆5,111Updated 2 months ago
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…☆3,247Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆8,504Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs☆10,446Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆48,531Updated this week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs☆2,985Updated last week
- Go ahead and axolotl questions☆9,470Updated this week
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.☆8,226Updated this week
- Accessible large language models via k-bit quantization for PyTorch.☆7,088Updated this week
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.☆5,938Updated 9 months ago
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath☆9,411Updated 9 months ago
- The RedPajama-Data repository contains code for preparing large datasets for training large language models.☆4,727Updated 5 months ago
- High-speed Large Language Model Serving for Local Deployment☆8,213Updated 3 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…☆2,712Updated this week
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati…☆10,586Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆8,967Updated last week
- Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, RAG. We als…☆17,395Updated this week
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.☆11,304Updated this week
- Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).☆6,917Updated 3 months ago
- An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)☆4,568Updated this week
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:☆2,174Updated 2 weeks ago
- Run GGUF models easily with a KoboldAI UI. One File. Zero Install.☆7,426Updated this week