qwopqwop200 / GPTQ-for-LLaMa

4-bit quantization of LLaMA using GPTQ

☆3,071

Alternatives and similar repositories for GPTQ-for-LLaMa

Users who are interested in GPTQ-for-LLaMa are comparing it to the libraries listed below.

IST-DASLab / gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
☆2,196 Updated last year
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,903 Updated 2 years ago
young-geng / EasyLM
Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆2,496 Updated last year
gururise / AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
☆1,572 Updated 2 years ago
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆4,951 Updated 6 months ago
henrywoo / pyllama
LLaMA: Open and Efficient Foundation Language Models
☆2,800 Updated last year
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,827 Updated 10 months ago
Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,330 Updated 2 years ago
tloen / llama-int8
Quantized inference code for LLaMA models
☆1,047 Updated 2 years ago
johnsmith0031 / alpaca_lora_4bit
☆534 Updated last year
teknium1 / GPTeacher
A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer
☆1,635 Updated 2 years ago
kuleshov-group / llmtools
Finetuning Large Language Models on One Consumer GPU in 2 Bits
☆729 Updated last year
henrywoo / chatllama
ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable in a single GPU. 15x faster training process than ChatGPT
☆1,203 Updated 8 months ago
casper-hansen / AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
☆2,254 Updated 5 months ago
RWKV / rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
☆1,547 Updated 6 months ago
deepspeedai / DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
☆2,063 Updated 3 months ago
randaller / llama-chat
Chat with Meta's LLaMA models at home made easy
☆837 Updated 2 years ago
sahil280114 / codealpaca
☆1,491 Updated 2 years ago
AetherCortex / Llama-X
Open Academic Research on Improving LLaMA to SOTA LLM
☆1,616 Updated 2 years ago
yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,495 Updated 2 years ago
marella / ctransformers
Python bindings for the Transformer models implemented in C/C++ using GGML library.
☆1,877 Updated last year
project-baize / baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
☆3,170 Updated last year
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,904 Updated last year
deep-diver / LLM-As-Chatbot
LLM as a Chatbot Service
☆3,339 Updated last year
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,074 Updated 3 months ago
OpenLMLab / LOMO
LOMO: LOw-Memory Optimization
☆987 Updated last year
stochasticai / xTuring
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-s…
☆2,659 Updated 2 weeks ago
lxe / simple-llm-finetuner
Simple UI for LLM Model Finetuning
☆2,065 Updated last year
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,709 Updated last year
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,647 Updated last week