sterlind / GPTQ-for-LLaMa
4-bit quantization of LLaMa using GPTQ
☆12 Updated 2 years ago
Alternatives and similar repositories for GPTQ-for-LLaMa
Users who are interested in GPTQ-for-LLaMa are comparing it to the libraries listed below.
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆123 Updated 2 years ago
- Harnessing the Memory Power of the Camelids ☆147 Updated 2 years ago
- GPT-2 small trained on phi-like data ☆67 Updated last year
- Train Large Language Models (LLM) using LoRA ☆26 Updated 2 years ago
- A prompt/context management system ☆170 Updated 2 years ago
- An Autonomous LLM Agent that runs on Wizcoder-15B ☆333 Updated last year
- An autonomous AI agent extension for Oobabooga's web ui ☆173 Updated 2 years ago
- An experimental open-source attempt to make GPT-4 fully autonomous. ☆98 Updated 2 years ago
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆108 Updated 2 years ago
- Text WebUI extension to add clever Notebooks to Chat mode ☆143 Updated 2 months ago
- ☆166 Updated 2 years ago
- ☆12 Updated last year
- Memoria is a human-inspired memory architecture for neural networks. ☆76 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆76 Updated last year
- An OpenAI-like LLaMA inference API ☆113 Updated 2 years ago
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer ☆12 Updated 2 years ago
- Merge Transformers language models by use of gradient parameters. ☆207 Updated last year
- A guidance language for controlling large language models. ☆45 Updated 2 years ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆106 Updated last year
- A multimodal, function calling powered LLM webui. ☆216 Updated last year
- A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion. ☆308 Updated 2 years ago
- BabyAGI to run with locally hosted models using the API from https://github.com/oobabooga/text-generation-webui ☆87 Updated 2 years ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 Updated 2 years ago
- ☆73 Updated 2 years ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆160 Updated 2 years ago
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆28 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 Updated 2 years ago
- A discord bot that roleplays! ☆150 Updated 2 years ago
- ☆415 Updated last year
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation. ☆70 Updated 2 years ago