AlpinDale / gptq-gptj
Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation.
☆15 Updated 2 years ago
Alternatives and similar repositories for gptq-gptj:
Users who are interested in gptq-gptj are comparing it to the libraries listed below.
- QLoRA with Enhanced Multi GPU Support ☆36 Updated last year
- ☆27 Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 2 months ago
- 4 bits quantization of SantaCoder using GPTQ ☆51 Updated last year
- Experimental sampler to make LLMs more creative ☆30 Updated last year
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆25 Updated last year
- ☆73 Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 11 months ago
- ☆12 Updated 6 months ago
- ☆40 Updated 2 years ago
- Code and models for BERT on STILTs ☆53 Updated 2 years ago
- Score LLM pretraining data with classifiers ☆54 Updated last year
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆41 Updated 2 years ago
- Large-Language-Model to Machine Interface project. ☆18 Updated last year
- ☆49 Updated last year
- An AI character interaction system with emotional modeling and advanced memory management ☆15 Updated 5 months ago
- The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t… ☆23 Updated 5 months ago
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 11 months ago
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated last year
- A synthetic story narration dataset to study small audio LMs. ☆32 Updated last year
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 10 months ago
- Demonstration that fine-tuning a RoPE model on longer sequences than those it was pre-trained on extends the model's context limit ☆63 Updated last year
- ☆22 Updated last year
- Trying to deconstruct RWKV in understandable terms ☆14 Updated last year
- Tools for formatting large language model prompts. ☆12 Updated last year
- ☆53 Updated 9 months ago
- ☆26 Updated last year
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆36 Updated last year
- Fast approximate inference on a single GPU with sparsity aware offloading ☆38 Updated last year