erika-n / GPTzip
An implementation of LLMzip using GPT-2
☆12 · Updated last year
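As a rough sketch of the LLMzip idea this repo implements (an assumption based on the general rank-coding approach, not GPTzip's actual code), each token can be replaced by its rank under GPT-2's next-token distribution; because most tokens are highly predictable, the ranks are mostly small integers that a standard entropy coder compresses well. The helper names `encode_ranks`/`decode_ranks` below are hypothetical:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def encode_ranks(text: str) -> list[int]:
    """Replace each token with its rank under GPT-2's next-token prediction."""
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    ranks = [ids[0].item()]  # first token stored verbatim: no context to predict it
    for i in range(1, len(ids)):
        with torch.no_grad():
            logits = model(ids[:i].unsqueeze(0)).logits[0, -1]
        order = torch.argsort(logits, descending=True)
        ranks.append((order == ids[i]).nonzero().item())
    return ranks

def decode_ranks(ranks: list[int]) -> str:
    """Invert encode_ranks by rerunning the same model on the growing prefix."""
    ids = torch.tensor([ranks[0]])
    for r in ranks[1:]:
        with torch.no_grad():
            logits = model(ids.unsqueeze(0)).logits[0, -1]
        order = torch.argsort(logits, descending=True)
        ids = torch.cat([ids, order[r].unsqueeze(0)])
    return tokenizer.decode(ids)

text = "Compression with language models works because most tokens are predictable."
ranks = encode_ranks(text)
print(decode_ranks(ranks) == text)  # ranks round-trip losslessly; feed them to any entropy coder
```

Decoding reruns the same model on the growing prefix, so the scheme is lossless as long as encoder and decoder use identical weights.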
Alternatives and similar repositories for GPTzip:
Users interested in GPTzip are comparing it to the libraries listed below.
- QuIP quantization ☆50 · Updated 11 months ago
- ☆46 · Updated last month
- The training notebooks that were similar to the original script used to train TinyMistral. ☆19 · Updated last year
- Model REVOLVER, a human in the loop model mixing system. ☆33 · Updated last year
- EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆246 · Updated 4 months ago
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆99 · Updated last year
- ☆107 · Updated last month
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆41 · Updated last year
- Modeling code for a BitNet b1.58 Llama-style model. ☆23 · Updated 9 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆54 · Updated 3 months ago
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation. ☆71 · Updated last year
- Merge safetensor files using the technique described in "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a… ☆76 · Updated 4 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆140 · Updated 5 months ago
- ☆40 · Updated last year
- Reorder-based post-training quantization for large language model ☆184 · Updated last year
- GPT-2 small trained on phi-like data ☆65 · Updated last year
- ☆15 · Updated 2 months ago
- ☆192 · Updated 2 months ago
- Train Llama Loras Easily ☆30 · Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆96 · Updated 4 months ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆150 · Updated last year
- RWKV-7: Surpassing GPT ☆79 · Updated 3 months ago
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- PB-LLM: Partially Binarized Large Language Models ☆151 · Updated last year
- ☆68 · Updated 11 months ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆111 · Updated last year
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆266 · Updated last year
- ☆27 · Updated last year
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache ☆272 · Updated last month