modular-ml / wrapyfi-examples_llama
Inference code for facebook LLaMA models with Wrapyfi support
☆130 · Updated last year
Related projects:
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆405 · Updated 9 months ago
- ☆533 · Updated 9 months ago
- ☆453 · Updated 11 months ago
- Train LLaMA with LoRA on a single RTX 4090 and merge the LoRA weights to work like Stanford Alpaca. ☆50 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆123 · Updated last year
- 4-bit quantization of SantaCoder using GPTQ ☆54 · Updated last year
- Tune any FALCON in 4-bit ☆469 · Updated last year
- Tune MPTs ☆84 · Updated last year
- Inference script for Meta's LLaMA models using Hugging Face wrapper ☆110 · Updated last year
- Just a simple HowTo for https://github.com/johnsmith0031/alpaca_lora_4bit ☆30 · Updated last year
- Simple, hackable and fast implementation for training/finetuning medium-sized LLaMA-based models ☆143 · Updated last week
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆109 · Updated last year
- A dataset featuring diverse dialogues between two ChatGPT (gpt-3.5-turbo) instances with system messages written by GPT-4. Covering vario… ☆165 · Updated last year
- ☆338 · Updated last year
- Merge Transformers language models using gradient parameters. ☆193 · Updated last month
- 4-bit quantization of LLaMA using GPTQ ☆129 · Updated last year
- A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick ☆283 · Updated 9 months ago
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆96 · Updated last year
- Instruct-tuning LLaMA on consumer hardware ☆66 · Updated last year
- Inference code for LLaMA models ☆45 · Updated last year
- Framework-agnostic Python runtime for RWKV models ☆144 · Updated last year
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆139 · Updated 7 months ago
- OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA ☆301 · Updated last year
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ☆347 · Updated last year
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆339 · Updated 6 months ago
- Fine-tune SantaCoder for Code/Text Generation. ☆182 · Updated last year
- Spherical merge of PyTorch/HF-format language models with minimal feature loss. ☆107 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆170 · Updated last month
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆423 · Updated 4 months ago
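Two of the entries above merge model weights geometrically rather than by plain averaging. A minimal sketch of spherical linear interpolation (SLERP) on flat weight vectors, using only plain Python lists — the function and representation are illustrative and are not taken from any of the repositories listed:

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two weight vectors v0 and v1 at ratio t."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # Angle between the two vectors, clamped for numerical safety.
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Midpoint between two orthogonal unit vectors stays on the unit sphere,
# which is the property that linear averaging loses (it would shrink norms).
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Real merging tools apply this per-tensor across two checkpoints; the list version above only illustrates why SLERP preserves weight magnitudes where naive averaging does not.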