catid / dora
Implementation of DoRA
☆ 291 · Updated 10 months ago
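For context, DoRA (Weight-Decomposed Low-Rank Adaptation) re-parameterizes each pretrained weight matrix into a trainable per-column magnitude and a direction that is updated through a LoRA-style low-rank term. A minimal NumPy sketch of that decomposition follows; the dimensions, variable names, and initialization are illustrative assumptions, not code from this repository:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 6, 4, 2                  # layer dims and LoRA rank
W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight

# LoRA low-rank update: delta_W = B @ A
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                  # B starts at zero, so delta_W starts at zero

# DoRA splits the weight into magnitude and direction:
# m is a trainable per-column magnitude, initialized to the column norms of W0.
m = np.linalg.norm(W0, axis=0)            # shape (d_in,)

def dora_weight(W0, A, B, m):
    V = W0 + B @ A                        # directional component (frozen W0 + low-rank update)
    col_norms = np.linalg.norm(V, axis=0) # per-column L2 norms
    return m * (V / col_norms)            # rescale each column to magnitude m

W = dora_weight(W0, A, B, m)
# At initialization B = 0, so the merged weight reproduces W0 exactly.
assert np.allclose(W, W0)
```

During fine-tuning only `m`, `A`, and `B` receive gradients, so the trainable-parameter count stays close to plain LoRA while magnitude and direction can adapt independently.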
Alternatives and similar repositories for dora:
Users interested in dora are comparing it to the libraries listed below.
- LoRA and DoRA from Scratch Implementations · ☆ 200 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients · ☆ 195 · Updated 8 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" · ☆ 230 · Updated 2 months ago
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" · ☆ 448 · Updated 11 months ago
- Official PyTorch implementation of QA-LoRA · ☆ 130 · Updated last year
- [ACL 2024] Progressive LLaMA with Block Expansion · ☆ 499 · Updated 10 months ago
- Repo for "Rho-1: Token-level Data Selection & Selective Pretraining of LLMs" · ☆ 405 · Updated 11 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) · ☆ 205 · Updated 10 months ago
- Multipack distributed sampler for fast padding-free training of LLMs · ☆ 187 · Updated 8 months ago
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction · ☆ 385 · Updated 9 months ago
- Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks · ☆ 142 · Updated 6 months ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… · ☆ 150 · Updated last year
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models · ☆ 230 · Updated 11 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" · ☆ 457 · Updated last year
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" · ☆ 156 · Updated 9 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind · ☆ 174 · Updated 7 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" · ☆ 123 · Updated 11 months ago
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning · ☆ 355 · Updated 8 months ago
- Official repository of "NEFTune: Noisy Embeddings Improve Instruction Finetuning" · ☆ 394 · Updated 10 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… · ☆ 313 · Updated 4 months ago
- X-LoRA: Mixture of LoRA Experts · ☆ 216 · Updated 8 months ago
- The official evaluation suite and dynamic data release for MixEval · ☆ 234 · Updated 5 months ago
- [NeurIPS 2024] Official repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models" · ☆ 212 · Updated last week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. · ☆ 313 · Updated last week
- Low-bit optimizers for PyTorch · ☆ 127 · Updated last year