catid / dora
Implementation of DoRA (Weight-Decomposed Low-Rank Adaptation)
☆282 · Updated 5 months ago
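For orientation before the related listings, here is a minimal sketch of the DoRA idea as the paper title describes it: the frozen pretrained weight is decomposed into a magnitude and a direction, a LoRA-style low-rank update is applied to the direction, and the magnitude is trained separately. The class name, initialization, and rank below are illustrative assumptions, not this repository's API.

```python
# Minimal DoRA-style linear layer (illustrative sketch, not this repo's code).
#   W' = m * (W0 + B @ A) / ||W0 + B @ A||_c
# where ||.||_c is the column-wise norm and m is a trainable magnitude vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        out_f, in_f = base.weight.shape
        # Frozen pretrained weight W0.
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = base.bias
        # LoRA factors: A random, B zero, so training starts exactly at W0.
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        # Trainable magnitude, initialized to the column-wise norm of W0.
        self.m = nn.Parameter(self.weight.norm(dim=0, keepdim=True))  # shape (1, in_f)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        v = self.weight + self.B @ self.A              # updated direction, (out_f, in_f)
        w = self.m * v / v.norm(dim=0, keepdim=True)   # renormalize columns, rescale by m
        return F.linear(x, w, self.bias)
```

Only `A`, `B`, and `m` receive gradients, so the trainable-parameter count stays LoRA-sized while the magnitude/direction split gives the extra degree of freedom DoRA trains.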
Related projects
Alternatives and complementary repositories for dora
- LoRA and DoRA from Scratch Implementations ☆188 · Updated 8 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆171 · Updated 3 months ago
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight) ☆261 · Updated 3 months ago
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models ☆194 · Updated 6 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆122 · Updated 6 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion ☆479 · Updated 5 months ago
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆433 · Updated 6 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆168 · Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆129 · Updated last month
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs ☆302 · Updated 6 months ago
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning (see the sketch after this list) ☆381 · Updated 5 months ago
- [ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation ☆619 · Updated last month
- Code for Adam-mini: Use Fewer Learning Rates To Gain More (https://arxiv.org/abs/2406.16793) ☆322 · Updated last week
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning ☆339 · Updated 3 months ago
- Expert-Specialized Fine-Tuning ☆143 · Updated last month
- Official PyTorch implementation of QA-LoRA ☆116 · Updated 7 months ago
- For releasing code related to compression methods for transformers, accompanying our publications ☆369 · Updated 3 weeks ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆435 · Updated 7 months ago
- Official repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models" ☆169 · Updated this week
- An Open Source Toolkit For LLM Distillation ☆350 · Updated last month
- FuseAI Project ☆448 · Updated 2 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆167 · Updated 3 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆200 · Updated 5 months ago
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models" ☆351 · Updated 3 weeks ago
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction ☆368 · Updated 4 months ago
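The NEFTune entry above is simple enough to sketch: per its paper, uniform noise scaled by alpha/sqrt(L·d) (sequence length L, embedding dimension d) is added to the embedding outputs during finetuning, and nothing else changes. This is an illustrative re-implementation, not code from the listed repository; the hook name and the alpha value are assumptions (the paper reports alpha in {5, 10, 15}).

```python
# Illustrative NEFTune sketch: add scaled uniform noise to embedding outputs
# during training only. Not taken from the listed repository.
import math
import torch
import torch.nn as nn

def neftune_forward_hook(module: nn.Embedding, inputs, output: torch.Tensor,
                         alpha: float = 5.0):
    # output: (batch, seq_len, dim) embedding activations
    if module.training:
        seq_len, dim = output.shape[-2], output.shape[-1]
        scale = alpha / math.sqrt(seq_len * dim)
        # Uniform noise in [-scale, scale], as described in the paper.
        output = output + torch.empty_like(output).uniform_(-scale, scale)
    return output  # returning a value from a forward hook replaces the output

# Usage (hypothetical model): attach the hook to the token-embedding layer.
# embed = model.get_input_embeddings()
# embed.register_forward_hook(lambda m, i, o: neftune_forward_hook(m, i, o, alpha=5.0))
```

Because the noise is applied only when `module.training` is true, inference is untouched; at evaluation time the model behaves exactly as if it had been finetuned without the hook.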