catid / dora

Implementation of DoRA

☆299

Alternatives and similar repositories for dora

Users who are interested in dora are comparing it to the libraries listed below.

nikhilgsh / loraplus
☆220 Updated last year
Guitaricet / relora
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
☆458 Updated last year
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆198 Updated last year
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
☆146 Updated 10 months ago
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
☆388 Updated last year
rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆207 Updated last year
dingo-actual / infini-transformer
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention…
☆290 Updated last year
FasterDecoding / BitDelta
☆199 Updated 8 months ago
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
☆177 Updated 10 months ago
nbasyl / DoRA
Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
☆124 Updated last year
llm-random / llm-random
☆192 Updated last week
TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
☆507 Updated last year
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆156 Updated last year
NVlabs / Minitron
A family of compressed models obtained via pruning and knowledge distillation
☆347 Updated 8 months ago
EricLBuehler / xlora
X-LoRA: Mixture of LoRA Experts
☆232 Updated last year
zyushun / Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
☆431 Updated 2 months ago
arcee-ai / PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
☆243 Updated last year
Cohere-Labs-Community / parameter-efficient-moe
☆269 Updated last year
kongds / MoRA
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
☆358 Updated 11 months ago
microsoft / rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆428 Updated last year
yxli2123 / LoftQ
☆223 Updated last year
yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆138 Updated last year
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)
☆205 Updated last year
qwopqwop200 / gptqlora
GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ
☆103 Updated 2 years ago
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆167 Updated last year
Beomi / InfiniTransformer
Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with I…
☆367 Updated last year
NVlabs / DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
☆818 Updated 10 months ago
GraphPKU / PiSSA
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
☆367 Updated last month
imoneoi / multipack_sampler
Multipack distributed sampler for fast padding-free training of LLMs
☆199 Updated 11 months ago
Digitous / LLM-SLERP-Merge
Spherical Merge PyTorch/HF format Language Models with minimal feature loss.
☆135 Updated last year