danbider / lora-tradeoffs
Information and artifacts for "LoRA Learns Less and Forgets Less" (TMLR, 2024)
☆16 · Updated last year
Alternatives and similar repositories for lora-tradeoffs
Users interested in lora-tradeoffs are comparing it to the repositories listed below.
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆84 · Updated 11 months ago
- ☆33 · Updated 9 months ago
- ☆19 · Updated 6 months ago
- ☆86 · Updated last year
- ☆51 · Updated 7 months ago
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024. ☆103 · Updated last year
- nanoGPT-like codebase for LLM training ☆107 · Updated 5 months ago
- ☆72 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆166 · Updated 3 months ago
- The evaluation framework for training-free sparse attention in LLMs ☆101 · Updated last week
- ☆69 · Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆17 · Updated 11 months ago
- ☆32 · Updated last year
- ☆107 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆130 · Updated 10 months ago
- Official PyTorch implementation and models for the paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆101 · Updated last month
- 📄 Small Batch Size Training for Language Models ☆63 · Updated 2 weeks ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆44 · Updated last year
- PyTorch library for Active Fine-Tuning ☆93 · Updated 3 weeks ago
- ☆83 · Updated last year
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆134 · Updated 2 weeks ago
- Mamba support for transformer lens ☆18 · Updated last year
- Code for studying the super weight in LLM ☆120 · Updated 10 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆31 · Updated 2 years ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆177 · Updated 6 months ago
- Fluid Language Model Benchmarking ☆19 · Updated last month
- ☆50 · Updated last year
- ☆12 · Updated 11 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆184 · Updated last year
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆116 · Updated last year