UW-Madison-Lee-Lab / Expressive_Power_of_LoRA

Code for "The Expressive Power of Low-Rank Adaptation".

☆20

Alternatives and similar repositories for Expressive_Power_of_LoRA

Users who are interested in Expressive_Power_of_LoRA compare it to the repositories listed below.

allenbai01 / transformers-as-statisticians
☆34Updated 2 years ago
ethz-spylab / superhuman-ai-consistency
☆30Updated 2 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27Updated last year
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59Updated 2 years ago
gregorbachmann / Next-Token-Failures
☆103Updated last year
formll / resolving-scaling-law-discrepancies
☆20Updated last year
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Updated last year
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
☆12Updated last year
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆21Updated last year
p-lambda / incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…
☆106Updated last year
cassidylaidlaw / orpo
☆19Updated 11 months ago
abhishekpanigrahi1996 / transformer_in_transformer
☆45Updated 2 years ago
BYU-PCCL / prompt-compression-contrastive-coding
Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"
☆14Updated 2 years ago
yikangshen / megablocks
☆20Updated last year
emalach / LinearLM
Code for the paper: https://arxiv.org/pdf/2309.06979.pdf
☆21Updated last year
srush / mamba-scans
Blog post
☆17Updated last year
matchten / LoRA-Models-for-SAEs
Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"
☆17Updated 6 months ago
xiamengzhou / training_trajectory_analysis
[ACL 2023]: Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf
☆25Updated last year
snap-stanford / zeroc
ZeroC is a neuro-symbolic method that, trained with elementary visual concepts and relations, can zero-shot recognize and acquire more com…
☆32Updated 2 years ago
kdu4108 / semiring-backprop-exps
☆16Updated 2 years ago
princeton-nlp / LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
☆78Updated 2 years ago
radarFudan / Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
☆18Updated 5 months ago
Silent-Zebra / twisted-smc-lm
☆32Updated 7 months ago
tianjunz / TEMPERA
☆46Updated 2 years ago
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆48Updated 7 months ago
jiahai-feng / binding-iclr
☆16Updated last year
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆18Updated last month
nathanhu0 / CaMeLS
Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.
☆25Updated last year
MaximeRobeyns / bayesian_lora
Bayesian Low-Rank Adaptation for Large Language Models
☆36Updated last year
sustcsonglin / gated_linear_attention_layer
☆31Updated last year