microsoft / mttl

Building modular LMs with parameter-efficient fine-tuning.

☆112

Alternatives and similar repositories for mttl

Users who are interested in mttl are comparing it to the libraries listed below.

prateeky2806 / ties-merging
☆192 Updated last year
bloomberg / dataless-model-merging
Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)
☆91 Updated 2 years ago
roeehendel / icl_task_vectors
☆97 Updated last year
microsoft / AdaMix
This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.1…
☆134 Updated 2 years ago
socialfoundations / tttlm
Test-time-training on nearest neighbors for large language models
☆46 Updated last year
QingruZhang / PASTA
PASTA: Post-hoc Attention Steering for LLMs
☆123 Updated 10 months ago
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆165 Updated 2 weeks ago
lmarena / PPE
☆52 Updated 4 months ago
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆180 Updated 5 months ago
mmatena / model_merging
☆75 Updated 3 years ago
microsoft / deep-language-networks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts…
☆94 Updated last year
logix-project / logix
AI Logging for Interpretability and Explainability🔬
☆128 Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆134 Updated 3 months ago
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆123 Updated last year
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆183 Updated last year
Cohere-Labs-Community / parameter-efficient-moe
☆269 Updated last year
neulab / gemini-benchmark
☆150 Updated last year
benzakenelad / BitFit
Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
☆142 Updated 3 years ago
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆47 Updated last year
lyy1994 / awesome-data-contamination
The Paper List on Data Contamination for Large Language Models Evaluation.
☆100 Updated last month
ucl-dark / llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
☆117 Updated last year
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆81 Updated 9 months ago
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆97 Updated 4 years ago
ryoungj / ObsScaling
[NeurIPS'24 Spotlight] Observational Scaling Laws
☆57 Updated last year
facebookresearch / RLCD
Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"
☆69 Updated 2 years ago
princeton-nlp / Edge-Pruning
[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
☆60 Updated last month
hadasah / btm
☆76 Updated last year
allenai / hyper-task-descriptions
Learning adapter weights from task descriptions
☆19 Updated last year
WHGTyen / BIG-Bench-Mistake
A dataset of LLM-generated chain-of-thought steps annotated with mistake location.
☆82 Updated last year