parameterlab / dr-llm

Source code of "Dr.LLM: Dynamic Layer Routing in LLMs"

☆39

Alternatives and similar repositories for dr-llm

Users who are interested in dr-llm are comparing it to the repositories listed below

ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated last year
tianyi-lab / C3PO
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
☆18 Updated 7 months ago
facebookresearch / mexma
MEXMA: Token-level objectives improve sentence representations
☆42 Updated 10 months ago
ryokamoi / llm-self-correction-papers
List of papers on Self-Correction of LLMs.
☆80 Updated 10 months ago
locuslab / scaling_laws_data_filtering
☆65 Updated last year
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆57 Updated last week
kyegomez / Reka-Torch
Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch
☆29 Updated this week
mlfoundations / dataset2metadata
☆27 Updated last year
xhan77 / in-context-alignment
In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning
☆35 Updated 2 years ago
prateeky2806 / ComPEFT
☆26 Updated last year
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆44 Updated last month
srush / LLM-Talk
☆52 Updated last year
amazon-science / controllable-readability-summarization
Generating Summaries with Controllable Readability Levels (EMNLP 2023)
☆14 Updated 3 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 Updated last year
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆49 Updated 2 years ago
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆79 Updated 7 months ago
GSYfate / knnlm-limits
Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"
☆24 Updated 6 months ago
hadasah / btm
☆76 Updated last year
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆40 Updated last month
giangdip2410 / HyperRouter
Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"
☆33 Updated last year
ShiZhengyan / PowerfulPromptFT
[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea…
☆76 Updated last year
BaohaoLiao / mefts
[NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
☆33 Updated 2 years ago
mistralai / mistral-evals
☆78 Updated 3 months ago
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆39 Updated last year
nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆19 Updated 9 months ago
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆55 Updated 9 months ago
GenRobo / MatMamba
Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"
☆61 Updated last year
SparkJiao / StructTest
☆19 Updated 3 months ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 Updated last year
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78 Updated last year