parameterlab / dr-llm
Source code of "Dr.LLM: Dynamic Layer Routing in LLMs"
☆41 Updated 3 months ago
Alternatives and similar repositories for dr-llm
Users who are interested in dr-llm are comparing it to the repositories listed below.
- ☆27 Updated last year
- ☆26 Updated 2 years ago
- MEXMA: Token-level objectives improve sentence representations ☆42 Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 Updated last week
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆19 Updated 9 months ago
- List of papers on Self-Correction of LLMs. ☆80 Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆26 Updated last week
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆58 Updated last week
- Generating Summaries with Controllable Readability Levels (EMNLP 2023) ☆14 Updated 6 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆62 Updated last year
- ☆64 Updated last year
- Official repo of Progressive Data Expansion: data, code and evaluation ☆29 Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion ☆56 Updated last year
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆35 Updated 2 years ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o… ☆28 Updated 6 months ago
- ☆82 Updated 2 months ago
- A repository for research on medium sized language models. ☆77 Updated last year
- An automated data pipeline scaling RL to pretraining levels ☆73 Updated 3 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆91 Updated last year
- ☆53 Updated 2 years ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆31 Updated last year
- Code for T-MARS data filtering ☆35 Updated 2 years ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆80 Updated last year
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆37 Updated 2 years ago
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs" ☆23 Updated 9 months ago
- Project for SNARE benchmark ☆11 Updated last year
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork" ☆33 Updated 2 years ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 Updated last year