CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths)

CASE-Lab-UMD / Router-Tuning-Mixture-of-Depths

The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers. (EMNLP 2025)"

☆31

Alternatives and similar repositories for Router-Tuning-Mixture-of-Depths

Users that are interested in Router-Tuning-Mixture-of-Depths are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

CASE-Lab-UMD / Capacity-Aware-MoE
View on GitHub
The official implementation of the paper "Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts" (ICLR 2026).
☆20May 31, 2026Updated last month
Shwai-He / VLM-Compression
View on GitHub
The official implementation of the paper "Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity".
☆17Jul 2, 2024Updated 2 years ago
Shwai-He / MEO
View on GitHub
The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":
☆47Feb 28, 2026Updated 4 months ago
CASE-Lab-UMD / Unified-MoE-Compression
View on GitHub
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".
☆89Feb 28, 2026Updated 4 months ago
Hai-chao-Zhang / OOSTraj
View on GitHub
[CVPR24] OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising
☆16Apr 4, 2024Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Hai-chao-Zhang / VQToken
View on GitHub
[NeurIPS 2025] Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
☆17Nov 10, 2025Updated 8 months ago
Yibin-Lei / CSQE
View on GitHub
Implementation for EACL 2024 paper "Corpus-Steered Query Expansion with Large Language Models"
☆13Mar 19, 2024Updated 2 years ago
PKU-SEC-Lab / AdapMoE
View on GitHub
Code release for AdapMoE accepted by ICCAD 2024
☆39Apr 28, 2025Updated last year
Timothyxxx / NeuralSymbolicPapers
View on GitHub
☆14Aug 18, 2022Updated 3 years ago
ziyaow1010 / vla-datasets-benchmarks
View on GitHub
A curated list of datasets and benchmarks for Vision-Language-Action (VLA) research, with a focus on evaluation protocols and practical g…
☆31Apr 28, 2026Updated 2 months ago
microsoft / SparseMixer
View on GitHub
Sparse Backpropagation for Mixture-of-Expert Training
☆30Jul 2, 2024Updated 2 years ago
ChenZiHong-Gavin / MoE-Visualizer
View on GitHub
MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.
☆16Apr 8, 2025Updated last year
DaizeDong / Easier-PS-and-SoP
View on GitHub
A LaTeX framework to handle Personal Statement (PS) and Statement of Purpose (SoP) for multiple university applications.
☆25Updated this week
guanchuwang / Taylor-Unswift
View on GitHub
☆22Oct 3, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
naturalconv / NaturalConvDataSet
View on GitHub
☆22Mar 19, 2021Updated 5 years ago
cambridge-mlg / SPVAE
View on GitHub
Tensorflow code for "Hierarchical Decompositional Mixtures of Variational Autoencoders" (ICML'19)
☆12Jun 7, 2020Updated 6 years ago
6zHAOyi / BadVision
View on GitHub
This is an official code repository for CVPR 2025 paper BadVision.
☆15Nov 18, 2025Updated 8 months ago
RAIVNLab / MatFormer-OLMo
View on GitHub
Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…
☆31Nov 14, 2023Updated 2 years ago
Trustworthy-Information-Access / LLM-Knowledge-Boundary-Perception-via-Internal-States
View on GitHub
Official code for the paper Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. The code is based on t…
☆22Aug 5, 2025Updated 11 months ago
MasterVito / DAC-RL
View on GitHub
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16Feb 26, 2026Updated 5 months ago
Yibin-Lei / MetaEOL
View on GitHub
Implementation for ACL 2024 paper "Meta-Task Prompting Elicits Embeddings from Large Language Models"
☆12Jul 25, 2024Updated 2 years ago
junhongmit / P-and-B
View on GitHub
🧠Plan-and-Budget: Training-free test-time reasoning framework for adaptive token allocation in large language models (ICLR 2026).
☆15Mar 2, 2026Updated 4 months ago
RamyaLab / pluralistic-alignment
View on GitHub
The open-source repository for PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment, which provides a general per…
☆17Aug 28, 2025Updated 10 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
ICTMCG / SDTM
View on GitHub
Official repository for "Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration", which has been …
☆17Sep 29, 2025Updated 9 months ago
VainF / Remix-DiT
View on GitHub
☆18Dec 11, 2024Updated last year
LLM360 / k2-data-prep
View on GitHub
☆21Jun 4, 2024Updated 2 years ago
preddy5 / multi_implicit_fonts
View on GitHub
☆26Oct 20, 2022Updated 3 years ago
Compositionality / compositionality-latex-template
View on GitHub
The Compositionality article class.
☆14Mar 16, 2026Updated 4 months ago
tml-epfl / long-is-more-for-alignment
View on GitHub
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]
☆21May 2, 2024Updated 2 years ago
paolobrasolin / string-diagrams
View on GitHub
Create string diagrams with LaTeX!
☆14Jan 3, 2025Updated last year
GAIR-NLP / self-improvement-reversal
View on GitHub
☆13Jul 14, 2024Updated 2 years ago
HKUNLP / hkunlp.github.io
View on GitHub
Website for HKU NLP group (under construction)
☆14Jul 6, 2026Updated 3 weeks ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
NUSTM / LLMs-Waver-In-Judgments
View on GitHub
☆12Sep 23, 2024Updated last year
ImKeTT / AutoRec-Pytorch
View on GitHub
[Tool] AutoRec (2015) PyTorch Implementation
☆10Mar 1, 2020Updated 6 years ago
LaVi-Lab / AIM
View on GitHub
[ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"
☆65Oct 9, 2025Updated 9 months ago
assafbk / DeciMamba
View on GitHub
DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)
☆32Apr 9, 2025Updated last year
ckkissane / sae-transfer
View on GitHub
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13Jul 18, 2024Updated 2 years ago
WU-CVGL / USB-NeRF
View on GitHub
[ICLR 2024] USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields
☆14Mar 24, 2024Updated 2 years ago
nullromo / compression-accelerator
View on GitHub
Implementation of the Snappy compression algorithm as a RoCC accelerator
☆12Jul 29, 2019Updated 6 years ago