ZunhaiSu/Super-Experts-Profilling

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ZunhaiSu/Super-Experts-Profilling)

ZunhaiSu / Super-Experts-Profilling

(ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models

☆44

Alternatives and similar repositories for Super-Experts-Profilling

Users that are interested in Super-Experts-Profilling are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

MrZilinXiao / ProxyThinker
View on GitHub
[ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.
☆22Sep 24, 2025Updated 10 months ago
foreverlasting1202 / QuestA
View on GitHub
☆22Jan 2, 2026Updated 6 months ago
tianyi-lab / RoMA
View on GitHub
Code for "Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs"
☆19Nov 6, 2025Updated 8 months ago
pprp / Awesome-Efficient-MoE
View on GitHub
Efficient Mixture of Experts for LLM Paper List
☆184Jun 16, 2026Updated last month
shangshang-wang / Resa
View on GitHub
Resa: Transparent Reasoning Models via SAEs
☆50Sep 23, 2025Updated 10 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
BryceZhuo / HybridNorm
View on GitHub
The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
☆19Mar 7, 2025Updated last year
YUECHE77 / SPIN
View on GitHub
[EMNLP 2025 Main Conference] Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression
☆16Dec 26, 2025Updated 6 months ago
jinhangzhan / RL_Heals_SFT
View on GitHub
☆21Mar 22, 2026Updated 4 months ago
alexmartin1722 / wikivideo
View on GitHub
WikiVideo: Article Generation from Multiple Videos
☆15Nov 14, 2025Updated 8 months ago
UNITES-Lab / MoE-Quantization
View on GitHub
Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"
☆31Jun 30, 2025Updated last year
montehoover / DynaGuard
View on GitHub
Code for "DynaGuard: A Dynamic Guardrail Model With User-Defined Policies."
☆23Nov 3, 2025Updated 8 months ago
soyoung97 / AcuRank
View on GitHub
☆15Jul 30, 2025Updated 11 months ago
bigai-nlco / RouterLens
View on GitHub
[EMNLP 2025] RouterLens
☆29Sep 15, 2025Updated 10 months ago
qiuzh20 / RMoE
View on GitHub
Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
☆33Aug 4, 2024Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
Jiahao004 / DeepTheorem
View on GitHub
☆27Jun 10, 2025Updated last year
Yifei-Y / Openset-RCNN
View on GitHub
This repository contains the code for the IEEE Robotics and Automation Letters paper "Open-Set Object Detection Using Classification-Free…
☆16Dec 6, 2023Updated 2 years ago
Xtra-Computing / LLM-DNA
View on GitHub
[ICLR'26 Oral] LLM DNA: Tracing Model Evolution via Functional Representations
☆32Apr 8, 2026Updated 3 months ago
UNITES-Lab / MC-SMoE
View on GitHub
[ICLR‘24 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
☆108Jun 20, 2025Updated last year
hanningzhang / ER-PRM
View on GitHub
☆20Dec 14, 2024Updated last year
HypherX / Evolution-Analysis
View on GitHub
☆25Dec 13, 2024Updated last year
Zanette-Labs / speed-rl
View on GitHub
☆18Feb 2, 2026Updated 5 months ago
inclusionAI / MoBE
View on GitHub
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
☆37Dec 24, 2025Updated 7 months ago
czp16 / Bridge-LLM-reasoning
View on GitHub
Behavior Injection: Preparing Language Models for Reinforcement Learning (NeurIPS 2025)
☆17Jul 1, 2025Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
yaof20 / Flash-RL
View on GitHub
Implementation for FP8/INT8 Rollout for RL training without performence drop.
☆306Nov 7, 2025Updated 8 months ago
UNITES-Lab / C2R-MoE
View on GitHub
[NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…
☆16Feb 4, 2025Updated last year
idiap / dialog2flow
View on GitHub
Dialog2Flow: convert your dialogs to flows. This repository accompanies the paper "Dialog2Flow: Pre-training Soft-Contrastive Sentence Em…
☆20Jul 1, 2025Updated last year
lliai / D2MoE
View on GitHub
D^2-MoE: Delta Decompression for MoE-based LLMs Compression
☆82Mar 25, 2025Updated last year
cvsp-lab / AgilePruner
View on GitHub
[ICLR 2026] AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models
☆28Mar 3, 2026Updated 4 months ago
yushuiwx / MH-MoE
View on GitHub
☆20Nov 5, 2024Updated last year
AngelaZZZ-611 / reasoning_models_probing
View on GitHub
☆22May 14, 2026Updated 2 months ago
meituan-longcat / R-HORIZON
View on GitHub
[ICLR'26] R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
☆27May 9, 2026Updated 2 months ago
ipsitmantri / DiTASK
View on GitHub
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations (CVPR 2025)
☆14Jun 1, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
OpenMOSS / Lorsa
View on GitHub
☆30Nov 9, 2025Updated 8 months ago
slm-mux / SLM-MUX
View on GitHub
☆25Mar 26, 2026Updated 4 months ago
vincentamato / mlx-esm-2
View on GitHub
An MLX implementation of Meta AI's ESM-2 protein language model
☆16Aug 16, 2025Updated 11 months ago
ars22 / e3
View on GitHub
☆20Sep 16, 2025Updated 10 months ago
belindal / state-tracking
View on GitHub
Code and data for paper "(How) do Language Models Track State?"
☆26Mar 31, 2025Updated last year
youngwanLEE / holisafe
View on GitHub
[CVPR Findings 2026] HoliSafe: Holistic Safety Benchmarking and Modeling for Vision-Language Model
☆17Mar 8, 2026Updated 4 months ago
zzh-SJTU / CRT-QA
View on GitHub
The official data and code for EMNLP 2023 main conference paper: CRT-QA: A Dataset of Complex Reasoning Question Answering over Tabular D…
☆13May 19, 2025Updated last year