Aaronhuang-778 / Mixture-Compressor-MoE

[ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More

☆66

Alternatives and similar repositories for Mixture-Compressor-MoE

Users who are interested in Mixture-Compressor-MoE are comparing it to the repositories listed below.

imagination-research / EEP
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
☆23 · Nov 11, 2025 · Updated 3 months ago
CVMI-Lab / SyncOOD
(ECCV 2024) Can OOD Object Detectors Learn from Foundation Models?
☆25 · Dec 7, 2024 · Updated last year
CVMI-Lab / ResKD
[NeurIPS 2022] Official implementation of the paper "Rethinking Resolution in the Context of Efficient Video Recognition".
☆31 · Nov 16, 2022 · Updated 3 years ago
VincentDENGP / 3D-LR
Can 3D Vision-Language Models Truly Understand Natural Language?
☆20 · Mar 28, 2024 · Updated last year
mit-han-lab / flash-moba
☆221 · Nov 19, 2025 · Updated 2 months ago
LutingWang / HEAD
HEtero-Assists Distillation for Heterogeneous Object Detectors
☆10 · Jul 3, 2023 · Updated 2 years ago
brendel-group / clip-ood
Official code for the paper "Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?" (ICLR 2024)
☆10 · Aug 26, 2024 · Updated last year
wangitu / CherryQ
☆14 · May 21, 2024 · Updated last year
duterscmy / CD-MoE
Official PyTorch implementation of CD-MOE
☆12 · Mar 29, 2025 · Updated 10 months ago
tim-lawson / skip-middle
Learning to Skip the Middle Layers of Transformers
☆17 · Aug 7, 2025 · Updated 6 months ago
UNITES-Lab / MoE-Quantization
Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"
☆29 · Jun 30, 2025 · Updated 7 months ago
CVMI-Lab / clip-beyond-tail
(NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
☆28 · Oct 28, 2024 · Updated last year
CVMI-Lab / Hybrid-Occ-SDF
This is the official implementation of the ICCV 2023 paper "Learning A Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with…
☆11 · Dec 7, 2023 · Updated 2 years ago
maifoundations / QZO
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization
☆15 · Sep 17, 2025 · Updated 4 months ago
bytedance / AffineQuant
Official implementation of the ICLR 2024 paper AffineQuant
☆28 · Mar 30, 2024 · Updated last year
pprp / Awesome-Efficient-MoE
Efficient Mixture of Experts for LLM Paper List
☆166 · Sep 28, 2025 · Updated 4 months ago
EIT-NLP / SkipGPT
[ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". …
☆19 · Nov 17, 2025 · Updated 2 months ago
CVMI-Lab / FS3D
(NeurIPS 2022) Prototypical VoteNet for Few-Shot 3D Point Cloud Object Detection
☆60 · Jan 3, 2023 · Updated 3 years ago
microsoft / SeerAttention
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
☆192 · Sep 23, 2025 · Updated 4 months ago
VAST-AI-Research / Deformable-Radial-Kernel-Splatting
[CVPR 2025] Code for Deformable Radial Kernel Splatting
☆199 · May 20, 2025 · Updated 8 months ago
nasosger / MuToR
[NeurIPS '25] Multi-Token Prediction Needs Registers
☆26 · Dec 14, 2025 · Updated 2 months ago
WujiangXu / EPO
The code for the paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning"
☆37 · Oct 1, 2025 · Updated 4 months ago
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆67 · Apr 15, 2024 · Updated last year
mit-han-lab / Quest
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
☆372 · Jul 10, 2025 · Updated 7 months ago
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆172 · Nov 26, 2025 · Updated 2 months ago
Aaronhuang-778 / SliM-LLM
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆51 · Aug 9, 2024 · Updated last year
CVMI-Lab / IST-Net
(ICCV 2023) IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation
☆119 · Dec 7, 2023 · Updated 2 years ago
mdy666 / Qwen-Native-Sparse-Attention
qwen-nsa
☆87 · Oct 14, 2025 · Updated 4 months ago
feifeibear / ChituAttention
Quantized Attention on GPU
☆44 · Nov 22, 2024 · Updated last year
zjunlp / ModelKinship
Exploring Model Kinship for Merging Large Language Models
☆27 · Apr 16, 2025 · Updated 9 months ago
XLearning-SCU / 2020-NeurIPS-CLEARER
☆18 · Nov 14, 2020 · Updated 5 years ago
yangyifei729 / LaCo
Official implementation for LaCo (EMNLP 2024 Findings)
☆21 · Oct 3, 2024 · Updated last year
jy-yuan / KIVI
[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
☆356 · Nov 20, 2025 · Updated 2 months ago
Dao-AILab / grouped-latent-attention
☆131 · May 29, 2025 · Updated 8 months ago
ZHITENGLI / ARB-LLM
[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models
☆28 · Aug 5, 2025 · Updated 6 months ago
Intelligent-Computing-Lab-Panda / TesseraQ
☆25 · Oct 31, 2024 · Updated last year
shadowpa0327 / Palu
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
☆155 · Feb 20, 2025 · Updated 11 months ago
IST-DASLab / MoE-Quant
Code for data-aware compression of DeepSeek models
☆70 · Dec 11, 2025 · Updated 2 months ago
song-wx / SIFT
[ICML 2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely
☆24 · Jun 26, 2024 · Updated last year
