lliai/D2MoE


lliai / D2MoE

D^2-MoE: Delta Decompression for MoE-based LLMs Compression

☆82

Alternatives and similar repositories for D2MoE

Users who are interested in D2MoE are comparing it to the repositories listed below.

pprp / Pruner-Zero
[ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆100 · Nov 25, 2024 · Updated last year
pprp / STBLLM
[ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
☆20 · Jun 3, 2025 · Updated last year
BIT-DA / ABS
[ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection
☆27 · Jun 27, 2025 · Updated last year
ZHITENGLI / ARB-LLM
[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models
☆31 · Aug 5, 2025 · Updated 11 months ago
imagination-research / EEP
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
☆25 · Nov 11, 2025 · Updated 8 months ago
Aaronhuang-778 / Mixture-Compressor-MoE
[ICLR 2025, IEEE TPAMI 2026] Mixture Compressor & MC#
☆75 · Feb 12, 2025 · Updated last year
luuyin / OWL
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆81 · Jul 7, 2025 · Updated last year
inclusionAI / MoBE
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
☆37 · Dec 24, 2025 · Updated 7 months ago
GiantAILab / DeepSound-V1
Official code for DeepSound-V1
☆12 · May 14, 2025 · Updated last year
ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i…
☆45 · Jul 10, 2025 · Updated last year
wazenmai / HC-SMoE
[ICML 2025] Retraining-Free Merging of Sparse MoE via Hierarchical Clustering
☆25 · Oct 26, 2025 · Updated 8 months ago
XIANGLONGYAN / PBS2P
PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"
☆13 · Jul 11, 2026 · Updated last week
RuiHuangNUS / MARS-Reconfig
[ICRA 2025] Robust Self-Reconfiguration for Fault-Tolerant Control of Modular Aerial Robot Systems
☆28 · Jun 9, 2025 · Updated last year
JL-Cheng / SERE
[ICLR 2026] SERE: Similarity-Based Expert Re-routing for Efficient Batch Decoding in MoE Models
☆18 · Feb 4, 2026 · Updated 5 months ago
Lucky-Lance / Expert_Sparsity
[ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
☆123 · May 24, 2024 · Updated 2 years ago
yangyifei729 / LaCo
Official implementation for LaCo (EMNLP 2024 Findings)
☆22 · Oct 3, 2024 · Updated last year
GATECH-EIC / ShiftAddViT
[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Dec 6, 2023 · Updated 2 years ago
haoyang9804 / Erwin
A random Solidity program generator.
☆134 · Jan 4, 2026 · Updated 6 months ago
SempraETY / Pruning-via-Merging
☆23 · Nov 26, 2024 · Updated last year
Ther-nullptr / circult-eda-mlsys-tinyml-arxiv-daily
🎓 Automatically Update circult-eda-mlsys-tinyml Papers Daily using GitHub Actions (Update Every 8 Hours)
☆10 · Updated this week
UNITES-Lab / C2R-MoE
[NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…
☆16 · Feb 4, 2025 · Updated last year
cornell-zhang / llm-datatypes
Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
☆27 · Jun 25, 2024 · Updated 2 years ago
NiceRingNode / PartialConvolution
An unofficial re-implementation of the article "[ECCV 18] Image Inpainting for Irregular Holes Using Partial Convolutions"
☆12 · Mar 1, 2025 · Updated last year
lliai / Auto-Prox-AAAI24
Auto-Prox-AAAI24
☆14 · Apr 30, 2024 · Updated 2 years ago
IST-DASLab / Sparse-Marlin
Boosting 4-bit inference kernels with 2:4 Sparsity
☆96 · Sep 4, 2024 · Updated last year
MILab-PKU / MVAR
Official implementation of "Auto-Regressively Generating Multi-View Consistent Images" (ICCV 2025)
☆88 · Jul 26, 2025 · Updated 11 months ago
SNU-ARC / DecDEC
[OSDI 2025] DecDEC: A Systems Approach to Advancing Low‑Bit LLM Quantization
☆26 · Jan 29, 2026 · Updated 5 months ago
JaydenLyh / SmPO
[ICML 2025] Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
☆30 · Jun 29, 2025 · Updated last year
pprp / CVPR2022-NAS-competition-Track1-3th-solution
Implementation of PGONAS for CVPR22W and RD-NAS for ICASSP23
☆23 · Apr 25, 2023 · Updated 3 years ago
lliai / DetKDS
[ICML2024] DetKDS: Knowledge Distillation Search for Object Detectors
☆19 · Jul 11, 2024 · Updated 2 years ago
NiceRingNode / LGGPT
[IJCV 2025] Smaller But Better: Unifying Layout Generation with Smaller Large Language Models
☆158 · Aug 3, 2025 · Updated 11 months ago
GiantAILab / DeepDubber-V1
DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning…
☆30 · Sep 7, 2025 · Updated 10 months ago
Susan571 / LENSLLM
This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection 🎉
☆26 · May 29, 2025 · Updated last year
Qualcomm-AI-research / gptvq
☆42 · Mar 28, 2024 · Updated 2 years ago
shaohao011 / MedCCO
[ACM MM2026] This is the official implementation of MedCCO
☆17 · Jul 12, 2026 · Updated last week
RainBowLuoCS / OpenOmni
(NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…
☆142 · May 9, 2026 · Updated 2 months ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
lliai / Teacher-free-Distillation
TF-FD
☆20 · Nov 19, 2022 · Updated 3 years ago
XIANGLONGYAN / Awesome-Visual-Autoregressive-Modeling
This repository collects Visual Autoregressive (VAR) modeling papers from 2024 to 2026 published at top-tier conferences, as well as rele…
☆21 · Mar 13, 2026 · Updated 4 months ago