Official PyTorch implementation of CD-MOE
☆12Mar 18, 2026Updated this week
Alternatives and similar repositories for CD-MoE
Users that are interested in CD-MoE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs☆23Nov 11, 2025Updated 4 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated 11 months ago
- Official Pytorch Implementation of "Outlier-weighed Layerwise Sampling for LLM Fine-tuning" by Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei …☆35Jun 3, 2025Updated 9 months ago
- Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"☆81Jul 7, 2025Updated 8 months ago
- Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"☆12Oct 14, 2025Updated 5 months ago
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation☆12Jul 31, 2023Updated 2 years ago
- [Neurips 2021] Sparse Training via Boosting Pruning Plasticity with Neuroregeneration☆31Feb 11, 2023Updated 3 years ago
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization☆25Oct 5, 2025Updated 5 months ago
- NN1 network from FaceNet: A Unified Embedding for Face Recognition and Clustering, in Keras.☆11Jun 13, 2017Updated 8 years ago
- This is the official code for UGTs.☆13Feb 8, 2023Updated 3 years ago
- [ICML 2021] "Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training" by Shiwei Liu, Lu Yin, De…☆45Nov 11, 2023Updated 2 years ago
- Code for testing DCT plus Sparse (DCTpS) networks☆14Jun 15, 2021Updated 4 years ago
- ☆17Dec 9, 2024Updated last year
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆30Jun 30, 2025Updated 8 months ago
- Comparison of method "Pruning at initialization prior to training" (Synflow/SNIP/GraSP) in PyTorch☆17May 12, 2024Updated last year
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression☆73Mar 25, 2025Updated 11 months ago
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models☆28Aug 5, 2025Updated 7 months ago
- [ICCV 2021] Code release for "Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks"☆32Jul 24, 2022Updated 3 years ago
- python越南语分词器☆10Nov 14, 2019Updated 6 years ago
- Randomized algorithm class at CU☆15Jul 8, 2025Updated 8 months ago
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".☆89Feb 28, 2026Updated 3 weeks ago
- ☆42Oct 31, 2024Updated last year
- Image retrieval with triplet loss☆17Jun 20, 2018Updated 7 years ago
- ☆30Jul 22, 2024Updated last year
- Basic image segmentation & compression using K-Means clustering☆14Feb 28, 2019Updated 7 years ago
- Code and Data for the ACL21 paper "Modeling Bilingual Conversational Characteristics for Neural Chat Translation"☆12Dec 17, 2021Updated 4 years ago
- Pytorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"☆13Jun 7, 2023Updated 2 years ago
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"☆38Aug 20, 2024Updated last year
- Spatial Spectral Machine Learning☆14Oct 15, 2025Updated 5 months ago
- ☆10Mar 2, 2024Updated 2 years ago
- [ICLR 2022] "Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity" by Shiwei Liu,…☆27Jun 15, 2022Updated 3 years ago
- Code for reproducing "AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks" (NeurIPS 2021)☆23Nov 9, 2021Updated 4 years ago
- ☆16Sep 27, 2023Updated 2 years ago
- ☆12Jul 6, 2022Updated 3 years ago
- ☆14Feb 2, 2021Updated 5 years ago
- Official Pytorch Implementation of Unsupervised Representation Learning for Binary Networks by Joint Classifier Training (CVPR 2022)☆11Apr 10, 2022Updated 3 years ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- [ICLR 2025, IEEE TPAMI 2026] Mixture Compressor & MC#☆69Feb 12, 2025Updated last year