Code to reproduce the experiments of the ICLR 2024 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"
☆12Oct 14, 2025Updated 4 months ago
Alternatives and similar repositories for SMS
Users who are interested in SMS are comparing it to the libraries listed below
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 11 months ago
- ☆16Sep 27, 2023Updated 2 years ago
- Official implementation of "OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging".☆43Oct 30, 2025Updated 4 months ago
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs☆23Nov 11, 2025Updated 3 months ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆29Jun 30, 2025Updated 8 months ago
- ☆29Nov 29, 2023Updated 2 years ago
- Feature Fusion for Online Mutual Knowledge Distillation Code☆27Jul 21, 2020Updated 5 years ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆33Mar 5, 2024Updated last year
- ☆10May 30, 2018Updated 7 years ago
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Mar 31, 2024Updated last year
- Common template for PyTorch projects. Easy to extend and modify for new projects.☆13Dec 13, 2022Updated 3 years ago
- Showing full TensorBoard support in Tensorflow for a CNN using MNIST data.☆13Oct 19, 2019Updated 6 years ago
- Fastai+PyTorch implementation of sparse model training methods (SET, SNFS, RigL) + customize-your-own.☆10Oct 20, 2022Updated 3 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- Storage of everything interesting we can find on the web (IMHO)☆10Aug 5, 2025Updated 6 months ago
- ☆12Jun 9, 2025Updated 8 months ago
- A collection of papers on zero-shot/data-free knowledge distillation from 2019 to 2021☆11Sep 8, 2021Updated 4 years ago
- JPEG encoding and decoding implemented from scratch (Python JPEG codec)☆10Jul 29, 2022Updated 3 years ago
- ChineseCLIP using online learning☆13Nov 7, 2022Updated 3 years ago
- ☆11Jun 30, 2018Updated 7 years ago
- ☆10Oct 28, 2019Updated 6 years ago
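- Graduate course notes, covering combinatorics, advanced algorithm design and analysis, optimization theory and applications, and big data analysis and mining.☆15Dec 17, 2023Updated 2 years ago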
- Object detection method that can simultaneously estimate the positions and depth of the objects from images☆11Feb 14, 2022Updated 4 years ago
- Use pretrained BERT model to automatically generate grammar multiple choice questions (MCQ) from any news article or story.☆13Oct 2, 2019Updated 6 years ago
- (ECCV2022) EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs☆12Sep 15, 2022Updated 3 years ago
- [AAAI-25 Oral] Adaptive Calibration☆14Jul 6, 2025Updated 7 months ago
- Code for the ICML 2021 paper "iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients"☆10May 27, 2021Updated 4 years ago
- Learn to code for NLP☆10Jul 20, 2020Updated 5 years ago
- Deploy Detectron2 with Triton inference server☆10Aug 15, 2022Updated 3 years ago
- Nano vLLM☆12Jun 26, 2025Updated 8 months ago
- Symbolic Graphics Programming with Large Language Models☆37Sep 14, 2025Updated 5 months ago
- ☆10Mar 2, 2024Updated last year
- Statistics and visualization of the acceptance rate and main keywords of accepted papers at CVPR 2023, the main computer vision conference☆12May 4, 2023Updated 2 years ago
- Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".☆21May 23, 2025Updated 9 months ago
- Vectorgraph Image Painter☆12Mar 24, 2019Updated 6 years ago
- ☆10Oct 7, 2019Updated 6 years ago
- This project shows how to analyze the influences of object characteristics on detection performance and the frequency and impact of diffe…☆41May 4, 2017Updated 8 years ago
- Evolve diffusion models by merging.☆13Jun 15, 2024Updated last year
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆14Feb 4, 2025Updated last year