Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking".
☆47Jul 12, 2024Updated last year
Alternatives and similar repositories for ExCP
Users that are interested in ExCP are comparing it to the libraries listed below
- ☆27Jul 11, 2024Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)☆67Mar 27, 2025Updated 11 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- ☆51Mar 2, 2024Updated 2 years ago
- ☆12Sep 1, 2023Updated 2 years ago
- This is the implementation of LeCo☆31Jan 20, 2025Updated last year
- ☆15Sep 24, 2023Updated 2 years ago
- Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy"☆15Mar 6, 2025Updated 11 months ago
- ☆13Jun 26, 2024Updated last year
- [SIGIR 2024] This is the official PyTorch implementation for the paper: "EulerFormer: Sequential User Behavior Modeling with Complex Vect…☆17Oct 5, 2024Updated last year
- A reproduction of "AM-LFS: AutoML for Loss Function Search"☆14May 20, 2020Updated 5 years ago
- [NeurIPS 2024] Search for Efficient LLMs☆16Jan 16, 2025Updated last year
- SuperCLUE automated machine grading system for Gaokao (college entrance exam) essays☆17Jun 8, 2023Updated 2 years ago
- [EMNLP 2023 Industry Track] A simple prompting approach that enables the LLMs to run inference in batches.☆77Mar 8, 2024Updated last year
- PyTorch code for paper: Full-Stack Filters to Build Minimum Viable CNNs☆16Sep 10, 2019Updated 6 years ago
- ☆20Nov 3, 2024Updated last year
- [ICLR 2024] Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks☆46Feb 20, 2024Updated 2 years ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆53Aug 28, 2024Updated last year
- ☆23Nov 26, 2024Updated last year
- The evaluation framework for the InfiCoder-Eval benchmark.☆21Jul 22, 2024Updated last year
- The code for the paper "QuAFL: Federated Averaging Can Be Both Asynchronous and Communication-Efficient"☆17Mar 26, 2023Updated 2 years ago
- ☆26Nov 10, 2025Updated 3 months ago
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆20Dec 10, 2024Updated last year
- [ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs☆98Nov 25, 2024Updated last year
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models☆28Aug 5, 2025Updated 6 months ago
- ☆88Jun 7, 2024Updated last year
- Source code for the paper "LongGenBench: Long-context Generation Benchmark"☆24Oct 8, 2024Updated last year
- Low-bit optimizers for PyTorch☆138Oct 9, 2023Updated 2 years ago
- Official Code for Dataset Distillation using Neural Feature Regression (NeurIPS 2022)☆48Nov 12, 2022Updated 3 years ago
- ☆25Aug 23, 2024Updated last year
- Pruning the VLLMs☆106Dec 9, 2024Updated last year
- AFPQ code implementation☆23Nov 6, 2023Updated 2 years ago
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing llms: The truth is rarely pure and never simple.☆27Apr 21, 2025Updated 10 months ago
- Accommodating Large Language Model Training over Heterogeneous Environment.☆25Mar 13, 2025Updated 11 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆28May 28, 2024Updated last year
- [arXiv 2024] I4VGen: Image as Free Stepping Stone for Text-to-Video Generation☆24Oct 6, 2024Updated last year
- ☆31Jun 12, 2024Updated last year
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs☆27Jun 25, 2024Updated last year
- ☆34Nov 26, 2025Updated 3 months ago