Unofficial implementation of "Compact Language Models via Pruning and Knowledge Distillation" (https://arxiv.org/pdf/2407.14679)
☆53 · Sep 7, 2024 · Updated last year
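For context on what the repositories below have in common: pruning-and-distillation pipelines like the one in the paper typically train a pruned student to match a teacher's temperature-softened output distribution. The following is a minimal plain-Python sketch of that soft-target (Hinton-style) distillation loss, not code from this repository; the function names and the temperature value are illustrative choices.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep a consistent magnitude across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))              # → 0.0
print(distillation_loss(teacher, [0.0, 0.0, 0.0]) > 0)  # → True
```

In practice this loss is usually combined with a standard cross-entropy term on the ground-truth labels, and the student here would be the pruned model being recovered.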
Alternatives and similar repositories for Compact-Language-Models-via-Pruning-and-Knowledge-Distillation
Users interested in Compact-Language-Models-via-Pruning-and-Knowledge-Distillation are comparing it to the repositories listed below.
- Learn LangChain for Generative AI with OpenAI API using Python ☆11 · Feb 15, 2024 · Updated 2 years ago
- (ECCV 2022) EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs ☆12 · Sep 15, 2022 · Updated 3 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 29, 2025 · Updated 11 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 7 months ago
- Code to reproduce the experiments of the ICLR 2024 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging" ☆12 · Oct 14, 2025 · Updated 4 months ago
- KDSS, a framework for knowledge distillation from LLMs ☆12 · Nov 5, 2025 · Updated 3 months ago
- A tiny, easily hackable implementation of a feature dashboard ☆15 · Oct 21, 2025 · Updated 4 months ago
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression ☆72 · Mar 25, 2025 · Updated 11 months ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆41 · Sep 9, 2025 · Updated 5 months ago
- [ICCV W] Contextual Convolutional Neural Networks (https://arxiv.org/pdf/2108.07387.pdf) ☆14 · Aug 18, 2021 · Updated 4 years ago
- Benchmarks for Macro Neural Architecture Search; used and described in the paper "Local Search is a Remarkably Strong Baseline for Neural… ☆12 · Jul 25, 2024 · Updated last year
- ☆20 · Aug 16, 2021 · Updated 4 years ago
- Reproduction of "AM-LFS: AutoML for Loss Function Search" ☆14 · May 20, 2020 · Updated 5 years ago
- [NeurIPS 2024] Search for Efficient LLMs ☆16 · Jan 16, 2025 · Updated last year
- ViT architecture with Mamba instead of a transformer backbone ☆18 · Dec 8, 2023 · Updated 2 years ago
- An implementation of "Group Fisher Pruning for Practical Network Compression" based on PyTorch and mmcv ☆18 · Nov 21, 2021 · Updated 4 years ago
- GRAIN: Gradient-based Intra-attention Pruning on Pre-trained Language Models ☆19 · Jul 12, 2023 · Updated 2 years ago
- Source code for "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs" ☆43 · Aug 14, 2024 · Updated last year
- ☆26 · Apr 12, 2022 · Updated 3 years ago
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity (ACL 2025, oral) ☆30 · Jun 14, 2025 · Updated 8 months ago
- Easy-to-use and flexible AutoML library for Python ☆28 · Jan 24, 2026 · Updated last month
- ☆73 · Dec 16, 2025 · Updated 2 months ago
- MLPNAS code for the Paperspace series on Neural Architecture Search ☆23 · May 29, 2023 · Updated 2 years ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆29 · Jun 30, 2025 · Updated 8 months ago
- ☆22 · Mar 22, 2025 · Updated 11 months ago
- Teeth segmentation using PyTorch and MONAI ☆25 · Mar 23, 2023 · Updated 2 years ago
- [TMLR] Official PyTorch implementation of "Efficient Quantization-aware Training with Adaptive Coreset Selection" ☆37 · Aug 20, 2024 · Updated last year
- Official repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆67 · Mar 27, 2025 · Updated 11 months ago
- Efficient Mixture of Experts for LLM paper list ☆168 · Sep 28, 2025 · Updated 5 months ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆33 · May 9, 2024 · Updated last year
- R Ultimate 2023: R for Data Science and Machine Learning, by Packt Publishing ☆15 · Dec 15, 2025 · Updated 2 months ago
- [ICCV 2023] EMQ: Evolving Training-free Proxies for Automated Mixed-Precision Quantization ☆28 · Dec 6, 2023 · Updated 2 years ago
- ☆40 · Nov 22, 2025 · Updated 3 months ago
- Official implementation of "PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search" ☆31 · Sep 5, 2023 · Updated 2 years ago
- Encodings for neural architecture search ☆29 · Apr 5, 2021 · Updated 4 years ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 · May 28, 2025 · Updated 9 months ago
- ☆29 · Jun 7, 2024 · Updated last year
- Attention module architecture search for nuclei semantic segmentation and classification ☆35 · Jun 29, 2025 · Updated 8 months ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆30 · Mar 28, 2024 · Updated last year