chandar-lab / EfficientLLMs
☆17 · Updated 9 months ago

Alternatives and similar repositories for EfficientLLMs
Users interested in EfficientLLMs are comparing it to the libraries listed below.
- ACL 2023 ☆39 · Updated last year
- ☆51 · Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆26 · Updated last year
- Flexible simulator for mixed precision and format simulation of LLMs and vision transformers. ☆49 · Updated last year
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆46 · Updated last year
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆28 · Updated last year
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The truth is rarely pure and never simple. ☆24 · Updated 3 weeks ago
- Unit Scaling demo and experimentation code ☆16 · Updated last year
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆41 · Updated last year
- ☆37 · Updated 8 months ago
- Low-Rank Llama Custom Training ☆22 · Updated last year
- Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs ☆81 · Updated 5 months ago
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆36 · Updated 7 months ago
- The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L… ☆47 · Updated 2 years ago
- ☆24 · Updated 6 months ago
- Scaling Sparse Fine-Tuning to Large Language Models ☆16 · Updated last year
- ☆25 · Updated last year
- ☆12 · Updated 8 months ago
- Here we will test various linear attention designs. ☆60 · Updated last year
- ☆15 · Updated last year
- ☆36 · Updated 6 months ago
- ☆28 · Updated 9 months ago
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models ☆49 · Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆51 · Updated 2 years ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆58 · Updated last month
- ☆54 · Updated 2 weeks ago
- GPU operators for sparse tensor operations ☆32 · Updated last year
- Official Code for Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM ☆14 · Updated last year
- Activation-aware Singular Value Decomposition for Compressing Large Language Models ☆66 · Updated 6 months ago
- [ICML 2022] "Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets" by Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wa… ☆32 · Updated 2 years ago