lvyufeng / easy_mindspore_bk
☆18 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for easy_mindspore_bk
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models ☆37 · Updated 10 months ago
- ☆14 · Updated last year
- A Tight-fisted Optimizer ☆47 · Updated last year
- The official implementation of the ICML 2023 paper OFQ-ViT ☆27 · Updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆35 · Updated 8 months ago
- [KDD'22] Learned Token Pruning for Transformers ☆93 · Updated last year
- MindSpore implementation of transformers ☆65 · Updated last year
- The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L… ☆46 · Updated 2 years ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆42 · Updated last year
- PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training" ☆11 · Updated last year
- Implementation of Denoising Diffusion Probabilistic Model in MindSpore ☆32 · Updated last year
- ☆35 · Updated 2 years ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆36 · Updated 3 weeks ago
- Official PyTorch implementation of IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact ☆32 · Updated 5 months ago
- ☆59 · Updated 4 months ago
- Summary of system papers/frameworks/codes/tools on training or serving large model ☆56 · Updated 10 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆102 · Updated last year
- [ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Li… ☆48 · Updated 11 months ago
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆167 · Updated last year
- This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT. ☆84 · Updated last year
- Unified Normalization (ACM MM'22) By Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, and Shiliang P… ☆34 · Updated last year
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆81 · Updated 5 months ago
- ☆13 · Updated last week
- Official PyTorch implementation of FlatQuant: Flatness Matters for LLM Quantization ☆59 · Updated this week
- Lion and Adam optimization comparison ☆56 · Updated last year
- The official implementation of the NeurIPS 2022 paper Q-ViT. ☆82 · Updated last year
- Source code for IJCAI 2022 Long paper: Parameter-Efficient Sparsity for Large Language Models Fine-Tuning. ☆13 · Updated 2 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆191 · Updated last year
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models ☆184 · Updated 6 months ago
- PyTorch codes for "LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning" ☆230 · Updated last year