Skhaki18 / optin-transformer-pruning
[ICLR 2024] The Need for Speed: Pruning Transformers with One Recipe
☆22 Updated 2 months ago
Related projects
Alternatives and complementary repositories for optin-transformer-pruning
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ☆50 Updated 5 months ago
- PyTorch code for Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ☆34 Updated 2 months ago
- BESA is a differentiable weight pruning technique for large language models. ☆14 Updated 8 months ago
- ☆7 Updated last month
- [ECCV 2024] Isomorphic Pruning for Vision Models ☆54 Updated 3 months ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆40 Updated last year
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆75 Updated 4 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆68 Updated 5 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆59 Updated 7 months ago
- Is gradient information useful for pruning of LLMs? ☆38 Updated 7 months ago
- [ICLR'23] Trainability Preserving Neural Pruning (PyTorch) ☆31 Updated last year
- Official PyTorch Implementation of Our Paper Accepted at ICLR 2024: Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM… ☆37 Updated 7 months ago
- TerDiT: Ternary Diffusion Models with Transformers ☆62 Updated 5 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆36 Updated 8 months ago
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… ☆55 Updated 3 months ago
- ☆20 Updated last year
- Code for T-MARS data filtering ☆35 Updated last year
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%) ☆22 Updated 8 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ☆89 Updated 8 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ☆30 Updated last year
- Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024) ☆17 Updated 9 months ago
- Code for studying the super weight in LLM ☆20 Updated last week
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆37 Updated last year
- ☆12 Updated 5 months ago
- ☆15 Updated 9 months ago
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models ☆16 Updated 5 months ago
- ☆170 Updated last month
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ☆34 Updated 3 months ago
- [ICCV 23] An approach to enhance the efficiency of Vision Transformer (ViT) by concurrently employing token pruning and token merging tech… ☆89 Updated last year
- Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight) ☆12 Updated 7 months ago