Layer-wise Pruning of Transformer Heads for Efficient Language Modeling
☆22Feb 22, 2022Updated 4 years ago
Alternatives and similar repositories for Attention-Head-Pruning
Users that are interested in Attention-Head-Pruning are comparing it to the libraries listed below
- [NeurIPS 2023] Token-Scaled Logit Distillation for Ternary Weight Generative Language Models☆18Dec 6, 2023Updated 2 years ago
- This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, …☆49Sep 23, 2025Updated 5 months ago
- ☆12Jan 31, 2025Updated last year
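- FastTrack4LLM is a learning and practice framework for people studying large language models, helping them easily master the core principles and training pipeline of large models so that everyone can truly understand their internal mechanisms. The project not only fully reproduces mainstream open-source LLM architectures such as LLaMA, Qwen, and DeepSeek, but also covers the full lifecycle of large models: To…☆24Nov 6, 2025Updated 4 months ago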
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization☆13Aug 3, 2024Updated last year
- Minimalistic REST API for wake-on-lan☆10Nov 1, 2017Updated 8 years ago
- The official implementation of dLLM-Factory☆26Jul 12, 2025Updated 7 months ago
- Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification☆10May 31, 2022Updated 3 years ago
- ☆13Apr 25, 2025Updated 10 months ago
- Immutable development environments for PyTorch powered by Visual Studio Code Dev Containers☆11Feb 15, 2023Updated 3 years ago
- Revamped: Hugo+LoveIt☆10Feb 28, 2026Updated last week
- ☆19Nov 1, 2025Updated 4 months ago
- PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"☆11Dec 15, 2022Updated 3 years ago
- A LaTeX template for Bachelor or Master theses☆12Jun 10, 2022Updated 3 years ago
- 😎 Awesome papers on token redundancy reduction☆11Mar 12, 2025Updated 11 months ago
- This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs☆43Aug 14, 2024Updated last year
- ☆48Aug 7, 2023Updated 2 years ago
- [ICML2024] "FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees" by Jiaha…☆14Sep 22, 2024Updated last year
- The official implementation of "Federated Learning with Label-Masking Distillation"☆11Oct 28, 2023Updated 2 years ago
- Official implementation of DapperFL.☆13Oct 29, 2024Updated last year
- [CVPR2023] Practical Network Acceleration with Tiny Sets☆14Jul 28, 2023Updated 2 years ago
- ☆10Oct 15, 2019Updated 6 years ago
- Easy-to-use Retrieval-Enhanced Transformer implementation☆10Sep 30, 2022Updated 3 years ago
- [ACL 2025] DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues☆26Jul 10, 2025Updated 7 months ago
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- ☆10Nov 22, 2022Updated 3 years ago
- Demos of neural image editing☆11Mar 15, 2021Updated 4 years ago
- ☆12Feb 23, 2023Updated 3 years ago
- mixedbread ai python sdk☆12Jul 1, 2024Updated last year
- Wake on LAN reverse proxy☆11Sep 15, 2023Updated 2 years ago
- ☆14Oct 12, 2024Updated last year
- ☆12Dec 26, 2024Updated last year
- Source Code for KDD'19 paper "SurfCon: Synonym Discovery on Privacy-Aware Clinical Data"☆10Apr 10, 2020Updated 5 years ago
- Official repository of the paper "InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection"☆15Nov 13, 2025Updated 3 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.☆13Nov 21, 2023Updated 2 years ago
- Code for Adaptive Deep Neural Network Inference Optimization with EENet☆12Mar 28, 2024Updated last year
- Learning and rediscovering ML from total scratch☆12Aug 30, 2021Updated 4 years ago
- Machine Learning Toolbox 2☆13Nov 22, 2025Updated 3 months ago