shakeley / ELBERT

☆10

Related projects:

twinkle0331 / Xcompression
[ICLR 2022] Code for the paper "Exploring Extreme Parameter Compression for Pre-trained Language Models" (https://arxiv.org/abs/2205.10036)
☆19 Updated last year
hikaru-nara / DASK
This is the official implementation of Domain-Adaptive Text Classification with Structured Knowledge from Unlabeled Data (IJCAI 2022 Long …
☆11 Updated last year
yuchaoli / PST
Source code for IJCAI 2022 Long paper: Parameter-Efficient Sparsity for Large Language Models Fine-Tuning.
☆13 Updated 2 years ago
rlin27 / DeBut
Code for the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.
☆12 Updated 2 years ago
yaozhewei / MLPruning
MLPruning, PyTorch, NLP, BERT, Structured Pruning
☆21 Updated 3 years ago
sanagno / adaptively_sparse_attention
☆16 Updated last year
lancopku / DCKD
Code and data for Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction (ECML-PKDD 22)
☆11 Updated 2 years ago
renll / SparseLT
[EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing
☆15 Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆22 Updated 3 months ago
ThomasScialom / T0_continual_learning
Adding new tasks to T0 without catastrophic forgetting
☆30 Updated last year
yifanycc / loretta
[NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models
☆19 Updated 3 weeks ago
frankaging / Causal-Distill
The Codebase for Causal Distillation for Language Models (NAACL '22)
☆25 Updated 2 years ago
schwartz-lab-NLP / papa
Code for the PAPA paper
☆27 Updated last year
Noahs-ARK / RFA
☆32 Updated 3 years ago
amodaresi / AdapLeR
☆22 Updated last year
rycolab / differentiable-subset-pruning
☆15 Updated 3 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆42 Updated last year
facebookresearch / Ternary_Binary_Transformer
ACL 2023
☆38 Updated last year
castorini / berxit
☆21 Updated 3 years ago
VITA-Group / Structure-LTH
[ICML 2022] "Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets" by Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wa…
☆30 Updated last year
ducdauge / sft-llm
Scaling Sparse Fine-Tuning to Large Language Models
☆17 Updated 7 months ago
yangalan123 / FineTuningStability
Code and data of the EMNLP 2022 paper "Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Cli…
☆12 Updated last year
shawnricecake / EdgeQAT
Official Repo for EdgeQAT
☆12 Updated 6 months ago
QingruZhang / PLATON
This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
☆39 Updated last year
htqin / BiBERT
This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.
☆81 Updated last year
wjxts / RegularizedBN
☆20 Updated last year
Raincleared-Song / ConPET
Source code for a LoRA-based continual relation extraction method.
☆10 Updated 11 months ago
LorrinWWW / SkipBERT
Code associated with the paper SkipBERT: Efficient Inference with Shallow Layer Skipping, at ACL 2022
☆15 Updated 2 years ago
VITA-Group / llm-kick
[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The truth is rarely pure and never simple.
☆15 Updated 6 months ago
kssteven418 / SqueezeLLM-gradients
☆14 Updated 7 months ago