☆20Dec 16, 2020Updated 5 years ago
Alternatives and similar repositories for maskbert
Users that are interested in maskbert are comparing it to the libraries listed below
Sorting:
- Code for the paper "REV: Information-Theoretic Evaluation of Free-Text Rationales"☆16Aug 11, 2023Updated 2 years ago
- IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization☆12Nov 23, 2021Updated 4 years ago
- ☆14Apr 8, 2021Updated 4 years ago
- “Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022)☆16Feb 2, 2023Updated 3 years ago
- AN EFFICIENT AND GENERAL FRAMEWORK FOR LAYERWISE-ADAPTIVE GRADIENT COMPRESSION☆14Oct 27, 2023Updated 2 years ago
- Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT☆17Aug 15, 2023Updated 2 years ago
- Understanding attention for text classification☆17Nov 27, 2020Updated 5 years ago
- A Framework aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining☆18Nov 26, 2023Updated 2 years ago
- [Neurips 2022] “ Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropogation”, Ziyu Jiang*, Xuxi Chen*, Xueqin Huan…☆19Mar 14, 2023Updated 2 years ago
- ☆20Mar 30, 2022Updated 3 years ago
- Source code for the ACL-IJCNLP 2021 paper entitled "T-DNA: Taming Pre-trained Language Models with N-gram Representations for Low-Resourc…☆19Jan 12, 2023Updated 3 years ago
- Deep Weighted Averaging Classifiers☆23Feb 4, 2019Updated 7 years ago
- [NeurIPS 2022] "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models", Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li…☆21Jan 9, 2024Updated 2 years ago
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe…☆15Oct 18, 2022Updated 3 years ago
- Codebase for running (conditional) probing experiments☆22Nov 13, 2022Updated 3 years ago
- Code for ACL2022 publication Transkimmer: Transformer Learns to Layer-wise Skim☆22Aug 21, 2022Updated 3 years ago
- ☆20Oct 12, 2021Updated 4 years ago
- ☆24Sep 2, 2024Updated last year
- [EMNLP 2023]Context Compression for Auto-regressive Transformers with Sentinel Tokens☆25Nov 6, 2023Updated 2 years ago
- The Codebase for Causal Distillation for Language Models (NAACL '22)☆26May 1, 2022Updated 3 years ago
- Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021)☆28Mar 26, 2022Updated 3 years ago
- This is the public github for our paper "Transformer with a Mixture of Gaussian Keys"☆28Aug 13, 2022Updated 3 years ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization☆32Jan 7, 2026Updated last month
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 3 years ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Oct 27, 2023Updated 2 years ago
- Code and data for "Retrieval Enhanced Model for Commonsense Generation" (ACL-IJCNLP 2021).☆29Dec 31, 2021Updated 4 years ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer☆64Jul 30, 2023Updated 2 years ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆31Apr 8, 2024Updated last year
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.☆30Nov 5, 2021Updated 4 years ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆33Jun 2, 2023Updated 2 years ago
- ☆88Dec 5, 2021Updated 4 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)☆34Aug 6, 2023Updated 2 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021☆33Jun 9, 2021Updated 4 years ago
- Source code for EMNLP 2020 paper "Coreferential Reasoning Learning for Language Representation"☆69Sep 6, 2022Updated 3 years ago
- Parameter Efficient Transfer Learning with Diff Pruning☆75Feb 3, 2021Updated 5 years ago
- Code of paper Exploring Task Difficulty for Few-Shot Relation Extraction. https://arxiv.org/abs/2109.05473☆35Sep 12, 2021Updated 4 years ago
- Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment"☆31Nov 24, 2021Updated 4 years ago
- Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022☆15Apr 17, 2023Updated 2 years ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆39Jun 11, 2025Updated 8 months ago