TalSchuster / CATs
Confident Adaptive Transformers
☆12 · Updated 3 years ago
Alternatives and similar repositories for CATs:
Users interested in CATs are comparing it to the repositories listed below.
- Repo for ICML 2023 "Why do Nearest Neighbor Language Models Work?" ☆56 · Updated 2 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 · Updated last year
- Variable-order CRFs with structure learning ☆16 · Updated 7 months ago
- Combining encoder-based language models ☆11 · Updated 3 years ago
- ☆13 · Updated last year
- CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing (ACL 2022) ☆9 · Updated 2 years ago
- ☆44 · Updated 4 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Updated 2 years ago
- ☆11 · Updated 3 months ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper ☆14 · Updated 3 years ago
- A study of the downstream instability of word embeddings ☆12 · Updated 2 years ago
- ☆20 · Updated 2 years ago
- Staged Training for Transformer Language Models ☆32 · Updated 3 years ago
- ☆12 · Updated 3 years ago
- A package for fine-tuning pretrained NLP transformers using semi-supervised learning ☆15 · Updated 3 years ago
- ☆33 · Updated last year
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆72 · Updated 3 years ago
- Query-focused summarization data ☆41 · Updated 2 years ago
- lanmt ebm ☆11 · Updated 4 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆30 · Updated 3 years ago
- Implementation of the paper "Sentence Bottleneck Autoencoders from Transformer Language Models" ☆17 · Updated 3 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Updated 2 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Updated last year
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing ☆14 · Updated 2 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆60 · Updated 2 years ago
- ☆29 · Updated 2 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Updated 2 years ago
- PyTorch implementation of NeurIPS 2020 paper "Learning Sparse Prototypes for Text Generation" ☆22 · Updated 3 years ago
- Code for gradient rollback, which explains predictions of neural matrix factorization models, as for example used for knowledge base comp… ☆21 · Updated 4 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Updated 3 years ago