TalSchuster / CATs
Confident Adaptive Transformers
☆12 · Updated 3 years ago
Related projects:
- ☆30 · Updated 8 months ago
- Repo for the ICML 2023 paper "Why do Nearest Neighbor Language Models Work?" ☆56 · Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Updated last year
- Pretraining summarization models using a corpus of nonsense ☆13 · Updated 2 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆30 · Updated last year
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆71 · Updated 2 years ago
- The official repository for the paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization" ☆32 · Updated 2 years ago
- ☆13 · Updated this week
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆34 · Updated 9 months ago
- ☆18 · Updated last year
- ☆11 · Updated 2 years ago
- Influence Experiments ☆36 · Updated last year
- Code and data of the EMNLP 2022 paper "Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Cli… ☆12 · Updated last year
- Combining encoder-based language models ☆11 · Updated 2 years ago
- ☆21 · Updated 3 years ago
- ☆12 · Updated 2 years ago
- ☆13 · Updated last year
- A Python library for highly configurable transformers, easing model architecture search and experimentation ☆50 · Updated 2 years ago
- Learning to Model Editing Processes ☆26 · Updated 2 years ago
- ☆42 · Updated 4 years ago
- Code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model" ☆28 · Updated 2 years ago
- Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics (AAAI 2022) ☆14 · Updated 2 years ago
- Source code of the NAACL 2021 paper "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and the ACL 2021 main conferenc… ☆44 · Updated 6 months ago
- PyTorch Language Modeling Toolkit for Fast Weight Programmers ☆16 · Updated last year
- ☆18 · Updated 3 months ago
- Variable-order CRFs with structure learning ☆16 · Updated last month
- Implementation of COCO-LM (Correcting and Contrasting Text Sequences for Language Model Pretraining) in PyTorch ☆45 · Updated 3 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Updated last year
- ☆11 · Updated last year
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing ☆15 · Updated last year