gorokoba560 / norm-analysis-of-transformer (☆73, updated 5 months ago)

Related projects:
- Measuring the Mixing of Contextual Information in the Transformer (☆24, updated last year)
- Code associated with the ACL 2021 DExperts paper (☆109, updated last year)
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers" (☆55, updated last year)
- Code for the paper "CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP" (https://arxiv.org/abs/2104.08835) (☆102, updated 2 years ago)
- Code accompanying our papers on the "Generative Distributional Control" framework (☆117, updated last year)
- Code for the EMNLP 2020 paper "Information-Theoretic Probing with Minimum Description Length" (☆68, updated 3 weeks ago)
- DEMix Layers for Modular Language Modeling (☆51, updated 3 years ago)
- Automatic metrics for GEM tasks (☆61, updated last year)
- Code and data accompanying our ACL 2020 paper "Unsupervised Domain Clusters in Pretrained Language Models" (☆59, updated 4 years ago)
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings (☆68, updated last month)
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h…) (☆80, updated 3 years ago)
- Code for Editing Factual Knowledge in Language Models (☆134, updated 2 years ago)
- Frustratingly Simple Pretraining Alternatives to Masked Language Modeling (EMNLP 2021) (☆31, updated 2 years ago)
- Rationales for Sequential Predictions (☆40, updated 2 years ago)
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) (☆71, updated 2 years ago)
- PyTorch implementation of DiffMask (☆55, updated last year)
- Semantic parsers based on the encoder-decoder framework (☆90, updated last year)
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories" by Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… (☆80, updated 3 years ago)