gorokoba560 / norm-analysis-of-transformer
☆77 Updated 7 months ago
Related projects
Alternatives and complementary repositories for norm-analysis-of-transformer
- ☆28 Updated 3 years ago
- Measuring the Mixing of Contextual Information in the Transformer ☆25 Updated last year
- ☆24 Updated 3 years ago
- code associated with ACL 2021 DExperts paper ☆113 Updated last year
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h… ☆81 Updated 3 years ago
- ☆103 Updated 2 years ago
- On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification ☆30 Updated last year
- DEMix Layers for Modular Language Modeling ☆53 Updated 3 years ago
- ☆42 Updated 10 months ago
- ☆42 Updated 3 years ago
- Rationales for Sequential Predictions ☆40 Updated 2 years ago
- ☆42 Updated last year
- This is a repository with the code for the EMNLP 2020 paper "Information-Theoretic Probing with Minimum Description Length" ☆69 Updated 3 months ago
- ☆20 Updated 4 years ago
- Official Code for the papers: "Controlled Text Generation as Continuous Optimization with Multiple Constraints" and "Gradient-based Const… ☆59 Updated 8 months ago
- Automatic metrics for GEM tasks ☆61 Updated 2 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆59 Updated 4 years ago
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers" ☆57 Updated last year
- Code for Editing Factual Knowledge in Language Models ☆135 Updated 2 years ago
- ☆37 Updated 3 years ago
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ☆31 Updated 3 years ago
- ☆95 Updated 2 years ago
- ☆87 Updated 2 years ago
- Code accompanying our papers on the "Generative Distributional Control" framework ☆117 Updated last year
- ☆58 Updated 2 years ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆84 Updated 2 years ago
- ☆33 Updated 3 years ago
- Implementation for https://arxiv.org/abs/2005.00652 ☆27 Updated last year
- Pytorch implementation of DiffMask ☆55 Updated last year
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆85 Updated 3 years ago