lena-voita/the-story-of-heads

lena-voita / the-story-of-heads

This is a repository with the code for the ACL 2019 paper "Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned" and the ACL 2021 paper "Analyzing Source and Target Contributions to NMT Predictions".

☆324

Alternatives and similar repositories for the-story-of-heads

Users who are interested in the-story-of-heads are comparing it to the libraries listed below.

pmichel31415 / are-16-heads-really-better-than-1
Code for the paper "Are Sixteen Heads Really Better than One?"
☆175 · Apr 1, 2020 · Updated 6 years ago
yilinyang7 / fairseq_multi_fix
Code and Data release for "Improving Multilingual Translation by Representation and Gradient Regularization" (Yang et al. EMNLP 2021), an…
☆13 · Aug 12, 2024 · Updated last year
clarkkev / attention-analysis
☆474 · Apr 4, 2021 · Updated 5 years ago
lena-voita / description-length-probing
This is a repository with the code for the EMNLP 2020 paper "Information-Theoretic Probing with Minimum Description Length"
☆75 · Aug 20, 2024 · Updated last year
Noahs-ARK / MAE
☆21 · May 5, 2020 · Updated 6 years ago
lena-voita / good-translation-wrong-in-context
This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 …
☆101 · May 12, 2020 · Updated 6 years ago
xiye17 / EvalQAExpl
Code for "Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals".
☆17 · Apr 25, 2021 · Updated 5 years ago
KaiserWhoLearns / Effective-Attention-Interpretability
Effective Attention Sheds Light On Interpretability - Findings of ACL 2021
☆11 · May 16, 2021 · Updated 5 years ago
mt-upc / transformer-contributions
Measuring the Mixing of Contextual Information in the Transformer
☆35 · May 27, 2023 · Updated 3 years ago
gorokoba560 / norm-analysis-of-transformer
☆87 · Apr 16, 2024 · Updated 2 years ago
travel-go / Abstractive-Text-Summarization
Contrastive Attention Mechanism for Abstractive Text Summarization
☆40 · Jan 14, 2020 · Updated 6 years ago
deep-spin / hallucinations-in-nmt
☆20 · Jan 16, 2024 · Updated 2 years ago
mt-upc / transformer-contributions-nmt
☆18 · Oct 6, 2022 · Updated 3 years ago
MadhumithaKannan / linear-regression-using-only-numpy
Implementation of unregularized, L1-regularized, and L2-regularized linear regression using NumPy and without sklearn
☆11 · Oct 4, 2019 · Updated 6 years ago
akashkm99 / Interpretable-Attention
Official code for the paper "Towards Transparent and Explainable Attention Models" (ACL 2020)
☆36 · Jun 22, 2022 · Updated 4 years ago
TevenLeScao / pet
This repository contains the code for "How many data points is a prompt worth?"
☆48 · Apr 7, 2021 · Updated 5 years ago
teslacool / SCA
Soft Contextual Data Augmentation
☆39 · Jul 25, 2024 · Updated last year
emorynlp / seq2seq-corenlp
☆13 · Feb 7, 2023 · Updated 3 years ago
boknilev / nmt-repr-analysis
☆38 · Apr 23, 2019 · Updated 7 years ago
MurathanKurfali / Ted-MDB-Annotations
☆16 · Jan 14, 2022 · Updated 4 years ago
VITA-Group / BERT-Tickets
[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya…
☆141 · Dec 30, 2021 · Updated 4 years ago
LiyuanLucasLiu / Transformer-Clinic
Understanding the Difficulty of Training Transformers
☆332 · May 31, 2022 · Updated 4 years ago
jungokasai / deep-shallow
☆43 · Sep 16, 2020 · Updated 5 years ago
ZurichNLP / understanding-mbr
☆17 · Apr 28, 2022 · Updated 4 years ago
jessevig / bertviz
BertViz: Visualize Attention in Transformer Models
☆8,129 · Jan 8, 2026 · Updated 6 months ago
neulab / compare-mt
A tool for holistic analysis of language generation systems
☆471 · Sep 22, 2025 · Updated 9 months ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
neulab / word-embeddings-for-nmt
Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018
☆123 · Sep 22, 2025 · Updated 9 months ago
neulab / lrlm
Code for the paper "Latent Relation Language Models" at AAAI-20.
☆41 · Sep 22, 2025 · Updated 9 months ago
shuo-git / VecConstNMT
☆25 · Oct 22, 2022 · Updated 3 years ago
stanis-morozov / prodige
Supplementary code for "Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs".
☆47 · Nov 2, 2019 · Updated 6 years ago
timvieira / vocrf
Variable-order CRFs with structure learning
☆17 · Aug 1, 2024 · Updated last year
AndreasMadsen / nlp-roar-interpretability
Measuring if attention is explanation with ROAR
☆22 · Mar 3, 2023 · Updated 3 years ago
tm4roon / survey
Survey on machine learning.
☆14 · Nov 28, 2020 · Updated 5 years ago
ictnlp / NA-MNMT
Source code for "Importance-based Neuron Allocation for Multilingual Neural Machine Translation"
☆12 · Sep 15, 2021 · Updated 4 years ago
nelson-liu / inoculation-by-finetuning
Code for the paper "Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets", to be presented at NAACL 2019.
☆21 · Apr 4, 2019 · Updated 7 years ago
adapter-hub / adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
☆2,822 · Apr 26, 2026 · Updated 2 months ago
sameenmaruf / selective-attn
Data and code used in our NAACL'19 paper "Selective Attention for Context-aware Neural Machine Translation"
☆30 · Apr 12, 2020 · Updated 6 years ago
nyu-mll / jiant
jiant is an NLP toolkit
☆1,675 · Jul 6, 2023 · Updated 3 years ago