mt-upc / transformer-contributions
Measuring the Mixing of Contextual Information in the Transformer
★25 · Updated last year
Related projects
Alternatives and complementary repositories for transformer-contributions
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 ★13 · Updated 7 months ago
- ★15 · Updated 2 years ago
- ★77 · Updated 7 months ago
- The geometry of multilingual language model representations (EMNLP 2022). ★15 · Updated 2 years ago
- ★17 · Updated 10 months ago
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ★31 · Updated 3 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ★31 · Updated 2 years ago
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers ★21 · Updated last year
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ★29 · Updated 2 years ago
- ★24 · Updated 3 years ago
- ★24 · Updated 5 months ago
- Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper. ★32 · Updated last year
- ★34 · Updated 6 months ago
- Benchmark API for Multidomain Language Modeling ★24 · Updated 2 years ago
- ★14 · Updated 3 years ago
- ★58 · Updated 2 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ★30 · Updated last year
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ★21 · Updated 2 weeks ago
- ★20 · Updated 2 years ago
- Rationales for Sequential Predictions ★40 · Updated 2 years ago
- ★28 · Updated last year
- Influence Experiments ★35 · Updated last year
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ★13 · Updated 3 years ago
- ★37 · Updated 3 years ago
- ★20 · Updated 3 years ago
- DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022) ★54 · Updated last year
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ★23 · Updated last month
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ★59 · Updated 4 years ago
- ★33 · Updated 3 years ago
- DEMix Layers for Modular Language Modeling ★53 · Updated 3 years ago