valentinhofmann / superbizarre
Code and data for "Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words"
☆15 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for superbizarre
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆30 · Updated last year
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 · Updated 2 years ago
- Statistics on multilingual datasets ☆17 · Updated 2 years ago
- A survey of corpora for Germanic low-resource languages and dialects ☆24 · Updated 3 months ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 · Updated last year
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 · Updated 3 years ago
- ☆73 · Updated 3 years ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ☆14 · Updated 2 years ago
- ☆31 · Updated last year
- Library for experimenting with state-of-the-art evaluation metrics like UScore ☆11 · Updated last year
- This is a repository for the paper on testing inductive bias with scaled-down RoBERTa models. ☆19 · Updated 2 years ago
- How Contextual are Contextualized Word Representations? ☆39 · Updated 4 years ago
- Code and CoarseWSD-20 datasets for "Language Models and Word Sense Disambiguation: An Overview and Analysis" ☆23 · Updated 2 years ago
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022) ☆18 · Updated 2 years ago
- [EMNLP 2020] Collective HumAn OpinionS on Natural Language Inference Data ☆33 · Updated 2 years ago
- Pytorch Seq2Seq framework ☆26 · Updated last month
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 · Updated 3 years ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆76 · Updated 7 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆11 · Updated last year
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 · Updated 3 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆82 · Updated last month
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ☆31 · Updated 3 years ago
- This repository hosts the code for a tokenizer of tweets. ☆12 · Updated 5 years ago
- ☆14 · Updated 4 years ago
- A repository for experiments in quality-aware decoding ☆15 · Updated 2 years ago
- [COLING 2022]: CommunityLM: Probing Partisan Worldviews from Language Models ☆13 · Updated last year
- The Benchmark of Linguistic Minimal Pairs ☆142 · Updated last year
- Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Ann… ☆29 · Updated 4 years ago
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages ☆13 · Updated 4 years ago