GeorgeVern / smala
Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages".
☆13Updated 4 years ago
Alternatives and similar repositories for smala
Users who are interested in smala are comparing it to the repositories listed below.
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi…☆31Updated 2 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"☆30Updated 3 years ago
- ☆25Updated last year
- ☆28Updated 10 months ago
- ☆25Updated 2 years ago
- ☆13Updated 4 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models".☆58Updated 5 years ago
- ☆14Updated 4 years ago
- ☆11Updated 3 years ago
- This repo supports various cross-lingual transfer learning & multilingual NLP models.☆92Updated 2 years ago
- ☆58Updated 3 years ago
- PyTorch code for "FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization" (NAACL 2022)☆40Updated 3 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆27Updated 4 years ago
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition"☆32Updated 2 years ago
- This dataset contains human judgements about answer equivalence. The data is based on SQuAD (Stanford Question Answering Dataset), and co…☆27Updated 2 years ago
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling☆34Updated 3 years ago
- ☆92Updated 4 years ago
- ☆24Updated 2 years ago
- Software for transferring pre-trained English models to foreign languages☆19Updated 2 years ago
- ☆22Updated 2 years ago
- A Multi-subject High School Examinations Dataset for Cross-lingual and Multilingual Question Answering☆45Updated 3 years ago
- ☆14Updated 4 years ago
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models"☆14Updated 3 years ago
- The Stanford Word Substitution (Swords) Benchmark☆32Updated 3 years ago
- Automatically harvested multilingual contrastive word sense disambiguation test sets for machine translation☆17Updated 4 years ago
- Code for EMNLP2021 paper "Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training"☆20Updated 3 years ago
- Code for the paper "Factorising Meaning and Form for Intent-Preserving Paraphrasing", Tom Hosking & Mirella Lapata (ACL 2021)☆27Updated last year
- ☆19Updated 4 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi…☆49Updated 4 years ago
- [COLING'22] Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".☆61Updated 2 years ago