bltlab / paranames
ParaNames: A multilingual resource for parallel names
☆30 Updated 6 months ago
Related projects
Alternatives and complementary repositories for paranames
- A survey of corpora for Germanic low-resource languages and dialects ☆24 Updated 3 months ago
- GC4LM: A Colossal (Biased) language model for German ☆13 Updated 3 years ago
- LTG-Bert ☆29 Updated 10 months ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 Updated 2 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆37 Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆29 Updated last month
- Code for SaGe subword tokenizer (EACL 2023) ☆22 Updated this week
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆82 Updated last month
- Multilingual Open Text ☆25 Updated 3 weeks ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆30 Updated last year
- Evaluation code and data for "Automatic Correction of Human Translations" [NAACL 2022]. ☆19 Updated last year
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 Updated last year
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆25 Updated 4 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. ☆34 Updated 2 years ago
- Neural models for detecting and masking personal information from texts ☆14 Updated last year
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆56 Updated last year
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆21 Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆50 Updated last year
- ☆73 Updated 3 years ago
- XL-AMR is a sequence-to-graph cross-lingual AMR parser that exploits transfer learning (EMNLP2020). ☆16 Updated 3 months ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆63 Updated last year
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 Updated 3 years ago
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆30 Updated last year
- Data for the HIPE 2022 shared task. ☆16 Updated 11 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆11 Updated last year
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆21 Updated last month
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆67 Updated 3 years ago
- ☆64 Updated last year
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆34 Updated last year
- A python module for evaluating NERC and NEL system performances as defined in the HIPE shared tasks (formerly CLEF-HIPE-2020-scorer). ☆13 Updated 5 months ago