dhfbk / KIND
KIND: an Italian Multi-Domain Dataset for Named Entity Recognition
☆15 Updated 2 years ago
Alternatives and similar repositories for KIND
Users who are interested in KIND are comparing it to the repositories listed below
- Automatically detect errors in annotated corpora. ☆47 Updated 2 years ago
- A survey of corpora for Germanic low-resource languages and dialects ☆25 Updated 10 months ago
- ☆10 Updated last year
- A software for transferring pre-trained English models to foreign languages ☆19 Updated 2 years ago
- Semantically Structured Sentence Embeddings ☆67 Updated 11 months ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 4 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 Updated 3 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 Updated 2 years ago
- This repository provides the source code used to automatically generate the book summarization datasets described in the paper titled "Ec… ☆11 Updated 5 months ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆88 Updated 4 months ago
- ☆27 Updated 7 months ago
- (NAACL 2024) Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations ☆14 Updated 5 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆61 Updated last year
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 3 years ago
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆28 Updated last year
- ☆13 Updated 3 years ago
- ☆75 Updated 4 years ago
- ☆53 Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆84 Updated last year
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 Updated 3 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆68 Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆49 Updated 3 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆27 Updated 4 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 3 years ago
- ☆22 Updated 3 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆55 Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆99 Updated 2 years ago
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23) ☆18 Updated last year
- ☆15 Updated 2 years ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆91 Updated 2 months ago