dhfbk / KIND
KIND: an Italian Multi-Domain Dataset for Named Entity Recognition
☆15 · Updated last year
Alternatives and similar repositories for KIND:
Users who are interested in KIND are comparing it to the repositories listed below.
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆22 · Updated last month
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆83 · Updated this week
- Automatically detect errors in annotated corpora. ☆47 · Updated last year
- A Python package to run inference with HuggingFace language and vision-language checkpoints, wrapping many convenient features. ☆26 · Updated 5 months ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆23 · Updated 7 months ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ☆24 · Updated 2 months ago
- Semantically Structured Sentence Embeddings ☆66 · Updated 3 months ago
- An easy-to-use API for analyzing INCEpTION annotation projects. ☆16 · Updated last year
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 · Updated 3 years ago
- Reimplementation of a BERT-based model (Shi et al., 2019), currently the state-of-the-art for English SRL. This model implements also pred… ☆70 · Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆47 · Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆79 · Updated 7 months ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 · Updated last year
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 · Updated 10 months ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆86 · Updated last month
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated last month
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations… ☆19 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 · Updated 8 months ago
- Software for transferring pre-trained English models to foreign languages ☆18 · Updated last year
- Corpus exploration platform using advanced tools such as interactive summarization and multi-document coreference resolution ☆12 · Updated last year
- The dataset and code for the ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here. ☆27 · Updated last year
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆29 · Updated 3 months ago
- Noise-robust de-duplication at scale ☆16 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 · Updated last week
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation" ☆15 · Updated last year