dhfbk / KIND
KIND: an Italian Multi-Domain Dataset for Named Entity Recognition
☆15 Updated last year
Alternatives and similar repositories for KIND:
Users who are interested in KIND are comparing it to the repositories listed below.
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆12 Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- Semantically Structured Sentence Embeddings ☆65 Updated 5 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 3 months ago
- A Python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆27 Updated 7 months ago
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 Updated last year
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆23 Updated last month
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆19 Updated 2 months ago
- The dataset and code for ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here. ☆27 Updated last year
- This repository contains code and data for the EMNLP 2022 paper "CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about… ☆10 Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. ☆24 Updated last year
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 2 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆24 Updated 4 months ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆23 Updated 9 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 Updated 3 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- ZS4IE: A Toolkit for Zero-Shot Information Extraction with Simple Verbalizations ☆26 Updated 3 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆86 Updated this week