NorskRegnesentral / text-anonymization-benchmark
Annotated corpus + evaluation metrics for text anonymisation
☆48Updated 7 months ago
Related projects:
- A spaCy custom component that extracts and normalizes temporal expressions☆53Updated last year
- ☆73Updated 3 years ago
- A Python library aimed at dissecting and augmenting NER training data.☆56Updated last year
- Explainable Zero-Shot Topic Extraction☆62Updated last month
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020☆62Updated 4 months ago
- ☆24Updated 8 months ago
- Automatically detect errors in annotated corpora.☆45Updated last year
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement…☆16Updated 3 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu…☆39Updated 2 years ago
- Data programming by demonstration for information extraction and span annotation☆35Updated 3 years ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging☆65Updated 2 years ago
- Data Programming by Demonstration (DPBD) for Document Classification☆36Updated 3 years ago
- RATransformers 🐭- Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware!☆41Updated last year
- An embeddable annotation tool for end-to-end cross-document coreference☆41Updated last year
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio…☆38Updated 7 months ago
- ☆82Updated 3 weeks ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆149Updated 3 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆79Updated 2 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/☆80Updated 3 weeks ago
- A High-level Library for Named Entity Recognition in Python.☆23Updated 9 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆40Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆72Updated 2 months ago
- Source code and data for Like a Good Nearest Neighbor☆28Updated 7 months ago
- ☆64Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆99Updated 4 months ago
- Shared code for training sentence embeddings with Flax / JAX☆27Updated 3 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models☆63Updated last year
- Dataset containing scroll interactions of 598 participants reading advanced and elementary texts from the OneStopEnglish corpus☆15Updated 2 years ago
- An implementation of GrASP (Shnarch et al., 2017)☆21Updated 2 years ago
- Collection of NLP model explanations and accompanying analysis tools☆143Updated last year