NorskRegnesentral / text-anonymization-benchmark
Annotated corpus + evaluation metrics for text anonymisation
☆55 Updated last year
Alternatives and similar repositories for text-anonymization-benchmark:
Users who are interested in text-anonymization-benchmark are comparing it to the libraries listed below.
- A High-level Library for Named Entity Recognition in Python. ☆23 Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫 ☆92 Updated last year
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- Explainable Zero-Shot Topic Extraction ☆62 Updated 8 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 Updated 11 months ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆106 Updated last year
- ☆76 Updated 3 years ago
- Neural models for detecting and masking personal information from texts ☆15 Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆86 Updated last week
- A python package for benchmarking interpretability techniques on Transformers. ☆212 Updated 7 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆153 Updated 11 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- Software for transferring pre-trained English models to foreign languages ☆18 Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆48 Updated 2 years ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging ☆66 Updated 3 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 10 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 Updated 3 years ago
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆176 Updated last month
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆45 Updated last year
- Collection of NLP model explanations and accompanying analysis tools ☆145 Updated last year
- Shared code for training sentence embeddings with Flax / JAX ☆27 Updated 3 years ago
- RATransformers - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! ☆41 Updated 2 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆67 Updated 2 years ago