NorskRegnesentral / text-anonymization-benchmark
Annotated corpus + evaluation metrics for text anonymisation
☆55 Updated last year
Alternatives and similar repositories for text-anonymization-benchmark:
Users who are interested in text-anonymization-benchmark are comparing it to the libraries listed below:
- SpaCy wrapper for ConceptNet ☆92 Updated last year
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆106 Updated 11 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 Updated 10 months ago
- Explainable Zero-Shot Topic Extraction ☆62 Updated 7 months ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated 11 months ago
- Few-shot Named Entity Recognition ☆123 Updated 3 years ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging ☆66 Updated 2 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- A High-level Library for Named Entity Recognition in Python. ☆23 Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆153 Updated 10 months ago
- ☆75 Updated 3 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated last year
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 Updated 2 years ago
- A python package for benchmarking interpretability techniques on Transformers. ☆212 Updated 6 months ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 Updated last year
- Semantically Structured Sentence Embeddings ☆65 Updated 5 months ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆62 Updated 11 months ago
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆102 Updated 2 years ago
- ☆85 Updated last week
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval'13 ☆171 Updated 2 weeks ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 3 months ago
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- Neural models for detecting and masking personal information from texts ☆15 Updated 2 years ago
- Creating class-based TF-IDF matrices ☆83 Updated 2 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 Updated 3 years ago