NorskRegnesentral / text-anonymization-benchmark
Annotated corpus + evaluation metrics for text anonymisation
☆71 · Jan 19, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for text-anonymization-benchmark
Users who are interested in text-anonymization-benchmark are comparing it to the libraries listed below.
- Neural models for detecting and masking personal information from texts ☆16 · Nov 25, 2022 · Updated 3 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 · Apr 25, 2024 · Updated last year
- Software for transferring pre-trained English models to foreign languages ☆19 · Mar 20, 2023 · Updated 2 years ago
- Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy" ☆12 · Feb 20, 2023 · Updated 2 years ago
- ☆17 · Jan 13, 2025 · Updated last year
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition ☆15 · Jun 28, 2023 · Updated 2 years ago
- ☆14 · Apr 10, 2024 · Updated last year
- DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting ☆15 · Apr 27, 2023 · Updated 2 years ago
- Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow). ☆20 · Mar 7, 2022 · Updated 3 years ago
- A web interface to understand language-specific BERT-models ☆18 · Apr 16, 2024 · Updated last year
- ☆18 · Jul 23, 2021 · Updated 4 years ago
- E-NER: Evidential Deep Learning for Trustworthy Named Entity Recognition ☆25 · Jul 18, 2023 · Updated 2 years ago
- CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022) ☆49 · Aug 20, 2024 · Updated last year
- Open source text annotation software created by the French supreme court 'Cour de cassation' ☆24 · Dec 9, 2025 · Updated 2 months ago
- ☆23 · Nov 15, 2019 · Updated 6 years ago
- Code for Findings-ACL 2023 paper: Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Rec… ☆47 · Jun 3, 2024 · Updated last year
- Collection of NLP model explanations and accompanying analysis tools ☆144 · Jun 26, 2023 · Updated 2 years ago
- ☆21 · Sep 21, 2021 · Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Feb 26, 2024 · Updated last year
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 · Sep 2, 2024 · Updated last year
- Legal document classification with EuroVoc descriptors on 22 languages. ☆27 · Jun 10, 2023 · Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆27 · Apr 21, 2023 · Updated 2 years ago
- Edo Liberty's class notes from the course Algorithms in Data Mining given in Tel Aviv University in academic years 2011-2013 ☆26 · May 20, 2022 · Updated 3 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Aug 25, 2023 · Updated 2 years ago
- An EUR-Lex parser for Python. ☆32 · Jun 27, 2024 · Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆204 · Sep 6, 2025 · Updated 5 months ago
- ☆35 · Updated this week
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 · Dec 5, 2022 · Updated 3 years ago
- ☆84 · Feb 15, 2023 · Updated 3 years ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆35 · Aug 2, 2023 · Updated 2 years ago
- CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system ☆76 · Dec 9, 2022 · Updated 3 years ago
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models ☆12 · May 30, 2025 · Updated 8 months ago
- A virtual caregiver system that extracts the expression of mental and physical health states through dialogue-based human-computer intera… ☆14 · Jan 29, 2023 · Updated 3 years ago
- M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis ☆13 · Nov 24, 2025 · Updated 2 months ago
- On Generating Extended Summaries of Long Documents ☆78 · Jan 26, 2021 · Updated 5 years ago
- Java implementation of the EbMS 2.0 specification. ☆10 · Updated this week
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 · Jan 17, 2023 · Updated 3 years ago
- Code and Data for Evaluation WG ☆42 · May 4, 2022 · Updated 3 years ago
- Embedding Recycling for Language models ☆38 · Jul 11, 2023 · Updated 2 years ago