NorskRegnesentral/text-anonymization-benchmark

Annotated corpus + evaluation metrics for text anonymisation

☆77

Alternatives and similar repositories for text-anonymization-benchmark

Users who are interested in text-anonymization-benchmark are comparing it to the repositories listed below.

NorskRegnesentral / NeuralTextSanitizer
Neural models for detecting and masking personal information from texts
☆16 • Nov 25, 2022 • Updated 3 years ago
openredact / nerwhal
This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode…
☆21 • Mar 20, 2026 • Updated 4 months ago
eth-sri / SynthPAI
A Synthetic Dataset for Personal Attribute Inference (NeurIPS'24 D&B)
☆58 • Jul 27, 2025 • Updated last year
alexa / ramen
A software for transferring pre-trained English models to foreign languages
☆20 • Mar 20, 2023 • Updated 3 years ago
ltgoslo / NorQuAD
Norwegian question answering dataset
☆15 • Feb 3, 2024 • Updated 2 years ago
Cour-de-cassation / moteurNER
Communication about the pseudonymisation engine of the Cour de Cassation
☆19 • Feb 14, 2023 • Updated 3 years ago
xiangyue9607 / Sentence-LDP
Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy"
☆12 • Feb 20, 2023 • Updated 3 years ago
Knowledgator / RetriCo
Efficient and modular GraphRAG system
☆50 • Jul 7, 2026 • Updated 2 weeks ago
data-privacy-stack / presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire…
☆300 • Updated this week
facebookresearch / romqa
A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
☆18 • Jan 7, 2023 • Updated 3 years ago
ssun32 / CLIRMatrix
☆18 • Jul 23, 2021 • Updated 5 years ago
neuralmind-ai / coliee
Code to reproduce NeuralMind's submissions to COLIEE 2021 and COLIEE 2022
☆24 • Jun 7, 2022 • Updated 4 years ago
NorskRegnesentral / skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
☆924 • Sep 2, 2024 • Updated last year
msang / hateval
☆23 • Nov 15, 2019 • Updated 6 years ago
vmenger / deduce
Deduce: de-identification method for Dutch medical text
☆69 • Feb 10, 2026 • Updated 5 months ago
dgaddy / parser-analysis
☆22 • Apr 13, 2018 • Updated 8 years ago
AICGijon / quantificationlib
QuantificationLib is an open-source library for quantification learning.
☆13 • Apr 6, 2024 • Updated 2 years ago
kyutai-labs / ARC-Encoder
☆30 • Jan 5, 2026 • Updated 6 months ago
rashad101 / ELiDi
This repository includes all the code and data for the paper ELiDi (End2end Entity Linking and Disambiguation)
☆14 • Jul 18, 2021 • Updated 5 years ago
jeffhj / LM_PersonalInfoLeak
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
☆29 • Oct 31, 2022 • Updated 3 years ago
ddosecrets / pii_redaction_standard
An attempt to develop standards for PII redaction.
☆17 • Mar 9, 2021 • Updated 5 years ago
Knowledgator / GLiClass
Generalist and Lightweight Model for Text Classification
☆234 • Updated this week
cltl-students / verkijk_stella_rma_thesis_dutch_medical_language_model
☆19 • Jan 13, 2025 • Updated last year
ljos / navnkjenner
Named-Entity Recognition for Norwegian Bokmål and Nynorsk
☆12 • Aug 5, 2019 • Updated 6 years ago
machelreid / m2d2
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54 • Nov 21, 2022 • Updated 3 years ago
StonyBrookNLP / PerSenT
[COLING2020] A challenge dataset for Person SenTiment analysis in news domain.
☆11 • May 2, 2022 • Updated 4 years ago
compling-potsdam / misc-courses
☆19 • Apr 22, 2026 • Updated 3 months ago
mainlp / CrossRE
CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022)
☆49 • Aug 20, 2024 • Updated last year
microsoft / analysing_pii_leakage
The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred…
☆104 • Aug 13, 2024 • Updated last year
DIPSAS / DockerBuildManagement
Build Management is a python application, installed with pip. The application makes it easy to manage a build system based on Docker by c…
☆14 • Sep 22, 2021 • Updated 4 years ago
SrishtiGautam / ProtoVAE
☆16 • Jun 8, 2023 • Updated 3 years ago
MantisAI / nervaluate
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
☆221 • Mar 12, 2026 • Updated 4 months ago
nihaljn / datahawk
Viewer for text datasets in formats like HuggingFace, JSONL, etc.
☆15 • Feb 25, 2025 • Updated last year
ben-aaron188 / textwash
☆36 • Feb 22, 2026 • Updated 5 months ago
DFKI-NLP / thermostat
Collection of NLP model explanations and accompanying analysis tools
☆141 • Jun 26, 2023 • Updated 3 years ago
MichaelEinhorn / trl-textworld
☆13 • May 7, 2023 • Updated 3 years ago
keyonvafa / sequential-rationales
Rationales for Sequential Predictions
☆39 • Mar 10, 2022 • Updated 4 years ago
TideDancer / iclr21_isotropy_contxt
☆33 • May 26, 2021 • Updated 5 years ago
AdityaLab / CAMul
☆14 • Feb 18, 2022 • Updated 4 years ago