Annotated corpus + evaluation metrics for text anonymisation
☆71Jan 19, 2026Updated last month
Alternatives and similar repositories for text-anonymization-benchmark
Users who are interested in text-anonymization-benchmark are comparing it to the libraries listed below
- Software for transferring pre-trained English models to foreign languages☆19Mar 20, 2023Updated 2 years ago
- Re-ranking task using MS MARCO dataset and Hugging Face library☆15Jun 7, 2020Updated 5 years ago
- This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire…☆266Updated this week
- ☆17Jan 13, 2025Updated last year
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition☆15Jun 28, 2023Updated 2 years ago
- Converts the active web page from Nynorsk to Norwegian (Bokmål), for increased reading pleasure.☆21Sep 16, 2025Updated 5 months ago
- ☆14Apr 10, 2024Updated last year
- DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting☆15Apr 27, 2023Updated 2 years ago
- Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).☆20Mar 7, 2022Updated 4 years ago
- ☆23Nov 15, 2019Updated 6 years ago
- ☆21Sep 21, 2021Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality …☆106Feb 26, 2024Updated 2 years ago
- skweak: A software toolkit for weak supervision applied to NLP tasks☆926Sep 2, 2024Updated last year
- Legal document classification with EuroVoc descriptors on 22 languages.☆27Jun 10, 2023Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Aug 25, 2023Updated 2 years ago
- Edo Liberty's class notes from the course Algorithms in Data Mining, given at Tel Aviv University in the academic years 2011-2013☆26May 20, 2022Updated 3 years ago
- An EUR-Lex parser for Python.☆32Feb 27, 2026Updated last week
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13☆206Feb 24, 2026Updated last week
- ☆35Feb 22, 2026Updated last week
- ☆13Oct 5, 2025Updated 5 months ago
- ☆31May 26, 2021Updated 4 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi…☆31Dec 5, 2022Updated 3 years ago
- A Python package for benchmarking interpretability techniques on Transformers.☆215Sep 29, 2024Updated last year
- models and evaluation framework for trending topics detection☆34Jun 18, 2024Updated last year
- A virtual caregiver system that extracts the expression of mental and physical health states through dialogue-based human-computer intera…☆14Jan 29, 2023Updated 3 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆88Sep 12, 2024Updated last year
- Code for pre-training CharacterBERT models (as well as BERT models).☆34Sep 6, 2021Updated 4 years ago
- These are lists for a variety of languages containing words that are distinctive to each language.☆41Apr 5, 2022Updated 3 years ago
- A library to synthesize text datasets using Large Language Models (LLM)☆152Jan 17, 2023Updated 3 years ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- Code for our EACL-2021 paper "Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs".☆38Jun 24, 2024Updated last year
- Active Learning for Text Classification in Python☆639Feb 1, 2026Updated last month
- Named entity recognition for the legal domain☆43Jun 1, 2021Updated 4 years ago
- Curated list of awesome datasets for various table understanding tasks☆18Sep 5, 2025Updated 6 months ago
- Introduction to the Random Forest algorithm for classification problems and how to select important features in your dataset.☆12Aug 1, 2020Updated 5 years ago
- OpenRAG was developed by the innovation team at Meritis. The goal of OpenRAG is to provide an intuitive tool to help users decide which …☆30Feb 27, 2026Updated last week
- Analysis of the Pegida Facebook corpus☆10Jan 31, 2015Updated 11 years ago
- Workshop on Text Classification at 1729 Conference☆13Sep 4, 2022Updated 3 years ago
- Introduction to data science and machine learning☆10Nov 2, 2017Updated 8 years ago