This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
☆282 · May 12, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for presidio-research
Users who are interested in presidio-research are comparing it to the libraries listed below.
- An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data… ☆8,243 · May 20, 2026 · Updated last week
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 · Jan 7, 2026 · Updated 4 months ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆49 · Jun 2, 2019 · Updated 6 years ago
- A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII) ☆99 · Feb 15, 2026 · Updated 3 months ago
- Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub ☆342 · Jan 5, 2024 · Updated 2 years ago
- ReconNER, Debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data. ☆34 · Jul 26, 2020 · Updated 5 years ago
- Robust de-identification of medical notes using transformer architectures ☆60 · Jun 27, 2022 · Updated 3 years ago
- Research simulation toolkit for federated learning ☆13 · Nov 7, 2020 · Updated 5 years ago
- Unofficial Python client for Azure cognitive search ☆11 · Jun 7, 2019 · Updated 6 years ago
- Finds linguistic patterns effortlessly ☆39 · Aug 29, 2023 · Updated 2 years ago
- Library for identification, anonymization and de-anonymization of PII data ☆22 · Dec 26, 2022 · Updated 3 years ago
- Generate reports for spaCy models. ☆29 · May 27, 2022 · Updated 4 years ago
- ☆12 · Jun 25, 2024 · Updated last year
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆46 · Jan 1, 2026 · Updated 4 months ago
- aicreator for aidata ☆14 · May 17, 2023 · Updated 3 years ago
- A Python module that provides multiple anonymization techniques for text (This is only a prototype) ➡️ The project has moved to: https://… ☆26 · Mar 20, 2026 · Updated 2 months ago
- SpanMarker for Named Entity Recognition ☆473 · Apr 10, 2026 · Updated last month
- In browser active learning and guided search ☆17 · May 6, 2023 · Updated 3 years ago
- This is the implementation of the TextNAS algorithm proposed in the paper TextNAS: A Neural Architecture Search Space tailored for Text R… ☆15 · Nov 28, 2022 · Updated 3 years ago
- The note taking app that doesn't suck ☆16 · May 18, 2026 · Updated last week
- Fuzzy matching and more functionality for spaCy. ☆258 · Jul 6, 2024 · Updated last year
- An NLP pipeline for COVID-19 surveillance used in the Department of Veterans Affairs Biosurveillance. ☆15 · Oct 20, 2022 · Updated 3 years ago
- ☆18 · Jan 13, 2025 · Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode… ☆21 · Mar 20, 2026 · Updated 2 months ago
- A Python interface for NIH Reporter APIs ☆12 · Feb 4, 2025 · Updated last year
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆927 · Sep 2, 2024 · Updated last year
- A spaCy wrapper for GliNER ☆135 · Jan 29, 2025 · Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆215 · Mar 12, 2026 · Updated 2 months ago
- Language detection using spaCy and fastText ☆54 · Dec 17, 2023 · Updated 2 years ago
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) ☆3,210 · May 13, 2026 · Updated last week
- CLI for managing and generating Foundation Model prompts ☆18 · Apr 14, 2026 · Updated last month
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 · Sep 1, 2023 · Updated 2 years ago
- This sample project shows off how to prepare and deploy to Azure Web Apps a simple Python web service with an image classifying model pro… ☆26 · Feb 5, 2018 · Updated 8 years ago
- Demonstrate samples and good engineering practice for operationalizing machine learning solutions. ☆20 · Dec 2, 2021 · Updated 4 years ago
- [DEPRECATED] Piano Transformer model trained on 2.6GB of MIDI piano music ☆13 · Oct 10, 2022 · Updated 3 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆105 · Apr 23, 2024 · Updated 2 years ago
- A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang… ☆1,573 · Jun 12, 2025 · Updated 11 months ago
- ☆20 · Jul 24, 2024 · Updated last year
- spaCy pipeline object for negating concepts in text ☆282 · Apr 20, 2026 · Updated last month