This package provides data-science tooling for developing new recognizers for Presidio. It is used to evaluate the entire system, as well as specific PII recognizers or PII detection models.
☆288 · May 24, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for presidio-research
Users that are interested in presidio-research are comparing it to the libraries listed below.
- An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data… ☆8,599 · Jun 7, 2026 · Updated last week
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 · Jan 7, 2026 · Updated 5 months ago
- A CLI for identifying potential Personally Identifiable Information in datasets. ☆14 · Apr 9, 2019 · Updated 7 years ago
- Annotated corpus + evaluation metrics for text anonymisation ☆74 · Jan 19, 2026 · Updated 4 months ago
- A research python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII) ☆99 · Feb 15, 2026 · Updated 4 months ago
- Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub ☆345 · Jan 5, 2024 · Updated 2 years ago
- A project to build a machine learning pipeline to detect personal identifiable information (PII) ☆16 · Dec 8, 2022 · Updated 3 years ago
- ReconNER, Debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data. ☆34 · Jul 26, 2020 · Updated 5 years ago
- Robust de-identification of medical notes using transformer architectures ☆62 · Jun 27, 2022 · Updated 3 years ago
- Research simulation toolkit for federated learning ☆13 · Nov 7, 2020 · Updated 5 years ago
- Unofficial Python client for Azure cognitive search ☆11 · Jun 7, 2019 · Updated 7 years ago
- Generate reports for spaCy models. ☆29 · May 27, 2022 · Updated 4 years ago
- An AI-powered Personal Identifiable Information (PII) scanner. ☆733 · Jan 22, 2025 · Updated last year
- ☆12 · Jun 25, 2024 · Updated last year
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆46 · Jan 1, 2026 · Updated 5 months ago
- SpanMarker for Named Entity Recognition ☆476 · Apr 10, 2026 · Updated 2 months ago
- Knowledge Extraction For Forms Accelerators & Examples ☆223 · Jul 9, 2024 · Updated last year
- PyTorch ObjectDetection Modules and ONNX ops ☆18 · Jun 12, 2023 · Updated 3 years ago
- ☆10 · Jul 12, 2023 · Updated 2 years ago
- This is the implementation of the TextNAS algorithm proposed in the paper TextNAS: A Neural Architecture Search Space tailored for Text R… ☆15 · Nov 28, 2022 · Updated 3 years ago
- tempeh is a framework to TEst Machine learning PErformance exHaustively which includes tracking memory usage and run time. ☆18 · Jan 3, 2022 · Updated 4 years ago
- Microsoft Cognitive Services, Computer Vision API, OCR Visualizer on documents ☆19 · Dec 8, 2022 · Updated 3 years ago
- The note taking app that doesn't suck ☆16 · May 27, 2026 · Updated 2 weeks ago
- Fuzzy matching and more functionality for spaCy. ☆258 · Jul 6, 2024 · Updated last year
- Extract Molecular SMILES embeddings from language models pre-trained with various objectives architectures. ☆19 · Nov 9, 2023 · Updated 2 years ago
- An NLP pipeline for COVID-19 surveillance used in the Department of Veterans Affairs Biosurveillance. ☆15 · Oct 20, 2022 · Updated 3 years ago
- ☆18 · Jan 13, 2025 · Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode… ☆21 · Mar 20, 2026 · Updated 2 months ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 · Sep 2, 2024 · Updated last year
- A spaCy wrapper for GliNER ☆135 · Jan 29, 2025 · Updated last year
- ☆14 · Feb 1, 2021 · Updated 5 years ago
- Language detection using Spacy and Fasttext ☆54 · Dec 17, 2023 · Updated 2 years ago
- Cloud Scanner is a cloud agnostic tool that extracts cloud based resources from cloud providers like Azure and ingests them into a config… ☆13 · Dec 8, 2018 · Updated 7 years ago
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) ☆3,263 · Jun 2, 2026 · Updated 2 weeks ago
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 · Sep 1, 2023 · Updated 2 years ago
- Demonstrate samples and good engineering practice for operationalizing machine learning solutions. ☆20 · Dec 2, 2021 · Updated 4 years ago
- mermaid loader for webpack ☆15 · Oct 25, 2016 · Updated 9 years ago
- Self-Supervision for Named Entity Disambiguation at the Tail ☆218 · Jun 14, 2022 · Updated 4 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆105 · Apr 23, 2024 · Updated 2 years ago