microsoft / presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
☆253 Updated 2 weeks ago
Alternatives and similar repositories for presidio-research
Users who are interested in presidio-research compare it to the libraries listed below.
- A research python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII) ☆94 Updated 2 weeks ago
- SpanMarker for Named Entity Recognition ☆462 Updated 11 months ago
- Deliver safe & effective language models ☆548 Updated this week
- Robust de-identification of medical notes using transformer architectures ☆57 Updated 3 years ago
- Public blueprints for data use cases ☆85 Updated 3 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆244 Updated 2 years ago
- CUAD (NeurIPS 2021) ☆463 Updated 2 years ago
- Fiddler Auditor is a tool to evaluate language models. ☆188 Updated last year
- Generates synthetic data and user interfaces for privacy-preserving data sharing and analysis. ☆123 Updated last year
- Examples scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text. ☆85 Updated 3 months ago
- A spaCy wrapper for GliNER ☆126 Updated 11 months ago
- Zero and Few shot named entity & relationships recognition ☆398 Updated 3 months ago
- Metafeature Extraction for Unstructured Data ☆103 Updated 9 months ago
- A Python library to de-identify medical records with state-of-the-art NLP methods. ☆142 Updated last month
- Additional packages (components, document stores and the likes) to extend the capabilities of Haystack ☆177 Updated this week
- Efficiently find the best-suited language model (LM) for your NLP task ☆132 Updated 5 months ago
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ☆321 Updated 5 months ago
- Annotated corpus + evaluation metrics for text anonymisation ☆70 Updated 5 months ago
- ✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3 ☆323 Updated 2 years ago
- 🍳 Recipes for the Prodigy, our fully scriptable annotation tool ☆504 Updated last year
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆92 Updated 4 years ago
- ☆39 Updated 2 years ago
- Synthetic Data SDK ✨ ☆694 Updated this week
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆220 Updated 11 months ago
- Synthetic data generators for structured and unstructured text, featuring differentially private learning. ☆669 Updated 6 months ago
- ☆33 Updated 3 years ago
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆477 Updated 5 months ago
- A curated list of awesome synthetic data tools (open source and commercial). ☆230 Updated last year
- Pebblo enables developers to safely load data and promote their Gen AI app to deployment ☆149 Updated 6 months ago
- 📚 Process PDFs, Word documents and more with spaCy ☆832 Updated 9 months ago