microsoft / presidio-research
This package provides data science tooling for developing new recognizers for Presidio. It is used to evaluate the entire system as well as specific PII recognizers or PII detection models.
☆185 · Updated this week
Alternatives and similar repositories for presidio-research:
Users who are interested in presidio-research compare it to the libraries listed below.
- SpanMarker for Named Entity Recognition ☆414 · Updated 3 weeks ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆43 · Updated 5 years ago
- Zero and Few shot named entity & relationships recognition ☆358 · Updated 2 months ago
- A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII) ☆79 · Updated last year
- A Python library to de-identify medical records with state-of-the-art NLP methods. ☆124 · Updated last year
- Fiddler Auditor is a tool to evaluate language models. ☆174 · Updated 10 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆244 · Updated last year
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆192 · Updated last year
- An open-source compliance-centered evaluation framework for Generative AI models ☆123 · Updated last month
- Annotated corpus + evaluation metrics for text anonymisation ☆53 · Updated 11 months ago
- Open source no-code system for text annotation and building of text classifiers ☆254 · Updated 5 months ago
- A library to synthesize text datasets using Large Language Models (LLMs) ☆151 · Updated 2 years ago
- 📚 A curated list of papers & technical articles on AI Quality & Safety ☆165 · Updated last year
- Robust de-identification of medical notes using transformer architectures ☆50 · Updated 2 years ago
- Deliver safe & effective language models ☆508 · Updated this week
- Metafeature Extraction for Unstructured Data ☆101 · Updated 5 months ago
- Sample notebooks and prompts for LLM evaluation ☆119 · Updated 2 months ago
- Data for the Chat With Your Data benchmark. ☆128 · Updated last year
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆212 · Updated last week
- CUAD (NeurIPS 2021) ☆403 · Updated last year
- Explainable Zero-Shot Topic Extraction ☆62 · Updated 5 months ago
- A spaCy wrapper for GliNER ☆105 · Updated this week
- Nesta's Skills Extractor Library ☆126 · Updated 2 months ago
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆313 · Updated 2 months ago
- Find and fix bugs in natural language machine learning models using adaptive testing. ☆181 · Updated 8 months ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆90 · Updated 3 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆104 · Updated 8 months ago
- Practical examples of "Flawed Machine Learning Security" together with ML Security best practice across the end to end stages of the mach… ☆105 · Updated 2 years ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆114 · Updated 2 weeks ago
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 · Updated 11 months ago