microsoft / presidio-research
This package features data-science-related tasks for developing new recognizers for Presidio. It is used to evaluate the entire system, as well as specific PII recognizers or PII detection models.
☆165 Updated last month
Related projects:
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆240 Updated last year
- ☆34 Updated last year
- spaCy NER annotator using ipywidgets ☆120 Updated 5 months ago
- SpanMarker for Named Entity Recognition ☆384 Updated last month
- CUAD (NeurIPS 2021) ☆379 Updated last year
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆43 Updated 5 years ago
- Public runnable examples of using John Snow Labs' OCR for Apache Spark. ☆85 Updated last week
- Models and Pipelines for the Spark NLP library ☆112 Updated 3 years ago
- All the go-to functions you need to handle NLP use cases, integrated in NLPretext ☆139 Updated 5 months ago
- A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII) ☆67 Updated last year
- Fuzzy matching and more functionality for spaCy. ☆249 Updated 2 months ago
- Clustering sentence embeddings to extract message intent ☆166 Updated 2 years ago
- Scripts for Medium articles ☆58 Updated 3 months ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆87 Updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆208 Updated 3 months ago
- Explainable Zero-Shot Topic Extraction ☆62 Updated last month
- A repository that showcases how you can use ZenML with Git ☆62 Updated last month
- A comprehensive reference for all topics related to building and maintaining microservices ☆67 Updated last year
- Zero and Few shot named entity & relationships recognition ☆340 Updated this week
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆397 Updated 3 years ago
- A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data… ☆241 Updated 4 months ago
- A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in … ☆413 Updated 7 months ago
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs. ☆21 Updated last year
- Evaluation of language models on mono- or multilingual tasks. ☆71 Updated last month
- Project for open sourcing research efforts on Backward Compatibility in Machine Learning ☆70 Updated 11 months ago
- Enterprise Scale NLP with Hugging Face & SageMaker Workshop series ☆228 Updated last year
- 🍳 Recipes for Prodigy, our fully scriptable annotation tool ☆477 Updated last month
- Few-shot Named Entity Recognition ☆121 Updated 2 years ago
- Recon NER: debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated 6 months ago
- ✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3 ☆318 Updated last year