edwardcooper / piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
☆43 Updated 5 years ago
Related projects:
- A research python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII) ☆67 Updated last year
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆41 Updated 3 years ago
- S3 vector database for LLM Agents and RAG. ☆28 Updated last year
- A project to build a machine learning pipeline to detect personal identifiable information (PII) ☆16 Updated last year
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ☆17 Updated this week
- This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire… ☆165 Updated last month
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆43 Updated 2 months ago
- Language detection using Spacy and Fasttext ☆53 Updated 9 months ago
- Python package for deduplication/entity resolution using active learning ☆77 Updated 3 weeks ago
- Search for PII in Python ☆26 Updated 7 months ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆32 Updated 10 months ago
- Library for identification, anonymization and de-anonymization of PII data ☆22 Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆61 Updated 6 months ago
- A personal knowledge base that I can dump information to and help me learn ☆24 Updated 3 months ago
- A Docker Wrapper to make the machine easily learn any language on top of INRIA OSCAR dataset using GPT2 ☆10 Updated 4 years ago
- A python package that provides a custom streamlit connection to query data from weaviate, the AI native vector database ☆49 Updated last month
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆37 Updated 2 years ago
- Train a model, and detect gibberish strings with it. ☆59 Updated 2 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆57 Updated 4 months ago
- A simple search engine to search medium stories built with streamlit and elasticsearch. ☆40 Updated 2 years ago
- Command Line Interface for Hugging Face Inference Endpoints ☆65 Updated 5 months ago
- Library for iPython notebooks for evaluating factuality. ☆50 Updated last year