edwardcooper / piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
☆43 Updated 5 years ago
Related projects
Alternatives and complementary repositories for piidetect
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆43 Updated 3 years ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆44 Updated 4 months ago
- Provides an easy way with Python to protect your data sources by searching their metadata. 🛡️ ☆17 Updated 3 weeks ago
- Language detection using spaCy and fastText ☆54 Updated 11 months ago
- ☆46 Updated last year
- Python package for deduplication/entity resolution using active learning ☆78 Updated 2 months ago
- ☆29 Updated 2 years ago
- ☆34 Updated 3 months ago
- A fully-featured multi-source data pipeline for continuously extracting knowledge from COVID-19 data. ☆21 Updated 3 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- Record matching and entity resolution at scale in Spark ☆31 Updated last year
- Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames ☆25 Updated 3 years ago
- Package that returns a company embedding given a company name ☆42 Updated 4 years ago
- ☆21 Updated 3 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆62 Updated 8 months ago
- ☆29 Updated last week
- Library for identification, anonymization and de-anonymization of PII data ☆22 Updated last year
- ☆35 Updated 2 years ago
- Generate reports for spaCy models. ☆28 Updated 2 years ago
- A Python package that provides a custom Streamlit connection to query data from Weaviate, the AI-native vector database ☆52 Updated 3 months ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 2 years ago
- List of entity resolution software and resources. ☆38 Updated 8 months ago
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆75 Updated 4 months ago
- Deploy MLflow models as JSON APIs with minimal new code ☆19 Updated last year
- A convenient way to anonymize your data for analytics ☆20 Updated 3 years ago
- Aim-spaCy integration ☆34 Updated last year
- A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII) ☆74 Updated last year
- With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this… ☆21 Updated last year
- 📚 Process PDFs, Word documents and more with spaCy ☆75 Updated this week
- Dataframe Integration with spaCy. ☆101 Updated 3 years ago