shivam5992 / dupandas
Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames
☆25 Updated 4 years ago
Alternatives and similar repositories for dupandas
Users that are interested in dupandas are comparing it to the libraries listed below
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆37 Updated last week
- Collection of code snippets and utilities for streamlit apps ☆22 Updated 5 years ago
- Predict age and gender from a first name ☆60 Updated 6 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- Set-oriented Operations in Pandas ☆24 Updated 4 years ago
- Scalable String Similarity Joins in Python ☆39 Updated 10 months ago
- A browser user interface for manual labeling of record pairs. ☆47 Updated last year
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 Updated 6 years ago
- Multidimensional data explorer and visualization tool. ☆56 Updated 7 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆24 Updated this week
- Calculate readability scores ☆41 Updated 6 years ago
- A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling). ☆15 Updated 3 years ago
- Tutorial code and data for the entity resolution workshops. ☆45 Updated 9 years ago
- A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn fr… ☆57 Updated 3 years ago
- ☆70 Updated 2 years ago
- 🧬 A VS Code extension for annotating data with Prodigy ☆30 Updated 3 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated this week
- ☆13 Updated 6 years ago
- An in depth tutorial on sklearn's Pipeline and FeatureUnion classes. ☆16 Updated 8 years ago
- ☆30 Updated 2 years ago
- Python package aiding in entity disambiguation based on string and location matching ☆18 Updated last year
- Predict whether a student will correctly answer a problem based on past performance using automated feature engineering ☆32 Updated 4 years ago
- This project is a wrapper for Leilex, a legal entity identifier API. Includes ISIN-LEI conversion. Search for an LEI number using a company name. ☆24 Updated 7 months ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆35 Updated 4 years ago
- Comparing Polars to Pandas and a small introduction ☆43 Updated 3 years ago
- A fully-featured multi-source data pipeline for continuously extracting knowledge from COVID-19 data. ☆21 Updated 3 years ago
- Creating user interfaces for data science with Jupyter widgets ☆11 Updated 7 years ago
- Model drift detection ☆11 Updated last year
- Aho-Corasick string replacement utility ☆24 Updated 5 years ago