OlivierBinette / StringCompare
Efficient String Comparison Functions and Fuzzy String Matching
☆17 · Updated 2 years ago
Related projects:
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆24 · Updated 9 months ago
- A tutorial on entity resolution (record linkage or de-duplication) ☆61 · Updated 4 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆134 · Updated 2 years ago
- ☆9 · Updated 3 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 · Updated 3 years ago
- A text processing pipeline for turning unstructured text data into hierarchical datasets ☆13 · Updated 4 years ago
- pyspark-parallelised functions producing graph-theoretical metrics in connected component clusters for use in record-linkage (or other do… ☆10 · Updated 10 months ago
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆69 · Updated 3 weeks ago
- Assessing and Improving data quality in OpenAlex ☆11 · Updated last year
- A very simple library for exploiting graph-of-words in NLP ☆12 · Updated 3 years ago
- ☆32 · Updated 3 years ago
- Fast, flexible name matching for large datasets ☆69 · Updated 9 months ago
- Classify names by gender, U.S. ethnicity, or leaf nationality ☆19 · Updated 5 years ago
- INFO 5613 Network Science ☆22 · Updated 2 years ago
- Entity resolution using zero labeled examples ☆26 · Updated 2 months ago
- ☆15 · Updated 2 years ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆15 · Updated last year
- Pre-print: ☆11 · Updated 11 months ago
- Named Entity Disambiguation and Linking ☆14 · Updated 3 months ago
- A rolling version of the Latent Dirichlet Allocation. ☆12 · Updated 9 months ago
- List of entity resolution software and resources. ☆31 · Updated 6 months ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite ☆87 · Updated 7 months ago
- ☆22 · Updated 3 months ago
- A Fuzzy Matching Approach for Clustering Strings ☆26 · Updated last year
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o… ☆45 · Updated 6 years ago
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆43 · Updated this week
- R backbone package - Extract the backbone from weighted and unweighted networks ☆40 · Updated 3 weeks ago
- A browser user interface for manual labeling of record pairs. ☆41 · Updated last year
- A data package for R containing historical datasets about gender ☆23 · Updated 2 years ago