OlivierBinette / StringCompare
Efficient String Comparison Functions and Fuzzy String Matching
☆17 Updated 2 years ago
Alternatives and similar repositories for StringCompare:
Users interested in StringCompare are comparing it to the libraries listed below.
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆27 Updated last year
- A very simple library for exploiting graph-of-words in NLP ☆12 Updated 4 years ago
- Perform Bayesian record linkage with a one-to-one matching assumption. ☆11 Updated 4 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 Updated 3 years ago
- Similarity and distance measures for clustering and record linkage applications in R ☆18 Updated 3 years ago
- Blocking records for record linkage and data deduplication based on ANN algorithms in Python. ☆12 Updated 3 weeks ago
- A tutorial on entity resolution (record linkage or de-duplication) ☆62 Updated 4 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆144 Updated 6 months ago
- Fast, flexible name matching for large datasets ☆71 Updated last year
- Entity resolution using zero labeled examples ☆28 Updated 9 months ago
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆76 Updated 2 weeks ago
- Classify names by gender, U.S. ethnicity, or leaf nationality ☆19 Updated 6 years ago
- NLP tasks with zero- and few-shot models. ☆15 Updated last week
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o… ☆46 Updated 6 years ago
- Easy PDF to text to spaCy text extraction in Python. ☆39 Updated 6 months ago
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆71 Updated last month
- This repository contains CROW, the Clerical Resolution Online Widget, an open-source project designed to help data linkers with their cle… ☆10 Updated this week
- A Fuzzy Matching Approach for Clustering Strings ☆26 Updated 2 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated 2 years ago
- A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling). ☆15 Updated 3 years ago
- The FBAdLibrarian is a simple tool that can pull ad data and collects images offered by Facebook’s Ad Library API. ☆15 Updated 2 years ago
- INFO 5613 Network Science ☆22 Updated 3 years ago
- Blazing fast topic modelling for short texts. ☆31 Updated last week
- Train, evaluate, and use different unsupervised topic modelling algorithms using a RESTful API. ☆36 Updated last year
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite ☆91 Updated last year
- Java program for computing the intermediacy of the nodes in a graph for the specified source and target nodes. ☆11 Updated 5 years ago
- Implements an algorithm for Latent Dirichlet Allocation using style conventions from the [tidyverse](https://style.tidyverse.org/) and […☆41 Updated 3 months ago