Super Fast String Matching in Python
☆370 · Mar 14, 2025 · Updated last year
Alternatives and similar repositories for string_grouper
Users that are interested in string_grouper are comparing it to the libraries listed below.
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆421 · Mar 9, 2026 · Updated 2 weeks ago
- Fuzzy string matching, grouping, and evaluation. ☆792 · Jul 10, 2025 · Updated 8 months ago
- Python wrapper for a C++ Double Metaphone ☆15 · Jan 12, 2026 · Updated 2 months ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ☆86 · Oct 6, 2022 · Updated 3 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,048 · Feb 21, 2024 · Updated 2 years ago
- ☆43 · Apr 20, 2023 · Updated 2 years ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 · Aug 9, 2022 · Updated 3 years ago
- Rapid fuzzy string matching in Python using various string metrics ☆3,789 · Mar 11, 2026 · Updated 2 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 · Sep 2, 2024 · Updated last year
- Company Name Processor written in Python ☆351 · Jan 16, 2026 · Updated 2 months ago
- Generate reports for spaCy models. ☆29 · May 27, 2022 · Updated 3 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 2 years ago
- Sentence transformers models for SpaCy ☆108 · Mar 9, 2023 · Updated 3 years ago
- Match Patent Assignees with Compustat and SDC via Bing Search ☆54 · Sep 29, 2020 · Updated 5 years ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,448 · Jul 29, 2025 · Updated 7 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆150 · Oct 16, 2024 · Updated last year
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,013 · Mar 19, 2026 · Updated last week
- ☆13 · Sep 2, 2021 · Updated 4 years ago
- A Cython implementation of the affine gap string distance ☆57 · Jan 23, 2023 · Updated 3 years ago
- source{d} MLonCode foundation - core algorithms and models. ☆14 · Oct 17, 2019 · Updated 6 years ago
- Dash Component created from ukrbublik/react-awesome-query-builder ☆12 · Mar 16, 2026 · Updated last week
- Extra blocks for scikit-learn pipelines. ☆1,383 · Mar 12, 2026 · Updated 2 weeks ago
- Python package for performing Entity and Text Matching using Deep Learning. ☆615 · Jun 18, 2024 · Updated last year
- Media search's code ☆15 · Sep 15, 2018 · Updated 7 years ago
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. ☆3,525 · Apr 18, 2025 · Updated 11 months ago
- A Simple Bulk Labelling Tool ☆598 · Jul 29, 2025 · Updated 7 months ago
- Estimating Body Fat Using Computer Vision (openCV2, Python) ☆23 · Dec 18, 2014 · Updated 11 years ago
- Text preprocessing, representation and visualization from zero to hero. ☆2,909 · Aug 29, 2023 · Updated 2 years ago
- Concurrent (with OLC) Adaptive Radix Trie in Golang. ☆11 · Jul 31, 2020 · Updated 5 years ago
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,201 · Mar 10, 2026 · Updated 2 weeks ago
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆17 · Dec 20, 2021 · Updated 4 years ago
- just a bunch of useful embeddings for scikit-learn pipelines ☆523 · Feb 12, 2026 · Updated last month
- Simplifies use of the Dedupe library via Pandas ☆137 · Mar 30, 2023 · Updated 2 years ago
- skimpy is a light weight tool that provides summary statistics about variables in data frames within the console. ☆507 · Updated this week
- Example of configuring multipage apps via a custom config file ☆18 · Nov 14, 2023 · Updated 2 years ago
- A Python library for calculating a large variety of metrics from text ☆361 · Updated this week
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆85 · Feb 1, 2026 · Updated last month
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,208 · Feb 15, 2026 · Updated last month
- Implementation of Nested Named Entity Recognition using Flair ☆24 · Oct 29, 2021 · Updated 4 years ago