Super Fast String Matching in Python
☆370 · Mar 14, 2025 · Updated last year
Alternatives and similar repositories for string_grouper
Users that are interested in string_grouper are comparing it to the libraries listed below.
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆423 · Apr 9, 2026 · Updated last month
- Fuzzy string matching, grouping, and evaluation. ☆796 · Jul 10, 2025 · Updated 10 months ago
- Python wrapper for a C++ Double Metaphone ☆15 · Jan 12, 2026 · Updated 4 months ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ☆86 · Oct 6, 2022 · Updated 3 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,049 · Feb 21, 2024 · Updated 2 years ago
- ☆43 · Apr 20, 2023 · Updated 3 years ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 · Aug 9, 2022 · Updated 3 years ago
- Rapid fuzzy string matching in Python using various string metrics ☆3,917 · May 11, 2026 · Updated 2 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆927 · Sep 2, 2024 · Updated last year
- Company Name Processor written in Python ☆356 · Jan 16, 2026 · Updated 4 months ago
- Generate reports for spaCy models. ☆29 · May 27, 2022 · Updated 4 years ago
- Sentence transformers models for SpaCy ☆108 · Mar 9, 2023 · Updated 3 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 2 years ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,468 · Jul 29, 2025 · Updated 10 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆151 · Oct 16, 2024 · Updated last year
- A Cython implementation of the affine gap string distance ☆57 · Jan 23, 2023 · Updated 3 years ago
- Dash Component created from ukrbublik/react-awesome-query-builder ☆13 · May 18, 2026 · Updated last week
- Extra blocks for scikit-learn pipelines. ☆1,396 · May 19, 2026 · Updated last week
- Python package for performing Entity and Text Matching using Deep Learning. ☆616 · Jun 18, 2024 · Updated last year
- Media search's code ☆14 · Sep 15, 2018 · Updated 7 years ago
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. ☆3,529 · Apr 18, 2025 · Updated last year
- A Simple Bulk Labelling Tool ☆597 · Jul 29, 2025 · Updated 10 months ago
- Estimating Body Fat Using Computer Vision (openCV2, Python) ☆22 · Dec 18, 2014 · Updated 11 years ago
- Text preprocessing, representation and visualization from zero to hero. ☆2,911 · Aug 29, 2023 · Updated 2 years ago
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,212 · Apr 7, 2026 · Updated last month
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆17 · Dec 20, 2021 · Updated 4 years ago
- just a bunch of useful embeddings for scikit-learn pipelines ☆526 · Feb 12, 2026 · Updated 3 months ago
- Simplifies use of the Dedupe library via Pandas ☆135 · Mar 30, 2023 · Updated 3 years ago
- Example of configuring multipage apps via a custom config file ☆18 · Nov 14, 2023 · Updated 2 years ago
- A Python library for calculating a large variety of metrics from text ☆364 · May 5, 2026 · Updated 3 weeks ago
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆85 · Feb 1, 2026 · Updated 3 months ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,213 · Apr 22, 2026 · Updated last month
- Simply, faster, sentence-transformers ☆144 · Aug 27, 2024 · Updated last year
- Implementation of Nested Named Entity Recognition using Flair ☆24 · Oct 29, 2021 · Updated 4 years ago
- A Python library for generating word tree diagrams ☆28 · Jul 10, 2020 · Updated 5 years ago
- This repository highlights the workflow and ease of use of training machine learning or deep learning models using Azure Databricks. Then… ☆32 · Feb 1, 2024 · Updated 2 years ago
- Group thousands of similar spreadsheet or database text entries in seconds ☆158 · Jun 12, 2023 · Updated 2 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆261 · Aug 21, 2025 · Updated 9 months ago
- Tools for speech processing, keyword spotting ☆16 · Mar 11, 2020 · Updated 6 years ago