Find parts of long text or data, allowing for some changes/typos.
☆339 · Nov 11, 2025 · Updated 3 months ago
Alternatives and similar repositories for fuzzysearch
Users who are interested in fuzzysearch compare it to the libraries listed below.
- Fuzzy String Matching in Python ☆9,264 · Feb 24, 2023 · Updated 3 years ago
- [experiment] CRF-based disambiguation engine for pymorphy2 ☆10 · May 9, 2016 · Updated 9 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆157 · May 24, 2024 · Updated last year
- A pure Python implementation of Aho-Corasick algorithm. ☆23 · Jul 10, 2018 · Updated 7 years ago
- Rapid fuzzy string matching in Python using various string metrics ☆3,751 · Mar 2, 2026 · Updated last week
- ☆18 · Jun 12, 2023 · Updated 2 years ago
- Fuzzy String Matching in Python ☆3,581 · Mar 3, 2025 · Updated last year
- AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging (ACL 2019 Demo) ☆137 · Jan 5, 2023 · Updated 3 years ago
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆1,089 · Dec 17, 2025 · Updated 2 months ago
- Fuzzy string matching, grouping, and evaluation. ☆791 · Jul 10, 2025 · Updated 8 months ago
- This repository contains materials for the Open Legal Data Forum at the Legal Hacker 2019 (September 2019 + Brooklyn, NYC) ☆17 · Dec 8, 2022 · Updated 3 years ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 · Sep 2, 2024 · Updated last year
- A multi-language segmenter using high-order CRF. ☆17 · Feb 27, 2020 · Updated 6 years ago
- A simple and fast rule-based sentence segmentation. Tested on OpenCorpora and SynTagRus datasets. ☆52 · Jul 4, 2018 · Updated 7 years ago
- Code Snippets & DataSets for Business Analytics & Data Mining/ Machine Learning Algorithms ☆15 · Apr 23, 2018 · Updated 7 years ago
- Streamlit apps on Cloud Run with Identity-Aware Proxy (IAP). ☆10 · Mar 5, 2022 · Updated 4 years ago
- A machine learning tool for fishing entities ☆270 · Feb 27, 2026 · Updated last week
- Extract Keywords from sentence or Replace keywords in sentences. ☆5,708 · Apr 13, 2025 · Updated 10 months ago
- ☆19 · Dec 19, 2018 · Updated 7 years ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,210 · Feb 15, 2026 · Updated 3 weeks ago
- Repository for experiments with MetaProd2Vec and related algorithms. ☆59 · Mar 16, 2019 · Updated 6 years ago
- Tool for parsing and converting various span encoding schemes. ☆23 · Jan 13, 2024 · Updated 2 years ago
- DeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention mod… ☆41 · May 25, 2020 · Updated 5 years ago
- 🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆905 · Aug 20, 2024 · Updated last year
- jiant is an nlp toolkit ☆1,674 · Jul 6, 2023 · Updated 2 years ago
- Segment documents into coherent parts using word embeddings. ☆149 · Mar 6, 2022 · Updated 4 years ago
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,193 · Mar 3, 2026 · Updated last week
- Recent trends of Entity Linking, Disambiguation, and Representation. ☆346 · Jun 26, 2021 · Updated 4 years ago
- Time Series Anomaly Detection Toolkit ☆23 · Mar 8, 2019 · Updated 7 years ago
- a tor socks proxy docker image ☆12 · Feb 10, 2026 · Updated last month
- Language Independent Test Format ☆11 · Oct 9, 2021 · Updated 4 years ago
- LaTeX to .pdf with Travis-CI ☆12 · Feb 23, 2018 · Updated 8 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Apr 30, 2025 · Updated 10 months ago
- API client for fetching and comparing passages from legislation ☆14 · Jan 26, 2025 · Updated last year
- CRUD Word documents with Python ☆13 · Feb 5, 2026 · Updated last month
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot… ☆11 · Jan 26, 2021 · Updated 5 years ago
- Building or integrating an LLM wrapper shouldn't take more than 10 minutes. ☆13 · Feb 1, 2025 · Updated last year
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆13 · Jan 2, 2021 · Updated 5 years ago
- A file-backed dictionary for Python ☆12 · Aug 15, 2022 · Updated 3 years ago