luozhouyang / python-string-similarity
A library implementing different string similarity and distance measures using Python.
☆1,020 · Updated 3 years ago
Alternatives and similar repositories for python-string-similarity
Users who are interested in python-string-similarity are comparing it to the libraries listed below.
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆1,076 · Updated last month
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,277 · Updated 4 years ago
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆856 · Updated last month
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆769 · Updated last month
- Pure python Aho-Corasick library. ☆220 · Updated last week
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. ☆862 · Updated 2 years ago
- Python Keyphrase Extraction module ☆1,586 · Updated 2 years ago
- Compute Sentence Embeddings Fast! ☆624 · Updated 2 years ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆378 · Updated 3 years ago
- TextRank implementation for Python 3. ☆1,268 · Updated 2 years ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆272 · Updated 2 years ago
- Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK. ☆1,077 · Updated 3 years ago
- A Python Implementation of Simhash Algorithm ☆1,032 · Updated 3 years ago
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. ☆3,510 · Updated 9 months ago
- ☆177 · Updated 9 months ago
- 🧹 Python package for text cleaning ☆1,001 · Updated 2 years ago
- A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.) ☆1,173 · Updated last year
- GSDMM: Short text clustering ☆357 · Updated 3 years ago
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆897 · Updated last year
- AutoPhrase: Automated Phrase Mining from Massive Text Corpora ☆1,200 · Updated 3 years ago
- A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang… ☆1,557 · Updated 7 months ago
- Single-document unsupervised keyword extraction ☆1,812 · Updated last month
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,184 · Updated last month
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,205 · Updated last week
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆292 · Updated 2 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 · Updated 2 years ago
- Fuzzy string matching, grouping, and evaluation. ☆787 · Updated 6 months ago
- Various Algorithms for Short Text Mining ☆472 · Updated last week
- Stanford Open Information Extraction made simple! ☆678 · Updated 2 years ago
- Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English) ☆345 · Updated 3 years ago