WojciechMula / pyahocorasick
Python module (C extension and plain Python) implementing the Aho-Corasick algorithm
☆967 Updated 9 months ago
Alternatives and similar repositories for pyahocorasick:
Users who are interested in pyahocorasick are comparing it to the libraries listed below:
- Pure Python Aho-Corasick library. ☆214 Updated 2 years ago
- A Python implementation of the Simhash algorithm ☆995 Updated 2 years ago
- Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library. ☆1,050 Updated 3 months ago
- A Python binding for crfsuite ☆771 Updated 3 months ago
- Fast, efficiently stored Trie for Python. Uses libdatrie. ☆530 Updated 11 months ago
- Python library implementing a trie data structure. ☆818 Updated 3 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,267 Updated 3 years ago
- Python extension module for accelerating regular expressions using libesm ☆132 Updated last year
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆367 Updated 2 years ago
- Fast implementation of the edit distance (Levenshtein distance) ☆667 Updated 11 months ago
- A library implementing different string similarity and distance measures using Python. ☆998 Updated 2 years ago
- MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW ☆2,635 Updated 7 months ago
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆810 Updated 3 weeks ago
- Fast multi-keyword search engine for text strings ☆250 Updated 4 months ago
- Named Entity Recognition Tool ☆1,161 Updated 5 years ago
- Simhash and near-duplicate detection ☆413 Updated last year
- Spellchecking library for Python ☆603 Updated 6 months ago
- Python Keyphrase Extraction module ☆1,571 Updated last year
- AutoPhrase: Automated Phrase Mining from Massive Text Corpora ☆1,173 Updated 2 years ago
- ☆167 Updated 7 months ago
- Sentence embedding by Smooth Inverse Frequency weighting scheme ☆1,086 Updated 5 years ago
- Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results. ☆1,704 Updated last year
- Check for multiple patterns in a single string at the same time: a fast Aho-Corasick algorithm for Python ☆167 Updated last month
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆385 Updated 2 years ago
- Python Non-cryptographic Hash Library ☆283 Updated last year
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆300 Updated 7 months ago
- CRFsuite: a fast implementation of Conditional Random Fields (CRFs) ☆650 Updated 6 months ago
- 🪼 A Python library for doing approximate and phonetic matching of strings. ☆2,083 Updated 2 weeks ago
- Multilingual text (NLP) processing toolkit ☆2,317 Updated last year
- Python wrapper for Stanford CoreNLP. ☆924 Updated 3 years ago