WojciechMula / pyahocorasick
Python module (C extension and plain Python) implementing the Aho-Corasick algorithm
☆952 Updated 8 months ago
Related projects
Alternatives and complementary repositories for pyahocorasick
- Pure python Aho-Corasick library. ☆212 Updated last year
- Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library. ☆1,047 Updated last month
- A Python Implementation of Simhash Algorithm ☆983 Updated 2 years ago
- Fast, efficiently stored Trie for Python. Uses libdatrie. ☆531 Updated 9 months ago
- A python binding for crfsuite ☆771 Updated last month
- High performance Trie and Ahocorasick automata (AC automata) Keyword Match & Replace Tool for python. Correct case insensitive implementa… ☆94 Updated last month
- A library implementing different string similarity and distance measures using Python. ☆992 Updated 2 years ago
- Python extension module for accelerating regular expressions using libesm ☆132 Updated last year
- Python library implementing a trie data structure. ☆816 Updated 3 years ago
- Simhash and near-duplicate detection ☆411 Updated last year
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆802 Updated this week
- Fast multi-keyword search engine for text strings ☆247 Updated 2 months ago
- ☆165 Updated 5 months ago
- sentence embedding by Smooth Inverse Frequency weighting scheme ☆1,083 Updated 5 years ago
- Python Keyphrase Extraction module ☆1,567 Updated last year
- An efficient simhash implementation for python ☆125 Updated 5 years ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆365 Updated last year
- Python wrapper for Stanford CoreNLP. ☆921 Updated 2 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,264 Updated 3 years ago
- Named Entity Recognition Tool ☆1,159 Updated 5 years ago
- MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW ☆2,586 Updated 5 months ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆382 Updated 2 years ago
- CRF++: Yet Another CRF toolkit ☆506 Updated 3 years ago
- Learning Named Entity Tagger from Domain-Specific Dictionary ☆483 Updated 5 years ago
- AutoPhrase: Automated Phrase Mining from Massive Text Corpora ☆1,175 Updated 2 years ago
- Tensorflow implementation of contextualized word representations from bi-directional language models ☆1,620 Updated last year
- Fast Python Bloom Filter using Mmap ☆130 Updated 6 months ago
- NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character … ☆1,890 Updated 2 years ago
- Pre-trained ELMo Representations for Many Languages ☆1,463 Updated 3 years ago
- scikit-learn inspired API for CRFsuite ☆426 Updated last year