Intsights / PySubstringSearch
Python library for fast substring/pattern search written in C++ leveraging Suffix Array Algorithm
☆41 Updated 3 months ago
Alternatives and similar repositories for PySubstringSearch
Users that are interested in PySubstringSearch are comparing it to the libraries listed below
- Python library for duplicate line removal written in C++ ☆33 Updated 3 months ago
- Intsights open-source wrappers library for some AWS resources and high level management objects for distributed backend systems ☆17 Updated last year
- Python library for fast fuzzy search over a big file written in Rust ☆45 Updated 3 months ago
- A blazingly fast domain extraction library written in Rust ☆66 Updated 3 months ago
- A Git Repository Secrets Scanner written in Rust ☆39 Updated 3 months ago
- Concatenated-word segmentation Python library written in Rust ☆17 Updated 3 months ago
- Queue server based on RocksDB as a KV-Store backend and gRPC as an interface ☆10 Updated last year
- Uncompromising and opinionated flake8 plugin which follows Intsights' practices ☆14 Updated 3 months ago
- Word frequency checker based on Wikipedia corpus written in Rust ☆10 Updated 3 months ago
- A Python based alternative to Elasticsearch Reindex API with multiprocessing support. ☆17 Updated 3 months ago
- A fast and easy adblockplus parser and matcher based on adblock-rust package ☆27 Updated 3 months ago
- Multi-Language Identification ☆28 Updated 10 months ago
- A Fast Levenshtein Distance Library for Python ☆83 Updated 3 months ago
- Bounded Process & Thread Pool Executor ☆64 Updated last year
- Run all the tests at the same time with modal.com ☆11 Updated last year
- Rust python bindings for symspell ☆19 Updated last year
- Convert JSON Schemas to simple, human-readable Markdown documentation. Repo archived in favor of fork: sbrunner/jsonschema2md2 ☆26 Updated last year
- Auto fix invalid JSON: automatically repairs and completes truncated or invalid JSON ☆55 Updated last year
- Python 3 library to store memory mappable objects into pickle-compatible files ☆38 Updated 6 years ago
- Python package for deduplication/entity resolution using active learning ☆80 Updated 9 months ago
- Scale your ML workers asynchronously across processes and machines ☆13 Updated 2 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆25 Updated 6 months ago
- Code and data used to build a training dataset for dragnet models ☆10 Updated 4 years ago
- A query tool on networkx ☆17 Updated 4 months ago
- A file utility for accessing both local and remote files through a unified interface. ☆42 Updated 3 weeks ago
- Fast fuzzy text search ☆11 Updated 2 years ago
- Python module and CLI for hashing of file system directories based on the Dirhash Standard. ☆55 Updated 10 months ago
- Language detection using Spacy and Fasttext ☆55 Updated last year
- Annotation Management for Prodigy, that support multiple users working in many projects ☆15 Updated 6 years ago
- Library for fast text representation and classification. ☆28 Updated last year