casics / nostril
Nostril: Nonsense String Evaluator
☆193 Updated 2 years ago
Alternatives and similar repositories for nostril:
Users who are interested in nostril are comparing it to the libraries listed below.
- A small program to detect gibberish using a Markov Chain ☆603 Updated 11 months ago
- Python wrapper for ssdeep fuzzy hashing library ☆150 Updated 3 years ago
- Stringlifier is an open-source ML library for detecting random strings in raw text. It can be used in sanitising logs, detecting accidenta… ☆167 Updated 10 months ago
- Find strings/words in text; convenience and C speed ☆126 Updated 2 years ago
- Simple heuristic for measuring web page similarity (& data set) ☆90 Updated 6 years ago
- Parse natural language time expressions in python ☆131 Updated 2 years ago
- 🐍 A CPython extension for the Hyperscan regular expression matching library. ☆167 Updated this week
- URLExtract is a Python class for collecting (extracting) URLs from given text based on locating TLD. ☆253 Updated 11 months ago
- Compare html similarity using structural and style metrics ☆209 Updated last year
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆371 Updated 2 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆149 Updated last year
- 📂 Additional lookup tables and data resources for spaCy ☆100 Updated 3 weeks ago
- Tokenizer for raw mails ☆378 Updated this week
- Lightning Fast Language Prediction 🚀 ☆165 Updated 5 years ago
- Fuzzy matching and more functionality for spaCy. ☆254 Updated 7 months ago
- Python wrapper for RE2 ☆100 Updated 5 months ago
- Fixes contractions such as `you're` to `you are` ☆315 Updated 2 years ago
- Abydos NLP/IR library for Python ☆184 Updated 2 years ago
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. ☆75 Updated 3 years ago
- Code for the paper URLNet - Learning a URL Representation with Deep Learning for Malicious URL Detection ☆156 Updated 4 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 Updated last year
- Character-based word embeddings model based on RNN for handling real world texts ☆173 Updated last year
- Accurately find/replace/remove emojis in text strings ☆160 Updated last year
- Language detection extension for spaCy 2.0+ ☆112 Updated 6 years ago
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆817 Updated 2 weeks ago
- (Official repo for pypi package) Python bindings for the Hunspell spellchecker engine ☆186 Updated 4 years ago
- Python port of Boilerpipe library ☆86 Updated 5 months ago
- 📝 Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible wi… ☆62 Updated last year
- Extracts the top level domain (TLD) from the URL given. ☆181 Updated last year
- Locality-sensitive hashing algorithm for text similarity comparisons ☆58 Updated 3 years ago