domanchi / gibberish-detector
Train a model, and detect gibberish strings with it.
★67 · Updated 3 years ago
Alternatives and similar repositories for gibberish-detector
Users who are interested in gibberish-detector are comparing it to the libraries listed below.
- A CPython extension for the Hyperscan regular expression matching library. ★188 · Updated last month
- Find strings/words in text; convenience and C speed ★127 · Updated 3 years ago
- Pythonic search engine based on PyLucene. ★131 · Updated this week
- 80x faster and 95% accurate language identification with Fasttext ★162 · Updated last year
- ★69 · Updated 3 years ago
- Nostril: Nonsense String Evaluator ★199 · Updated 3 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ★155 · Updated 2 years ago
- Check for multiple patterns in a single string at the same time: a fast Aho-Corasick algorithm for Python ★217 · Updated this week
- Simply, faster, sentence-transformers ★143 · Updated last year
- Fast and robust date extraction from web pages, with Python or on the command-line ★142 · Updated 3 weeks ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ★48 · Updated 6 years ago
- A research python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII) ★94 · Updated last month
- A fully customisable language detection pipeline for spaCy ★93 · Updated 6 years ago
- Python package for deduplication/entity resolution using active learning ★82 · Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ★77 · Updated 3 weeks ago
- A python based HTML to text conversion library, command line client and Web service. ★325 · Updated last week
- Clean, filter and sample URLs to optimize data collection - Python & command-line - Deduplication, spam, content and language filters ★150 · Updated 3 weeks ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ★125 · Updated last year
- Highlight text in documents ★109 · Updated 7 months ago
- A python package to simulate typographical errors. ★38 · Updated last year
- A Python module to convert natural language numerics into ints and floats. ★233 · Updated last year
- Python Simple Object Storage - provides a list and dictionary interface that seamlessly stores data in a file, like a simplified database… ★58 · Updated 2 years ago
- Confection: the sweetest config system for Python ★192 · Updated 3 weeks ago
- Fuzzy matching and more functionality for spaCy. ★259 · Updated last year
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ★83 · Updated 11 months ago
- A fast python implementation of the SimHash algorithm. ★27 · Updated 4 years ago
- Fine-tune OpenAI models for text classification, question answering, and more ★17 · Updated 2 years ago
- ★176 · Updated 8 months ago
- Use Hugging Face text and token classification pipelines directly in spaCy ★63 · Updated last year
- ★43 · Updated 2 years ago