bsolomon1124 / demoji
Accurately find/replace/remove emojis in text strings
☆153 · Updated 9 months ago
Related projects:
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆148 · Updated last year
- ☆159 · Updated 3 months ago
- Lightning Fast Language Prediction 🚀 ☆163 · Updated 5 years ago
- A fully customisable language detection pipeline for spaCy ☆93 · Updated 5 years ago
- A Python library for working with and comparing language codes. ☆339 · Updated 5 months ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆64 · Updated last year
- A Python module for word inflections designed for use with spaCy. ☆90 · Updated 4 years ago
- Pythonic search engine based on PyLucene. ☆119 · Updated 2 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆66 · Updated 2 weeks ago
- A Python module to convert natural language numerics into ints and floats. ☆211 · Updated last year
- ☆46 · Updated this week
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 2 years ago
- A Python true casing utility that restores case information for texts ☆88 · Updated last year
- Language detection using spaCy and fastText ☆53 · Updated 9 months ago
- Python library to simplify working with jsonlines and ndjson data ☆264 · Updated last month
- Find strings/words in text; convenience and C speed ☆125 · Updated 2 years ago
- Python port of Boilerpipe library ☆81 · Updated last month
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 · Updated 8 months ago
- A spaCy wrapper for DBpedia Spotlight ☆103 · Updated last year
- Textpipe: clean and extract metadata from text ☆300 · Updated 3 years ago
- URLExtract is a Python class for collecting (extracting) URLs from a given text by locating TLDs. ☆241 · Updated 6 months ago
- Sentence transformers models for spaCy ☆104 · Updated last year
- Fast and robust date extraction from web pages, with Python or on the command-line ☆118 · Updated 2 weeks ago
- Python 3 library for reading and writing WARC files ☆21 · Updated 6 years ago
- A compound word splitter for Python ☆48 · Updated 3 years ago
- Fuzzy matching and more functionality for spaCy. ☆249 · Updated 2 months ago
- Parse numbers written in natural language ☆104 · Updated this week
- A Python implementation of Lunr.js 🌖 ☆188 · Updated last week
- Extract dates from text ☆64 · Updated 3 years ago
- Use ML-Annotate to label data for machine learning purposes ☆104 · Updated 4 years ago