bsolomon1124 / demoji
Accurately find/replace/remove emojis in text strings
☆160 Updated last year
Alternatives and similar repositories for demoji
Users who are interested in demoji are comparing it to the libraries listed below:
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆150 Updated last year
- A fully customisable language detection pipeline for spaCy ☆92 Updated 5 years ago
- ☆168 Updated 8 months ago
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- Lightning Fast Language Prediction 🚀 ☆165 Updated 5 years ago
- spaCy + UDPipe ☆160 Updated 2 years ago
- name2nat: a Python package for nationality prediction from a name ☆106 Updated 4 years ago
- A compound word splitter for Python ☆48 Updated 3 years ago
- A Python library for working with and comparing language codes. ☆342 Updated 2 months ago
- Pythonic search engine based on PyLucene. ☆125 Updated 3 months ago
- A spell-checker extending Peter Norvig's with multi-typo correction, Hamming distance weighting, and more. ☆98 Updated 4 years ago
- A small tool that EXPLains spACY parse results. See what I did there? ☆83 Updated 2 years ago
- Parse numbers written in natural language ☆109 Updated 3 months ago
- Cython wrapper on Hunspell Dictionary ☆66 Updated 7 months ago
- A Python truecasing utility that restores case information for texts ☆88 Updated 2 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆169 Updated 3 years ago
- Language independent truecaser in Python. ☆160 Updated 3 years ago
- Text to sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆238 Updated 2 years ago
- Parse natural language time expressions in Python ☆131 Updated 2 years ago
- Language detection using spaCy and fastText ☆55 Updated last year
- Fuzzy matching and more functionality for spaCy. ☆254 Updated 7 months ago
- Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group. ☆80 Updated 4 years ago
- Use ML-Annotate to label data for machine learning purposes ☆107 Updated 4 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆101 Updated 3 weeks ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆65 Updated last year
- Python Set subclass that supports searching by ngram similarity ☆119 Updated 3 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆201 Updated 2 years ago
- A fast and memory-optimized string library for heavy-text manipulation in Python ☆250 Updated 4 years ago
- Extract text from HTML ☆133 Updated 4 years ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆68 Updated 2 weeks ago