keredson / wordninja
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.
☆849 · Updated 2 years ago
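To make the description above concrete, here is a minimal sketch of unigram-based splitting with dynamic programming. It is an illustration under stated assumptions, not wordninja's actual code: the tiny `TOY_WORDS` list stands in for a frequency-ranked Wikipedia vocabulary, the Zipf-style cost formula and the `UNKNOWN_CHAR_COST` penalty are choices made for this example, and `split_words` is a hypothetical helper name.

```python
import math

# Hypothetical toy vocabulary, ordered from most to least frequent.
# A real system would load a large frequency-ranked word list instead.
TOY_WORDS = ["the", "is", "a", "this", "test", "word", "split", "ninja"]

# Assumed Zipf-style cost: the word at rank r costs log((r + 1) * log(N)),
# so frequent words are cheap and rare words are expensive.
WORD_COST = {
    word: math.log((rank + 1) * math.log(len(TOY_WORDS)))
    for rank, word in enumerate(TOY_WORDS)
}
MAX_WORD_LEN = max(len(word) for word in TOY_WORDS)

# Assumed penalty for emitting a single character that is not in the vocabulary.
UNKNOWN_CHAR_COST = 20.0


def split_words(text):
    """Return the lowest-cost segmentation of `text` under the toy unigram model."""
    text = text.lower()
    n = len(text)
    best_cost = [0.0] + [math.inf] * n  # best_cost[i]: cheapest split of text[:i]
    back = [0] * (n + 1)                # back[i]: start index of the last piece

    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            piece = text[start:end]
            if piece in WORD_COST:
                piece_cost = WORD_COST[piece]
            elif len(piece) == 1:
                piece_cost = UNKNOWN_CHAR_COST  # fallback keeps every prefix reachable
            else:
                continue
            cost = best_cost[start] + piece_cost
            if cost < best_cost[end]:
                best_cost[end] = cost
                back[end] = start

    # Walk the back-pointers from the end to recover the pieces in order.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(split_words("thisisatest"))  # ['this', 'is', 'a', 'test']
```

With a realistic vocabulary the same recurrence applies unchanged; only the word list and the cost table grow.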
Alternatives and similar repositories for wordninja:
Users who are interested in wordninja are comparing it to the libraries listed below.
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆824 · Updated 3 weeks ago
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆668 · Updated last year
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆740 · Updated last month
- Fuzzy string matching, grouping, and evaluation. ☆758 · Updated 2 months ago
- Compute Sentence Embeddings Fast! ☆622 · Updated 2 years ago
- Port of Google's language-detection library to Python. ☆1,787 · Updated last month
- PYthon Automated Term Extraction ☆311 · Updated 2 years ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆413 · Updated 2 months ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆731 · Updated 8 months ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆692 · Updated last year
- Stand-alone language identification system ☆2,375 · Updated 5 years ago
- NLP, before and after spaCy ☆2,224 · Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆151 · Updated last year
- Python Keyphrase Extraction module ☆1,581 · Updated last year
- Textpipe: clean and extract metadata from text ☆301 · Updated 3 years ago
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆846 · Updated 8 months ago
- Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK. ☆1,067 · Updated 2 years ago
- Fixes contractions such as `you're` to `you are` ☆317 · Updated 2 years ago
- spellchecking library for python ☆609 · Updated 10 months ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆268 · Updated last year
- TextRank implementation for Python 3. ☆1,255 · Updated 2 years ago
- A Python framework for sequence labeling evaluation (named-entity recognition, pos tagging, etc...) ☆1,127 · Updated 7 months ago
- Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing ☆558 · Updated 5 months ago
- Toolkit to help understand "what lies" in word embeddings. Also benchmarking! ☆472 · Updated 2 years ago
- A python binding for crfsuite ☆772 · Updated 6 months ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,381 · Updated 2 months ago
- Single-document unsupervised keyword extraction ☆1,709 · Updated last month
- A sentence segmenter that actually works! ☆305 · Updated 4 years ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆374 · Updated 2 years ago
- Implementation of the paper: Text Segmentation as a Supervised Learning Task ☆261 · Updated 5 years ago