mideind / Tokenizer
A tokenizer for Icelandic text.
☆29 Updated this week
Alternatives and similar repositories for Tokenizer
Users who are interested in Tokenizer are comparing it to the libraries listed below.
- Cython wrapper on Hunspell Dictionary ☆66 Updated last year
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆111 Updated 6 months ago
- Language independent truecaser in Python. ☆159 Updated 4 years ago
- Overview of Icelandic NLP resources at a glance ☆16 Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆150 Updated last year
- Text tokenization and sentence segmentation (segtok v2) ☆208 Updated 3 years ago
- Hy-phen-ation made easy ☆217 Updated 9 months ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆255 Updated 3 years ago
- 🤘 Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆79 Updated 4 years ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆115 Updated last year
- A Python 3 phonetics library. ☆136 Updated 5 years ago
- This packages up data for the Open Multilingual Wordnet ☆59 Updated 6 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆418 Updated 10 months ago
- A lemmatizer for Icelandic text ☆17 Updated 7 years ago
- Open morphology for Finnish