mideind / Tokenizer
A tokenizer for Icelandic text
☆28 · Updated last week
Alternatives and similar repositories for Tokenizer
Users who are interested in Tokenizer are comparing it to the libraries listed below.
- Overview of Icelandic NLP resources at a glance ☆16 · Updated 11 months ago
- A fast, efficient natural language processing engine for Icelandic. ☆62 · Updated 8 months ago
- spaCy + UDPipe ☆161 · Updated 3 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- Language detection extension for spaCy 2.0+ ☆113 · Updated 6 years ago
- Hunspell extension for spaCy 2.0. ☆94 · Updated 10 months ago
- A lemmatizer for Icelandic text ☆16 · Updated 7 years ago
- Cython wrapper on Hunspell Dictionary ☆66 · Updated 11 months ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆113 · Updated last year
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆30 · Updated 5 years ago
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 · Updated 6 months ago
- NLTK Contrib ☆166 · Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆31 · Updated 3 months ago
- Text tokenization and sentence segmentation (segtok v2) ☆205 · Updated 3 years ago
- Language data store and linguistic query API ☆40 · Updated last week
- Language Models for Zalando's flair library ☆61 · Updated 5 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 · Updated this week
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 3 years ago
- Multilingual syllable annotation pipeline component for spacy ☆39 · Updated 2 years ago
- Sentence transformers models for SpaCy ☆107 · Updated 2 years ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆151 · Updated last year
- ☆64 · Updated 2 years ago
- Python framework for processing Universal Dependencies data ☆57 · Updated last week
- A simple toolkit for conducting analyses using corpus methods ☆25 · Updated 3 years ago
- ☆25 · Updated 5 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆249 · Updated 2 years ago
- Efficient Low-Memory Aligner ☆144 · Updated 4 months ago
- 📂 Additional lookup tables and data resources for spaCy ☆105 · Updated this week
- The greynir.is Icelandic natural language processing API and website. ☆65 · Updated 2 weeks ago