amir-zeldes / RFTokenizer
A character-wise tokenizer for morphologically rich languages
☆27 Updated 2 weeks ago
Alternatives and similar repositories for RFTokenizer:
Users interested in RFTokenizer are comparing it to the libraries listed below.
- An NLP pipeline for Hebrew ☆37 Updated 2 weeks ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. ☆23 Updated last year
- Python framework for processing Universal Dependencies data ☆55 Updated this week
- German Morphological Analyzer ☆47 Updated 3 years ago
- Runnable morphological analysis tools from the UniMorph project ☆15 Updated 6 years ago
- ☆64 Updated 10 months ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 Updated 2 years ago
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆56 Updated last month
- Python Finite-State Toolkit ☆53 Updated 3 weeks ago
- A tool for automatic spelling normalization ☆20 Updated 4 years ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆17 Updated 9 months ago
- A simple configurable tool for manipulating dependency trees. ☆13 Updated 2 months ago
- Poetry Corpora Annotated on Aesthetic Emotions ☆11 Updated 2 years ago
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with… ☆75 Updated last month
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Arabic NER system with a strong performance ☆35 Updated 5 years ago
- The NLG tool for Finnish ☆22 Updated last year
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆191 Updated 4 years ago
- Compiled tools, datasets, and other resources for historical text normalization. ☆18 Updated 5 years ago
- Repository for the Georgetown University Multilayer Corpus (GUM) ☆93 Updated this week
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆41 Updated last year
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆25 Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆139 Updated 3 months ago
- Scripts for compatibilitising between VISL-CG3, Apertium, CoNLL-X and Universal Dependencies ☆15 Updated 5 years ago
- Annotation tool for coreference ☆31 Updated last year
- Python API to access glottolog/glottolog ☆29 Updated 4 months ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆112 Updated last month
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆30 Updated 2 weeks ago
- A multilingual parallel corpus created from translations of the Bible. ☆178 Updated 6 months ago