aalto-speech / morfessor
Morfessor is a tool for unsupervised and semi-supervised morphological segmentation
☆191Updated 4 years ago
Alternatives and similar repositories for morfessor:
Users who are interested in morfessor are comparing it to the libraries listed below.
- Various utilities for processing the data.☆208Updated this week
- Corpus preprocessing☆95Updated last year
- Efficient Low-Memory Aligner☆142Updated 2 months ago
- Automatic extraction of edited sentences from text edition histories.☆82Updated 3 years ago
- English data☆206Updated this week
- Fast supervised sentence boundary detection using the averaged perceptron☆90Updated 6 years ago
- Appraise evaluation system for manual evaluation of machine translation output☆74Updated 3 years ago
- Tool for comparison and evaluation of machine translation.☆56Updated 2 years ago
- Python framework for processing Universal Dependencies data☆55Updated this week
- Efficient Markov Chain word alignment☆53Updated 3 years ago
- Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies☆70Updated 6 years ago
- PredPatt: Predicate-Argument Extraction from Universal Dependencies☆111Updated 4 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology…☆222Updated 2 years ago
- A word alignment tool based on the famous GIZA++, extended to support multi-threading, resumed training and incremental training.☆161Updated 3 years ago
- SemCor and MASC documents annotated with NOAD word senses.☆183Updated 5 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆155Updated 9 months ago
- Democratizing NLP!☆104Updated last year
- ☆47Updated 8 months ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities☆114Updated 2 years ago
- Bitextor generates translation memories from multilingual websites☆292Updated 4 months ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.☆314Updated last month
- C++ implementation of Generalised Brown clustering and python scripts for feature generation☆41Updated 8 years ago
- TED parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora and monolingual corpora extracted…☆248Updated 9 years ago
- A minimal, pure Python library to interface with CoNLL-U format files.☆149Updated last year
- Sentence aligner☆112Updated 3 years ago
- Automatically exported from code.google.com/p/universal-pos-tags☆129Updated 2 years ago
- eXtensible Neural Machine Translation☆185Updated 5 years ago
- The Open Multilingual Wordnet☆61Updated 10 months ago
- General-Purpose Neural Networks for Sentence Boundary Detection☆72Updated 2 years ago
- ☆42Updated 6 years ago