nytud / emtsv
e-magyar text processing system -- inter-module communication via tsv + REST API
☆29 Updated last week
Alternatives and similar repositories for emtsv:
Users who are interested in emtsv compare it to the libraries listed below.
- PurePos is an open source hybrid morphological tagger. ☆16 Updated 4 years ago
- The home repository of the NerKor corpus, a Hungarian gold standard named entity annotated corpus containing 1 million tokens. ☆15 Updated last year
- Tools for compiling corpora from Common Crawl ☆14 Updated 5 months ago
- A tool for automatic spelling normalization ☆20 Updated 4 years ago
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆68 Updated last year
- This is an open-source sentiment analysis tool for Hungarian language, written in Python. ☆11 Updated 8 years ago
- Python Finite-State Toolkit ☆54 Updated 2 months ago
- eXternally configurable REference and Non Named Entity Recognizer ☆17 Updated 10 months ago
- The curation repository for the data behind Concepticon. ☆38 Updated 2 months ago
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆18 Updated 11 months ago
- The NLG tool for Finnish ☆22 Updated last year
- Polish data. ☆11 Updated 5 months ago
- Featurize words into orthographic and phonological vectors. ☆40 Updated last year
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆112 Updated 3 months ago
- Python framework for processing Universal Dependencies data ☆56 Updated this week
- Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆49 Updated last month
- A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Sp… ☆29 Updated 3 years ago
- A tool for text normalisation via character-level machine translation ☆13 Updated 4 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- German Morphological Analyzer ☆47 Updated 3 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆154 Updated 5 months ago
- A Python library for easily querying morphological inflection models trained on Unimorph ☆13 Updated 2 years ago
- Treex NLP framework ☆32 Updated last week
- A simple configurable tool for manipulating dependency trees. ☆13 Updated 4 months ago
- linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX). ☆51 Updated 2 years ago
- This packages up data for the Open Multilingual Wordnet ☆48 Updated this week
- Hungarian tokenizer. ☆14 Updated 3 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 Updated 3 years ago
- A NoSketch Engine Docker image which is easy to use ☆19 Updated 5 months ago