ppke-nlpg / purepos
PurePos is an open-source hybrid morphological tagger.
☆16 · Updated 4 years ago
Alternatives and similar repositories for purepos:
Users who are interested in purepos compare it to the libraries listed below.
- Parser for KAF NAF files written in Python ☆16 · Updated 3 years ago
- A tool for automatic spelling normalization ☆20 · Updated 4 years ago
- spaCy-to-naf converter ☆21 · Updated 7 months ago
- WordNet Domains, WordNet Affect and SentiWords ☆49 · Updated 9 years ago
- e-magyar text processing system -- inter-module communication via tsv + REST API ☆28 · Updated last year
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆12 · Updated last year
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Updated 2 years ago
- Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆47 · Updated last month
- An open-source sentiment analysis tool for the Hungarian language, written in Python. ☆11 · Updated 8 years ago
- Multi Tier Annotation Search ☆26 · Updated 3 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 · Updated 7 years ago
- The home repository of the NerKor corpus, a Hungarian gold standard named entity annotated corpus containing 1 million tokens. ☆15 · Updated last year
- German Morphological Analyzer ☆47 · Updated 3 years ago
- Recognition Models for Kraken and CLSTM ☆13 · Updated 5 years ago
- eXternally configurable REference and Non Named Entity Recognizer ☆17 · Updated 7 months ago
- TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions ☆12 · Updated 7 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆60 · Updated 8 months ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆111 · Updated this week
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆64 · Updated 9 years ago
- Thot toolkit for statistical machine translation ☆50 · Updated 2 years ago
- Measure the similarity of text corpora for 74 languages ☆13 · Updated 11 months ago
- A Python module to process data for Frame Semantic Parsing ☆23 · Updated 4 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆66 · Updated last month
- Bilingual dictionary extractor from parallel corpora ☆22 · Updated 10 years ago
- The Community-enRiched Open WordNet (CROWN) ☆19 · Updated 9 years ago
- Linguistic converter / merging tool for multi-level annotated corpora. Graph-based (using Python and NetworkX). ☆50 · Updated last year
- A character-wise tokenizer for morphologically rich languages ☆27 · Updated last month
- Bilingual sentence similarity classifier using TensorFlow ☆19 · Updated 5 years ago
- A simple configurable tool for manipulating dependency trees. ☆13 · Updated 3 weeks ago