nytud / emtsv
e-magyar text processing system -- inter-module communication via tsv + REST API
☆29 Updated 3 months ago
Alternatives and similar repositories for emtsv
Users who are interested in emtsv are comparing it to the libraries listed below.
- A curated list of NLP resources for Hungarian ☆250 Updated last week
- HuSpaCy: industrial-strength Hungarian natural language processing ☆169 Updated 8 months ago
- All languages stopwords collection ☆451 Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆167 Updated last month
- ✔️Contextual word checker for better suggestions (not actively maintained) ☆415 Updated 5 months ago
- A modern, interlingual wordnet interface for Python ☆255 Updated 2 weeks ago
- Open German WordNet ☆96 Updated last year
- Faster, modernized fork of the language identification tool langid.py ☆56 Updated 8 months ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆249 Updated 2 years ago
- The home repository of the NerKor corpus, a Hungarian gold standard named entity annotated corpus containing 1 million tokens. ☆15 Updated last year
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆190 Updated last year
- ☆18 Updated last month
- UIMA CAS processing library written in Python ☆90 Updated last month
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆260 Updated 10 months ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆113 Updated last year
- 🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the… ☆247 Updated 2 years ago
- German Morphological Analyzer ☆47 Updated 3 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆105 Updated 2 months ago
- A character-wise tokenizer for morphologically rich languages ☆27 Updated 4 months ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴) ☆15 Updated 4 years ago
- 🔢 Work with static vector models ☆28 Updated 3 months ago
- This packages up data for the Open Multilingual Wordnet ☆50 Updated last month
- A tokenizer and sentence splitter for German and English web and social media texts. ☆147 Updated 7 months ago
- now you can even use apertium from python ☆33 Updated last year
- NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser ☆50 Updated last month
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆490 Updated 8 months ago
- A multilingual parallel corpus created from translations of the Bible. ☆182 Updated 2 months ago
- German language support for TextBlob. ☆103 Updated 6 months ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆31 Updated 4 months ago
- Text tokenization and sentence segmentation (segtok v2) ☆205 Updated 3 years ago