voikko / corevoikko
Libvoikko and essential linguistic resources
☆93 Updated this week
Related projects
Alternatives and complementary repositories for corevoikko
- Open morphology for Finnish ☆84 Updated last month
- Read-only mirror of https://framagit.org/tuxor1337/dictmaster. Pull requests and issues on GitHub cannot be accepted and will be automat… ☆32 Updated last year
- enchant spellchecking library ☆347 Updated last month
- Scripts for preprocessing morfologik data. ☆39 Updated 6 years ago
- SpedeScript programming language ☆29 Updated 6 years ago
- ☆243 Updated 2 months ago
- Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefi… ☆35 Updated 3 months ago
- A set of tools to build, maintain and use translation memories ☆30 Updated this week
- unihandecode is a transliteration library that converts all Unicode characters/words into the ASCII alphabet and is aware of language prefere… ☆69 Updated 2 years ago
- A simple converter from OpenDocument Text to plain text ☆84 Updated 5 years ago
- A Python library to parse MediaWiki WikiText ☆290 Updated last month
- HS publishes Finland's coronavirus infections as open data. ☆100 Updated 2 years ago
- German part-of-speech dictionary ☆43 Updated last year
- Helsinki Finite-State Technology (library and application suite) ☆123 Updated this week
- The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation: ☆49 Updated 6 years ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆44 Updated 3 weeks ago
- eXtensible Interlinear Glossed Text ☆31 Updated 2 years ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 Updated 6 months ago
- ☆6 Updated 5 years ago
- This repo provides a Python module to work with Open Dutch WordNet. It was created using Python 3.4. ☆64 Updated 3 years ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 Updated 2 years ago
- A character-wise tokenizer for morphologically rich languages ☆27 Updated 5 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 3 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 Updated 3 months ago
- Convert text so that British spellings are swapped with their Americanized form or vice versa. ☆30 Updated 2 years ago
- Unofficial Python library for using the Polish Wordnet (plWordNet / Słowosieć) ☆20 Updated last year
- Open source tools for Estonian natural language processing ☆114 Updated last week
- A Python library for working with and comparing language codes. ☆339 Updated 7 months ago
- A versioned Python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆62 Updated 2 months ago
- Pure Python spell-checker, (almost) full port of Hunspell ☆284 Updated 7 months ago