morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆39 Updated 6 years ago
Related projects:
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary. ☆187 Updated last year
- Python port of Stempel, an algorithmic stemmer for Polish language. ☆34 Updated 3 weeks ago
- Polish morphological tagger. ☆43 Updated last year
- ☆18 Updated 9 years ago
- Morfologik Polish Lemmatizer plugin for Elasticsearch ☆83 Updated last week
- Python lemmatizer for Polish. ☆18 Updated 4 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆80 Updated 6 years ago
- Stemmer for German ☆45 Updated 2 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆70 Updated 5 months ago
- small Java library for splitting German compound words ☆62 Updated 4 months ago
- German part-of-speech dictionary ☆42 Updated last year
- HerBERT is a BERT-based Language Model trained on Polish Corpora using only MLM objective with dynamic masking of whole words. ☆64 Updated 2 years ago
- NER tagger for English, Spanish, Dutch, Italian, German, and French. ☆35 Updated 8 years ago
- Test data for snowball stemming algorithms ☆29 Updated 2 weeks ago
- Program used to split text into segments ☆25 Updated last year
- MorphoDiTa: Morphologic Dictionary and Tagger ☆69 Updated 10 months ago
- A language detection Web Service ☆52 Updated 7 years ago
- 💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython ☆11 Updated 4 years ago
- Polish data. ☆11 Updated 4 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆60 Updated 4 months ago
- [obsolete] Python interface to Morfeusz ☆10 Updated 7 years ago
- A simple proof of concept levenshtein automaton in Python ☆107 Updated 8 years ago
- A fast and comprehensive Java library capable of performing automaton and non-automaton based Levenshtein distance determination and neig… ☆41 Updated 11 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆85 Updated 8 months ago
- eXtensible Interlinear Glossed Text ☆31 Updated 2 years ago
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format. ☆124 Updated 6 months ago
- Deutsch Language Tool Kit ☆12 Updated 9 years ago
- Resources for doing NLP in Polish ☆44 Updated 4 years ago
- German lemmatization with IWNLP as extension for spaCy ☆23 Updated last year
- SimpleNLG-EnFr 1.1 is a bilingual English/French adaption of SimpleNLG v4.2 ☆25 Updated 6 years ago