morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆39 Updated 7 years ago
Alternatives and similar repositories for polimorfologik:
Users who are interested in polimorfologik are comparing it to the libraries listed below:
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary. ☆188 Updated last year
- Unitex/GramLab C++ Core ☆22 Updated last year
- Unitex/GramLab Language Resources ☆20 Updated 2 years ago
- NER tagger for English, Spanish, Dutch, Italian, German, and French. ☆35 Updated 9 years ago
- A language detection Web Service ☆52 Updated 7 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary ☆11 Updated last year
- German part-of-speech dictionary ☆43 Updated last year
- Python port of Stempel, an algorithmic stemmer for the Polish language. ☆35 Updated 4 months ago
- Polish morphological tagger. ☆42 Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆66 Updated last month
- Program used to split text into segments ☆25 Updated 2 months ago
- Python lemmatizer for Polish. ☆18 Updated 5 years ago
- e-magyar text processing system -- inter-module communication via tsv + REST API ☆28 Updated last year
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 Updated 6 years ago
- eXtensible Interlinear Glossed Text ☆32 Updated 2 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 Updated last year
- Wrapper for DKPro Core to extract linguistic information from books. ☆16 Updated 2 years ago
- Thot toolkit for statistical machine translation ☆50 Updated 2 years ago
- The Zurich Dependency Parser for German ☆82 Updated 2 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆71 Updated 9 months ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆56 Updated 7 years ago
- WordNet-LMF formats ☆21 Updated last month
- A very simple Python stemmer for the Polish language based on Porter's algorithm ☆20 Updated 7 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- Command-line corpus tools ☆9 Updated 7 years ago
- Basic dataset for the linguistic data collection. ☆15 Updated 7 years ago
- Named Entity Recognition data for Europeana Newspapers ☆171 Updated last year
- NameTag: Named Entity Tagger ☆38 Updated 4 months ago
- KEA - Keyphrase Extraction Algorithm ☆21 Updated 8 years ago