morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆40 Updated 7 years ago
Alternatives and similar repositories for polimorfologik
Users who are interested in polimorfologik are comparing it to the libraries listed below.
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary. ☆192 Updated last year
- Morfologik Polish Lemmatizer plugin for Elasticsearch ☆89 Updated last month
- Polish morphological tagger. ☆43 Updated 2 years ago
- Python port of Stempel, an algorithmic stemmer for the Polish language. ☆38 Updated 9 months ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 Updated 8 years ago
- Small Java library for splitting German compound words ☆63 Updated last year
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 Updated last year
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 Updated 6 years ago
- Stemmer for German ☆45 Updated 3 years ago
- Elasticsearch lemmatizer for 15 languages ☆106 Updated 6 months ago
- Java Wiktionary Library ☆57 Updated 2 years ago
- Program used to split text into segments ☆26 Updated 7 months ago
- MorphoDiTa: Morphologic Dictionary and Tagger ☆73 Updated last year
- NameTag: Named Entity Tagger ☆38 Updated 9 months ago
- Official releases of the PROIEL treebank of ancient Indo-European languages ☆36 Updated 2 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary ☆11 Updated last year
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆72 Updated last year
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆127 Updated 5 months ago
- NER tagger for English, Spanish, Dutch, Italian, German and French. ☆35 Updated 9 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆68 Updated 4 months ago
- ☆18 Updated 9 years ago
- e-magyar text processing system -- inter-module communication via tsv + REST API ☆29 Updated last month
- WordNet RDF export ☆24 Updated 7 years ago
- Automatically exported from code.google.com/p/foma ☆122 Updated 3 months ago
- Helsinki Finite-State Technology (library and application suite) ☆130 Updated last week
- A fast and comprehensive Java library capable of performing automaton and non-automaton based Levenshtein distance determination and neig… ☆42 Updated 12 years ago
- Python lemmatizer for Polish. ☆18 Updated 5 years ago
- Various utilities regarding Levenshtein transducers. (Java) ☆57 Updated 3 years ago
- German part-of-speech dictionary ☆45 Updated last year