morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆39Updated 7 years ago
Alternatives and similar repositories for polimorfologik:
Users who are interested in polimorfologik are comparing it to the libraries listed below.
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.☆189Updated last year
- Polish morphological tagger.☆43Updated last year
- A language detection Web Service☆53Updated 7 years ago
- Python port of Stempel, an algorithmic stemmer for Polish language.☆36Updated 7 months ago
- Lightning-fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts☆81Updated 6 years ago
- This is a Fact based Question Answering System using Apache Solr as backend search engine, Wikipedia dumps as information source, Apache …☆26Updated 2 years ago
- Simple Hungarian Sentence Analysis with NLTK☆16Updated 4 years ago
- German part-of-speech dictionary☆43Updated last year
- GermaNER: Free Open German Named Entity Recognition Tool☆36Updated last year
- Python port for IWNLP.Lemmatizer☆17Updated last year
- Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefi…☆36Updated last month
- A tokenizer for Icelandic text☆28Updated 6 months ago
- Program used to split text into segments☆25Updated 5 months ago
- NER tagger for English, Spanish, Dutch, Italian, German, and French.☆35Updated 9 years ago
- Solr Query Segmenter for structuring unstructured queries☆21Updated 3 years ago
- WordNet-LMF formats☆21Updated last month
- Stemmer for German☆45Updated 2 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation.☆21Updated 8 years ago
- A text tagger based on Lucene / Solr, using FST technology☆176Updated last year
- Unitex/GramLab Language Resources☆19Updated 2 years ago
- A Python port of the Apache Lucene ASCII Folding Filter that converts alphabetic, numeric, and symbolic Unicode characters which are not …☆15Updated 4 years ago
- Baseform lemmatization for Elasticsearch☆26Updated 5 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW…☆71Updated 11 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆63Updated 10 months ago
- Resources for doing NLP in Polish☆47Updated 5 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary☆11Updated last year
- A Python module for interfacing with the Treetagger by Helmut Schmid.☆75Updated 3 years ago
- Basic dataset for the linguistic data collection.☆15Updated 8 years ago