morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆39 Updated 7 years ago
Alternatives and similar repositories for polimorfologik:
Users who are interested in polimorfologik are comparing it to the libraries listed below.
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 Updated last year
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆71 Updated 10 months ago
- Polish morphological tagger. ☆43 Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 Updated 9 months ago
- Python port of Stempel, an algorithmic stemmer for Polish language. ☆35 Updated 5 months ago
- CRF-based Morphological Tagging and Lemmatization ☆36 Updated 5 years ago
- Hy-phen-ation made easy ☆207 Updated this week
- eXtensible Interlinear Glossed Text ☆32 Updated 2 years ago
- Python library implementing the ISO/IEC 26300 OpenDocument Format standard (ODF) ☆53 Updated 4 years ago
- Automatically exported from code.google.com/p/foma ☆122 Updated 7 months ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 Updated 7 years ago
- Baseform lemmatization for Elasticsearch ☆26 Updated 5 years ago
- German part-of-speech dictionary ☆43 Updated last year
- Stemmer for German ☆45 Updated 2 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆64 Updated this week
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format. ☆128 Updated 11 months ago
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- KEA - Keyphrase Extraction Algorithm ☆22 Updated 8 years ago
- Python port for IWNLP.Lemmatizer ☆17 Updated last year
- Translation of query languages to serialized KoralQuery protocol ☆11 Updated this week
- A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Sp… ☆29 Updated 3 years ago
- An index data structure for approximate string search. ☆23 Updated 5 years ago
- Deutsch Language Tool Kit ☆12 Updated 9 years ago
- NER tagger for English, Spanish, Dutch, Italian, German, and French. ☆35 Updated 9 years ago
- Open morphology for Finnish ☆87 Updated last month
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- XQuery wrapper around the Stanford CoreNLP pipeline ☆13 Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆67 Updated 2 weeks ago
- Command-line corpus tools ☆9 Updated 7 years ago