snowballstem / snowball-data
Test data for snowball stemming algorithms
☆34 · Updated this week
Alternatives and similar repositories for snowball-data
Users who are interested in snowball-data compare it to the repositories listed below.
- FreeLing project source code ☆260 · Updated 2 years ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆141 · Updated 5 years ago
- Snowball compiler and stemming algorithms ☆808 · Updated this week
- 📖 Library that provides ways to read from and iterate through the Wikibase entities in a Wikibase Repository JSON dump ☆72 · Updated last year
- Lexical database of any language ☆184 · Updated 3 years ago
- Machine-readable lists of lemma-token pairs in 23 languages. ☆343 · Updated 3 years ago
- List of common stop words in various languages. ☆337 · Updated 3 years ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 · Updated 5 years ago
- Website source for snowballstem.org ☆17 · Updated this week
- The curation repository for the data behind Concepticon. ☆40 · Updated last week
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆69 · Updated 3 months ago
- All languages stopwords collection ☆458 · Updated last year
- SCOWL (and friends). ☆446 · Updated 2 months ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆76 · Updated 4 months ago
- Wiktionary parser tool for many language editions. ☆54 · Updated 3 years ago
- Linguistica 5: Unsupervised Learning of Linguistic Structure ☆30 · Updated 6 years ago
- CRF-based Morphological Tagging and Lemmatization ☆37 · Updated 5 years ago
- Stopwords for 50 languages in JSON format ☆433 · Updated 2 years ago
- The NLG tool for Finnish ☆23 · Updated last year
- Various utilities for processing the data. ☆212 · Updated last week
- Hunspell-based analysis for Elasticsearch ☆84 · Updated 8 months ago
- NLTK Contrib ☆166 · Updated last year
- The CMU Link Grammar natural language parser ☆402 · Updated 2 weeks ago
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆386 · Updated 2 months ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆197 · Updated 5 years ago
- ElixirFM Functional Arabic Morphology ☆44 · Updated 2 years ago
- Miscellaneous materials for teaching NLP using NLTK ☆36 · Updated 7 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆35 · Updated 2 years ago
- Automatically exported from code.google.com/p/foma ☆122 · Updated last month
- Machine translation for the real world ☆23 · Updated 5 years ago