snowballstem / snowball-data
Test data for snowball stemming algorithms
☆34 Updated 3 months ago
Alternatives and similar repositories for snowball-data
Users who are interested in snowball-data are comparing it to the repositories listed below:
- 📖 Library that provides ways to read from and iterate through the Wikibase entities in a Wikibase Repository JSON dump ☆74 Updated last year
- A Javascript Implementation of the Porter Stemmer ☆96 Updated 3 years ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆141 Updated 5 years ago
- WordNet in JSON format. ☆92 Updated 5 years ago
- Machine-readable Wiktionary ☆78 Updated last year
- Website source for snowballstem.org ☆17 Updated 3 weeks ago
- Miscellaneous materials for teaching NLP using NLTK ☆37 Updated 7 years ago
- The NLG tool for Finnish ☆23 Updated last year
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆35 Updated 2 years ago
- Translation of query languages to serialized KoralQuery protocol ☆12 Updated this week
- Open morphology for Finnish ☆93 Updated 3 weeks ago
- Web service for implementing a large-scale translation memory ☆90 Updated 4 years ago
- Wiktionary parser tool for many language editions. ☆54 Updated 3 years ago
- FreeLing project source code ☆259 Updated 2 years ago
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- hand-written dictionaries from the FreeDict project ☆437 Updated 2 months ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition ☆136 Updated 9 years ago
- Transliteration package for Indian scripts ☆16 Updated 8 years ago
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆252 Updated 7 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS ☆87 Updated 8 years ago
- ElixirFM Functional Arabic Morphology ☆44 Updated 2 years ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 Updated 4 years ago
- Official releases of the PROIEL treebank of ancient Indo-European languages ☆37 Updated 2 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆69 Updated 2 months ago
- Resources for conservation, development, and documentation of low resource (human) languages. ☆424 Updated 5 months ago
- CiteSeerX public repository ☆133 Updated last year
- Lexical database of any language ☆182 Updated 3 years ago
- Python stemming library using snowball stemmers ☆264 Updated last month
- Dockerized version of Google's SyntaxNet Parser and POS tagger. ☆42 Updated 7 years ago
- Decompounding Plugin for Elasticsearch ☆87 Updated 4 years ago