snowballstem / snowball-data
Test data for snowball stemming algorithms
☆31 · Updated this week
Alternatives and similar repositories for snowball-data:
Users who are interested in snowball-data are comparing it to the repositories listed below:
- Website source for snowballstem.org ☆17 · Updated this week
- Snowball compiler and stemming algorithms ☆774 · Updated this week
- Library that provides ways to read from and iterate through the Wikibase entities in a Wikibase Repository JSON dump ☆72 · Updated 7 months ago
- Lexical database of any language ☆176 · Updated 2 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition ☆136 · Updated 8 years ago
- enchant spellchecking library ☆357 · Updated last month
- NLTK Website ☆62 · Updated 6 months ago
- A Javascript Implementation of the Porter Stemmer ☆96 · Updated 3 years ago
- Simple Python Wrapper around MediaWiki API ☆30 · Updated 2 years ago
- ElixirFM Functional Arabic Morphology ☆43 · Updated last year
- An offline/online field database which adapts to its user's terminology and I-Language. http://fielddb.github.io ☆79 · Updated 2 years ago
- Python stemming library using snowball stemmers ☆249 · Updated 4 months ago
- Software and resources for natural language processing. ☆131 · Updated 8 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆89 · Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆67 · Updated last week
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 · Updated 9 months ago
- Arabic roots list resource ☆10 · Updated 6 years ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆128 · Updated last year
- Unitex/GramLab Language Resources ☆20 · Updated 2 years ago
- A web framework to display Cross Linguistic Linked Data. ☆55 · Updated this week
- free French treebank ☆32 · Updated 8 years ago
- Detect the language of text ☆36 · Updated 4 years ago
- Analyze standard numbers like ARK, DOI, EAN, GTIN, IBAN, ISAN, ISBN, ISMN, ISNI, ISSN, ISTC, ISWC, ORCID, PPN, SICI, UPC, ZDB with Elasti… ☆24 · Updated 8 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 · Updated 3 years ago
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆251 · Updated 7 years ago
- displaCy-ent.js: An open-source named entity visualiser for the modern web ☆199 · Updated 6 years ago
- Web service for implementing a large-scale translation memory ☆90 · Updated 3 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 · Updated last year
- Basic dataset for the linguistic data collection. ☆15 · Updated 8 years ago
- Linguistica 5: Unsupervised Learning of Linguistic Structure ☆30 · Updated 5 years ago