danielnaber / jwordsplitter
small Java library for splitting German compound words
☆63 · May 13, 2024 · Updated last year
Alternatives and similar repositories for jwordsplitter
Users who are interested in jwordsplitter are comparing it to the libraries listed below.
- ☆28 · Aug 4, 2015 · Updated 10 years ago
- Decompounding Plugin for Elasticsearch · ☆87 · Mar 2, 2021 · Updated 4 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. · ☆16 · Mar 15, 2017 · Updated 8 years ago
- Docs, notes and resources that don't fit elsewhere. · ☆13 · May 23, 2023 · Updated 2 years ago
- An unsupervised compound splitter · ☆42 · Oct 6, 2019 · Updated 6 years ago
- ☆11 · Dec 31, 2020 · Updated 5 years ago
- Hunspell analysis for ElasticSearch · ☆38 · Jan 20, 2012 · Updated 14 years ago
- Python API to WikiData · ☆32 · Oct 24, 2012 · Updated 13 years ago
- Preprocess text for NLP (tokenizing, lowercasing, stemming, sentence splitting, etc.) · ☆29 · Jun 7, 2011 · Updated 14 years ago
- Baseform lemmatization for Elasticsearch · ☆26 · Jun 7, 2019 · Updated 6 years ago
- ☆20 · Jun 29, 2017 · Updated 8 years ago
- Extract statistics from Wikipedia Dump files. · ☆26 · Aug 2, 2021 · Updated 4 years ago
- Will store links to known evaluation datasets alongside stats to characterize them · ☆24 · Mar 9, 2016 · Updated 9 years ago
- This is a mirror of SNU KKMA Korean Morpheme Analyzer v2.0. · ☆19 · Aug 22, 2014 · Updated 11 years ago
- Automatically exported from code.google.com/p/deepsyntacticparsing · ☆23 · Mar 19, 2015 · Updated 10 years ago
- German lemmatization with IWNLP as extension for spaCy · ☆26 · Jul 28, 2023 · Updated 2 years ago
- Fast and robust NLP components implemented in Java. · ☆53 · Oct 13, 2020 · Updated 5 years ago
- A parser and autocorrection tool for wiktionary. · ☆39 · Dec 4, 2015 · Updated 10 years ago
- A set of treebanks for multiple languages annotated in basic Stanford-style dependencies. · ☆68 · Aug 29, 2019 · Updated 6 years ago
- Recurrent Neural Network language modeling toolkit · ☆38 · Jan 23, 2014 · Updated 12 years ago
- Software and resources for natural language processing. · ☆132 · Jul 13, 2016 · Updated 9 years ago
- OWL verbalizer: making machine-readable knowledge also human-readable · ☆40 · Oct 1, 2023 · Updated 2 years ago
- A framework, data and configs for generating and building Tesseract OCR lang.traineddata model files, specifically for Japanese · ☆10 · Dec 9, 2013 · Updated 12 years ago
- Russian words synonyms and antonyms · ☆11 · Dec 7, 2021 · Updated 4 years ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) · ☆149 · Feb 16, 2020 · Updated 5 years ago
- maximum entropy based part-of-speech tagger for NLTK · ☆45 · Dec 8, 2016 · Updated 9 years ago
- Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind… · ☆78 · Jun 15, 2019 · Updated 6 years ago
- R Package: ICD-10-GM Metadata · ☆11 · Sep 23, 2023 · Updated 2 years ago
- (Labeled) Latent Dirichlet Allocation on a sentence level with Gibbs Sampling · ☆10 · Mar 27, 2014 · Updated 11 years ago
- Vietnamese diacritics restoration · ☆14 · Jan 18, 2016 · Updated 10 years ago
- ☆10 · Jul 2, 2019 · Updated 6 years ago
- Focused Crawler for VT's CTRNet · ☆10 · May 13, 2013 · Updated 12 years ago
- "Save as DAISY" add-in for Microsoft Word · ☆10 · Dec 22, 2025 · Updated last month
- Pascal2 Harvest project QuEst · ☆14 · Sep 15, 2014 · Updated 11 years ago
- Data profiling tools for Big Data · ☆11 · Nov 17, 2025 · Updated 2 months ago
- Deploy a Ceramic daemon to AWS · ☆13 · Apr 18, 2023 · Updated 2 years ago
- Simple CORPORA list crawler · ☆10 · Dec 2, 2016 · Updated 9 years ago
- cross lingual text classification on amazon reviews · ☆10 · Nov 4, 2019 · Updated 6 years ago
- Speech ANDroid Apps · ☆20 · Jan 22, 2014 · Updated 12 years ago