Small Java library for splitting German compound words
☆63 · May 13, 2024 · Updated last year
Alternatives and similar repositories for jwordsplitter
Users who are interested in jwordsplitter compare it to the libraries listed below.
- ☆28 · Aug 4, 2015 · Updated 10 years ago
- Decompounding Plugin for Elasticsearch ☆87 · Mar 2, 2021 · Updated 5 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 · Mar 15, 2017 · Updated 8 years ago
- Docs, notes and resources that don't fit elsewhere. ☆13 · May 23, 2023 · Updated 2 years ago
- Node interface which parses sentences into grammatical structures ☆12 · May 31, 2017 · Updated 8 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary ☆12 · Aug 10, 2023 · Updated 2 years ago
- Data files of German Decompounder for Apache Lucene / Apache Solr / Elasticsearch ☆111 · Sep 13, 2021 · Updated 4 years ago
- An unsupervised compound splitter ☆42 · Oct 6, 2019 · Updated 6 years ago
- MicroRestD is a small C++11 cross-platform REST server built on top of libmicrohttpd http://www.gnu.org/software/libmicrohttpd/. ☆13 · Jan 28, 2026 · Updated last month
- ☆11 · Dec 31, 2020 · Updated 5 years ago
- Hunspell analysis for ElasticSearch ☆38 · Jan 20, 2012 · Updated 14 years ago
- Sense Disambiguation of Connectives for PDTB-Style Discourse Parsing ☆14 · Jan 13, 2017 · Updated 9 years ago
- Python API to WikiData ☆32 · Oct 24, 2012 · Updated 13 years ago
- Preprocess text for NLP (tokenizing, lowercasing, stemming, sentence splitting, etc.) ☆29 · Jun 7, 2011 · Updated 14 years ago
- Baseform lemmatization for Elasticsearch ☆26 · Jun 7, 2019 · Updated 6 years ago
- ☆20 · Jun 29, 2017 · Updated 8 years ago
- Extract statistics from Wikipedia Dump files. ☆26 · Aug 2, 2021 · Updated 4 years ago
- Will store links to known evaluation datasets alongside stats to characterize them ☆24 · Mar 9, 2016 · Updated 9 years ago
- Supervised learning of morphology ☆28 · Jan 17, 2017 · Updated 9 years ago
- Automatically exported from code.google.com/p/deepsyntacticparsing ☆23 · Mar 19, 2015 · Updated 10 years ago
- German lemmatization with IWNLP as extension for spaCy ☆26 · Jul 28, 2023 · Updated 2 years ago
- A little text processing library for Scala. ☆28 · Mar 3, 2016 · Updated 10 years ago
- Fast and robust NLP components implemented in Java. ☆53 · Oct 13, 2020 · Updated 5 years ago
- A parser and autocorrection tool for wiktionary. ☆39 · Dec 4, 2015 · Updated 10 years ago
- A Utility Library for Wikipedia dumps ☆33 · Feb 24, 2017 · Updated 9 years ago
- Recurrent Neural Network language modeling toolkit ☆38 · Jan 23, 2014 · Updated 12 years ago
- Software and resources for natural language processing. ☆132 · Jul 13, 2016 · Updated 9 years ago
- The Zurich Dependency Parser for German ☆89 · Aug 27, 2025 · Updated 6 months ago
- A LibreOffice extension that converts JabRef references to plain text code and vice versa so that you can use your references with MS Off… ☆12 · Aug 15, 2024 · Updated last year
- Russian words synonyms and antonyms ☆11 · Dec 7, 2021 · Updated 4 years ago
- maximum entropy based part-of-speech tagger for NLTK ☆45 · Dec 8, 2016 · Updated 9 years ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆149 · Feb 16, 2020 · Updated 6 years ago
- Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind… ☆79 · Jun 15, 2019 · Updated 6 years ago
- Simple CORPORA list crawler ☆10 · Dec 2, 2016 · Updated 9 years ago
- Data profiling tools for Big Data ☆11 · Nov 17, 2025 · Updated 3 months ago
- Vietnamese diacritics restoration ☆14 · Jan 18, 2016 · Updated 10 years ago
- Madek main web interface ☆21 · Updated this week
- (Labeled) Latent Dirichlet Allocation on a sentence level with Gibbs Sampling ☆10 · Mar 27, 2014 · Updated 11 years ago
- Deploy a Ceramic daemon to AWS ☆13 · Apr 18, 2023 · Updated 2 years ago