danielnaber / jwordsplitter
Small Java library for splitting German compound words
☆62 Updated 9 months ago
Alternatives and similar repositories for jwordsplitter:
Users who are interested in jwordsplitter compare it to the libraries listed below.
- ☆28 Updated 9 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 Updated last year
- Multi Tier Annotation Search ☆26 Updated 3 years ago
- A Java Wikipedia markup to plain text converter ☆37 Updated 2 years ago
- An unsupervised compound splitter ☆41 Updated 5 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 Updated 6 years ago
- Program used to split text into segments ☆25 Updated 3 months ago
- Named Entity Recognition data for Europeana Newspapers ☆171 Updated last year
- German Morphological Analyzer ☆47 Updated 3 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆51 Updated 4 years ago
- NLP tools developed by Emory University. ☆60 Updated 8 years ago
- Solr Query Segmenter for structuring unstructured queries ☆21 Updated 3 years ago
- Open-source tools for morphological tagging, segmentation and stemming. ☆41 Updated 5 years ago
- Multi Tier Annotation Search ☆12 Updated 9 months ago
- A Utility Library for Wikipedia dumps ☆33 Updated 7 years ago
- DKPro Lab offers a workflow engine for parameter sweeping experiments. ☆9 Updated last year
- Yara K-Beam Arc-Eager Dependency Parser ☆55 Updated 8 years ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 Updated 9 years ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆17 Updated this week
- NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser ☆49 Updated last year
- Software and resources for natural language processing. ☆131 Updated 8 years ago
- Collection of software components for natural language processing (NLP) based on the Apache UIMA framework. ☆197 Updated 3 months ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 Updated 7 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 Updated 2 years ago
- Stemmer for German ☆45 Updated 2 years ago
- Labeled examples from wiki dumps in Python ☆67 Updated 8 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016…