danielnaber / jwordsplitter
small Java library for splitting German compound words
☆63 Updated 11 months ago
Alternatives and similar repositories for jwordsplitter:
Users who are interested in jwordsplitter are comparing it to the libraries listed below.
- ☆28 Updated 9 years ago
- NEWS: JATE2.0 Beta.11 Released, see details below. ☆81 Updated last year
- Extension of the mate-tools NLP pipeline ☆67 Updated 9 years ago
- An unsupervised compound splitter ☆41 Updated 5 years ago
- DKPro JWPL (DKPro Java Wikipedia Library) is a free, Java-based application programming interface that facilitates access to all informat… ☆85 Updated last week
- Solr Query Segmenter for structuring unstructured queries ☆21 Updated 3 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- Named Entity Recognition data for Europeana Newspapers ☆171 Updated 2 years ago
- Data files of German Decompounder for Apache Lucene / Apache Solr / Elasticsearch ☆106 Updated 3 years ago
- A Java Wikipedia markup to plain text converter ☆37 Updated 3 years ago
- Collection of software components for natural language processing (NLP) based on the Apache UIMA framework. ☆199 Updated 5 months ago
- IXA pipes Named Entity Tagger (http://ixa2.si.ehu.es/ixa-pipes). ☆32 Updated 6 years ago
- Word and text similarity measures ☆54 Updated 2 years ago
- My implementation of Explicit Semantic Analysis (ESA) library that we used at KMi, Open University to produce our submission at the NTCIR… ☆36 Updated 9 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 Updated 8 years ago
- Software and resources for natural language processing. ☆131 Updated 8 years ago
- Multi Tier Annotation Search ☆26 Updated 3 years ago
- A text tagger based on Lucene / Solr, using FST technology ☆176 Updated last year
- Open-source tools for morphological tagging, segmentation and stemming. ☆40 Updated 5 years ago
- German Morphological Analyzer ☆47 Updated 3 years ago
- Program used to split text into segments ☆26 Updated 6 months ago
- Will store links to known evaluation datasets alongside stats to characterize them ☆24 Updated 9 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 4 years ago
- This tool extracts word vectors from Lucene index. ☆134 Updated 7 years ago
- Wrapper for DKPro Core to extract linguistic information from books. ☆16 Updated 3 years ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 Updated 3 years ago
- N3 - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format ☆70 Updated 7 years ago
- A Utility Library for Wikipedia dumps ☆33 Updated 8 years ago
- A Named-Entity Recogniser based on Grobid. ☆52 Updated 7 months ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 Updated last year