danielnaber / jwordsplitter
small Java library for splitting German compound words
☆63 · Updated last year
Alternatives and similar repositories for jwordsplitter
Users interested in jwordsplitter compare it to the libraries listed below.
- ☆28 · Updated 9 years ago
- An unsupervised compound splitter ☆41 · Updated 5 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 · Updated last year
- Program used to split text into segments ☆26 · Updated 7 months ago
- Named Entity Recognition data for Europeana Newspapers ☆171 · Updated 2 years ago
- A Java Wikipedia markup to plain text converter ☆37 · Updated 3 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 · Updated 8 years ago
- German Morphological Analyzer ☆47 · Updated 3 years ago
- Yara K-Beam Arc-Eager Dependency Parser ☆56 · Updated 9 years ago
- Extension of the mate-tools NLP pipeline ☆67 · Updated 9 years ago
- Multi Tier Annotation Search ☆26 · Updated 4 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 · Updated 6 years ago
- A Utility Library for Wikipedia dumps ☆33 · Updated 8 years ago
- N3 - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format ☆70 · Updated 7 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆72 · Updated last year
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 · Updated 2 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆12 · Updated last year
- NER tagger for English, Spanish, Dutch, Italian and German and French. ☆35 · Updated 9 years ago
- DKPro JWPL (DKPro Java Wikipedia Library) is a free, Java-based application programming interface that facilitates access to all informat… ☆86 · Updated last week
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- Collection of software components for natural language processing (NLP) based on the Apache UIMA framework. ☆198 · Updated 6 months ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆68 · Updated 3 months ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 · Updated 4 years ago
- morphologically informed POS tagging for German ☆25 · Updated 3 years ago
- IXA pipes Named Entity Tagger (http://ixa2.si.ehu.es/ixa-pipes). ☆32 · Updated 6 years ago
- Automatically exported from code.google.com/p/berkeleylm ☆98 · Updated 9 years ago
- The Zurich Dependency Parser for German ☆85 · Updated 2 years ago
- Automatically exported from code.google.com/p/deepsyntacticparsing ☆23 · Updated 10 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆63 · Updated last year
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆17 · Updated last week