LeonieWeissweiler / CISTEM
Stemmer for German
☆45 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for CISTEM
- German stopwords collection · ☆86 · Updated 2 years ago
- ☆18 · Updated this week
- A lemmatizer for German language text · ☆87 · Updated last year
- Compound splitter for German · ☆103 · Updated 4 years ago
- Plan and train German transformer models. · ☆23 · Updated 3 years ago
- Open German WordNet · ☆88 · Updated 9 months ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German · ☆451 · Updated 3 weeks ago
- A tokenizer and sentence splitter for German and English web and social media texts. · ☆135 · Updated 3 months ago
- UIMA CAS processing library written in Python · ☆85 · Updated 6 months ago
- GermaParl: Corpus of Plenary Protocols of the German Bundestag (TEI Format) · ☆30 · Updated last year
- German part-of-speech dictionary · ☆43 · Updated last year
- Ten Thousand German News Articles Dataset for Topic Classification · ☆84 · Updated 2 years ago
- Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tenso… · ☆235 · Updated 3 months ago
- Extended list of German stopwords for use in Web Projects, Search Engines or everything else. · ☆101 · Updated 5 years ago
- Named Entity Recognition data for Europeana Newspapers · ☆173 · Updated last year
- A part-of-speech tagger with support for domain adaptation and external resources. · ☆22 · Updated 2 years ago
- The Zurich Dependency Parser for German · ☆81 · Updated 2 years ago
- German sentiment scores with SentiWS as extension for spaCy · ☆36 · Updated last year
- A machine learning tool for fishing entities · ☆248 · Updated last week
- The Hanover Tagger - A simple approach to lemmatization and POS-tagging of German morphology based on heuristics and hidden markov models… · ☆47 · Updated last year
- Small Java library for splitting German compound words · ☆62 · Updated 6 months ago
- German lemmatization with IWNLP as extension for spaCy · ☆24 · Updated last year
- Detect and align similar passages · ☆88 · Updated 2 months ago
- Schema for modelling parliamentary debates · ☆21 · Updated 2 years ago
- ParlaMint: Comparable Parliamentary Corpora · ☆50 · Updated last month
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. · ☆22 · Updated last year
- Named entity annotation tool · ☆27 · Updated last year
- German Morphological Analyzer · ☆47 · Updated 3 years ago
- BERT and ELECTRA models trained on Europeana Newspapers · ☆36 · Updated 2 years ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… · ☆22 · Updated 2 years ago