jfilter / german-preprocessing
🇩🇪 Preprocess German texts to do some serious natural-language processing.
☆11 · Updated 2 years ago
Alternatives and similar repositories for german-preprocessing:
Users who are interested in german-preprocessing compare it to the libraries listed below.
- Stand-off Text Annotation Model (STAM) is a data model for stand-off-text annotation where any information on a text is represented as an… ☆18 · Updated 3 months ago
- German language support for TextBlob. ☆103 · Updated last month
- A lemmatizer for German language text ☆88 · Updated 2 years ago
- Open morphology for Finnish ☆87 · Updated last month
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆149 · Updated 2 months ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆466 · Updated 4 months ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆17 · Updated last week
- TEI Reader Python Library ☆17 · Updated last year
- A NoSketch Engine Docker image which is easy to use ☆19 · Updated 3 months ago
- Software to detect text reuse with BLAST. ☆14 · Updated 5 years ago
- Helsinki Finite-State Technology (library and application suite) ☆128 · Updated last week
- The Hanover Tagger - A simple approach to lemmatization and POS-tagging of German morphology based on heuristics and hidden Markov models… ☆51 · Updated 2 years ago
- A Python library for topic modeling and visualization ☆65 · Updated 4 years ago
- XSLT stylesheets to convert TEI to HTML and ePub format. ☆39 · Updated this week
- An Ancient Greek Morphology Tagger ☆26 · Updated last year
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… ☆23 · Updated 2 years ago
- GerVADER - A German adaptation of the VADER sentiment analysis tool for social media texts ☆25 · Updated 2 years ago
- German sentiment scores with SentiWS as extension for spaCy ☆36 · Updated 2 years ago
- Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tenso… ☆236 · Updated 6 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆413 · Updated last month
- Lexical data at Unicode ☆67 · Updated 6 months ago
- German part-of-speech dictionary ☆43 · Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆153 · Updated 3 months ago
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆18 · Updated 9 months ago
- Morphological analyzer and lemmatizer for Latin. ☆26 · Updated last month
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 · Updated 2 years ago
- High-performance text aligner for large collections of texts ☆49 · Updated 4 months ago
- A general-purpose NLP pipeline for Ancient Greek ☆20 · Updated 11 months ago