Toolkit to obtain and preprocess German text corpora, train models, and evaluate them with generated test sets. Built with Gensim and TensorFlow.
☆242 · Jun 11, 2026 · Updated this week
Alternatives and similar repositories for GermanWordEmbeddings
Users that are interested in GermanWordEmbeddings are comparing it to the libraries listed below.
- Any contributions to the NLTK project ☆29 · May 8, 2014 · Updated 12 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆524 · Oct 30, 2024 · Updated last year
- Python port for IWNLP.Lemmatizer ☆19 · Apr 13, 2026 · Updated 2 months ago
- German sentiment scores with SentiWS as extension for spaCy ☆38 · Apr 13, 2026 · Updated 2 months ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆87 · Nov 7, 2022 · Updated 3 years ago
- Transformer language model (GPT-2) with sentencepiece tokenizer ☆10 · Oct 15, 2019 · Updated 6 years ago
- Parser for the plenary protocols of the Bundestag ☆20 · Jul 17, 2017 · Updated 8 years ago
- The Potsdam Twitter Sentiment Corpus ☆18 · Jan 15, 2020 · Updated 6 years ago
- GermaParl: Corpus of Plenary Protocols of the German Bundestag (TEI Format) ☆38 · Jun 1, 2023 · Updated 3 years ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… ☆35 · Jul 7, 2022 · Updated 3 years ago
- GermaNet API for Python ☆54 · Mar 8, 2018 · Updated 8 years ago
- Coreference resolution for German ☆16 · Jun 26, 2017 · Updated 8 years ago
- IWNLP: A parser for the German edition of Wiktionary ☆13 · Jul 28, 2023 · Updated 2 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆158 · Dec 6, 2022 · Updated 3 years ago
- Plan and train German transformer models. ☆23 · Feb 22, 2021 · Updated 5 years ago
- Fine-tuned Transformers compatible BERT models for Sequence Tagging ☆40 · Jul 17, 2020 · Updated 5 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode… ☆21 · Mar 20, 2026 · Updated 2 months ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆174 · Dec 29, 2024 · Updated last year
- German language support for TextBlob. ☆103 · Jan 7, 2025 · Updated last year
- ☆12 · Jan 27, 2026 · Updated 4 months ago
- A Dataset of German Legal Documents for Named Entity Recognition ☆178 · Oct 19, 2022 · Updated 3 years ago
- Compound splitter for German ☆112 · Apr 5, 2020 · Updated 6 years ago
- I analysed online user comments on articles by German news publishers SPON, ZEIT, and Focus ☆19 · Feb 3, 2018 · Updated 8 years ago
- An R data package containing georeferenced events of right-wing violence in Germany from 2014 onwards ☆11 · Jun 27, 2018 · Updated 7 years ago
- Automatic Limerick Generation ☆11 · Mar 18, 2021 · Updated 5 years ago
- German GPT-2 model ☆32 · Aug 17, 2021 · Updated 4 years ago
- An unsupervised compound splitter ☆41 · Oct 6, 2019 · Updated 6 years ago
- [ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction ☆13 · Apr 21, 2020 · Updated 6 years ago
- This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l… ☆23 · Feb 22, 2022 · Updated 4 years ago
- Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources ☆12 · Apr 12, 2018 · Updated 8 years ago
- Annotated data set consisting of user comments posted to a German-language newspaper website ☆18 · Jun 28, 2018 · Updated 7 years ago
- Presentations & notebooks from our talks/workshops/meetups/etc. ☆24 · Mar 23, 2018 · Updated 8 years ago
- Watset: Automatic Induction of Synsets from a Graph of Synonyms ☆16 · Jul 7, 2019 · Updated 6 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Aug 10, 2023 · Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆152 · Dec 9, 2024 · Updated last year
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Mar 8, 2022 · Updated 4 years ago
- Simple CORPORA list crawler ☆11 · Dec 2, 2016 · Updated 9 years ago
- Slurk (think “slack for mechanical turk”…) is a lightweight and easily extensible chat server built especially for conducting multimodal … ☆15 · Dec 8, 2023 · Updated 2 years ago
- Language features used in the NELA Toolkit and other news studies ☆13 · Oct 14, 2020 · Updated 5 years ago