Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tensorflow.
☆242 · Mar 19, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for GermanWordEmbeddings
Users that are interested in GermanWordEmbeddings are comparing it to the libraries listed below.
- A lemmatizer for German language text ☆94 · Feb 7, 2023 · Updated 3 years ago
- Language Model and Text Classification for German Language using Deep Learning ☆18 · Jun 15, 2018 · Updated 7 years ago
- This is a German ELMo deep contextualized word representation. It is trained on a special German Wikipedia Text Corpus. ☆28 · Dec 15, 2019 · Updated 6 years ago
- Any contributions to the NLTK project ☆29 · May 8, 2014 · Updated 11 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆518 · Oct 30, 2024 · Updated last year
- German sentiment scores with SentiWS as extension for spaCy ☆38 · Nov 26, 2022 · Updated 3 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 · Dec 16, 2023 · Updated 2 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆87 · Nov 7, 2022 · Updated 3 years ago
- Transformer language model (GPT-2) with sentencepiece tokenizer ☆10 · Oct 15, 2019 · Updated 6 years ago
- Parser for the plenary protocols of the German Bundestag ☆21 · Jul 17, 2017 · Updated 8 years ago
- The Potsdam Twitter Sentiment Corpus ☆18 · Jan 15, 2020 · Updated 6 years ago
- GermaParl: Corpus of Plenary Protocols of the German Bundestag (TEI Format) ☆38 · Jun 1, 2023 · Updated 2 years ago
- German lemmatization with IWNLP as extension for spaCy ☆27 · Jul 28, 2023 · Updated 2 years ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… ☆35 · Jul 7, 2022 · Updated 3 years ago
- GermaParl R Data Package ☆14 · Aug 31, 2022 · Updated 3 years ago
- GermaNet API for Python ☆54 · Mar 8, 2018 · Updated 8 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆159 · Dec 6, 2022 · Updated 3 years ago
- Plan and train German transformer models. ☆23 · Feb 22, 2021 · Updated 5 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode… ☆21 · Mar 20, 2026 · Updated 3 weeks ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆168 · Dec 29, 2024 · Updated last year
- Poems retrieval demo built with GNES framework ☆14 · Oct 3, 2019 · Updated 6 years ago
- German language support for TextBlob. ☆102 · Jan 7, 2025 · Updated last year
- A Python library for topic modeling and visualization ☆67 · Sep 20, 2020 · Updated 5 years ago
- This is a prototype of a semi-automatic data anonymization app for German documents. ➡️ The project has moved to: https://gitlab.opencode… ☆24 · Mar 20, 2026 · Updated 3 weeks ago
- ☆12 · Jan 27, 2026 · Updated 2 months ago
- A Dataset of German Legal Documents for Named Entity Recognition ☆177 · Oct 19, 2022 · Updated 3 years ago
- Compound splitter for German ☆113 · Apr 5, 2020 · Updated 6 years ago
- I analysed online user comments on articles by German news publishers SPON, ZEIT, and Focus ☆19 · Feb 3, 2018 · Updated 8 years ago
- An R data package containing georeferenced events of right-wing violence in Germany from 2014 onwards ☆11 · Jun 27, 2018 · Updated 7 years ago
- Automatic Limerick Generation ☆11 · Mar 18, 2021 · Updated 5 years ago
- German GPT-2 model ☆32 · Aug 17, 2021 · Updated 4 years ago
- An unsupervised compound splitter ☆42 · Oct 6, 2019 · Updated 6 years ago
- [ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction ☆13 · Apr 21, 2020 · Updated 5 years ago
- This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l… ☆23 · Feb 22, 2022 · Updated 4 years ago
- Presentations & notebooks from our talks/workshops/meetups/etc. ☆24 · Mar 23, 2018 · Updated 8 years ago
- Watset: Automatic Induction of Synsets from a Graph of Synonyms ☆16 · Jul 7, 2019 · Updated 6 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Aug 10, 2023 · Updated 2 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Mar 8, 2022 · Updated 4 years ago
- Simple CORPORA list crawler ☆10 · Dec 2, 2016 · Updated 9 years ago