devmount/GermanWordEmbeddings

Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tensorflow.

☆242

Alternatives and similar repositories for GermanWordEmbeddings

Users that are interested in GermanWordEmbeddings are comparing it to the libraries listed below.

WZBSocialScienceCenter / germalemma
A lemmatizer for German language text
☆95 · Feb 7, 2023 · Updated 3 years ago
Bachfischer / german2vec
Language Model and Text Classification for German Language using Deep Learning
☆18 · Jun 15, 2018 · Updated 8 years ago
ptnplanet / NLTK-Contributions
Any contributions to the NLTK project
☆29 · May 8, 2014 · Updated 12 years ago
Liebeck / spacy-sentiws
German sentiment scores with SentiWS as extension for spaCy
☆38 · Apr 13, 2026 · Updated 2 months ago
tudarmstadt-lt / GermaNER
GermaNER: Free Open German Named Entity Recognition Tool
☆38 · Dec 16, 2023 · Updated 2 years ago
tblock / 10kGNAD
Ten Thousand German News Articles Dataset for Topic Classification
☆87 · Nov 7, 2022 · Updated 3 years ago
gooofy / transformer-lm
Transformer language model (GPT-2) with sentencepiece tokenizer
☆10 · Oct 15, 2019 · Updated 6 years ago
bundestag / plpr-scraper
Parser for the plenary protocols of the Bundestag
☆20 · Jul 17, 2017 · Updated 8 years ago
WladimirSidorenko / PotTS
The Potsdam Twitter Sentiment Corpus
☆18 · Jan 15, 2020 · Updated 6 years ago
PolMine / GermaParlTEI
GermaParl: Corpus of Plenary Protocols of the German Bundestag (TEI Format)
☆39 · Jun 1, 2023 · Updated 3 years ago
Liebeck / spacy-iwnlp
German lemmatization with IWNLP as extension for spaCy
☆27 · Apr 13, 2026 · Updated 2 months ago
PolMine / GermaParl
GermaParl R Data Package
☆14 · Aug 31, 2022 · Updated 3 years ago
Liebeck / IWNLP
IWNLP: A parser for the German edition of Wiktionary
☆13 · Jul 28, 2023 · Updated 2 years ago
dbmdz / berts
DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models
☆158 · Dec 6, 2022 · Updated 3 years ago
stefan-it / fine-tuned-berts-seq
Fine-tuned Transformers compatible BERT models for Sequence Tagging
☆40 · Jul 17, 2020 · Updated 5 years ago
hanxiao / demo-poems-ir
Poems retrieval demo built with GNES framework
☆14 · Oct 3, 2019 · Updated 6 years ago
markuskiller / textblob-de
German language support for TextBlob.
☆103 · Jan 7, 2025 · Updated last year
openredact / openredact-app
This is a prototype of a semi-automatic data anonymization app for German documents. ➡️ The project has moved to: https://gitlab.opencode…
☆24 · Mar 20, 2026 · Updated 3 months ago
Germanet-sfs / germanetpy
☆12 · Jun 10, 2026 · Updated 3 weeks ago
elenanereiss / Legal-Entity-Recognition
A Dataset of German Legal Documents for Named Entity Recognition
☆179 · Oct 19, 2022 · Updated 3 years ago
dtuggener / CharSplit
Compound splitter for German
☆112 · Apr 5, 2020 · Updated 6 years ago
SmokinCaterpillar / doc2vec_user_comments
I analysed online user comments on articles by German news publishers SPON, ZEIT, and Focus
☆19 · Feb 3, 2018 · Updated 8 years ago
wjyandre / LimGen
Automatic Limerick Generation
☆11 · Mar 18, 2021 · Updated 5 years ago
stefan-it / german-gpt2
German GPT-2 model
☆32 · Aug 17, 2021 · Updated 4 years ago
riedlma / SECOS
An unsupervised compound splitter
☆41 · Oct 6, 2019 · Updated 6 years ago
DFKI-NLP / REval
[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction
☆13 · Apr 21, 2020 · Updated 6 years ago
t-systems-on-site-services-gmbh / german-wikipedia-text-corpus
This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l…
☆23 · Feb 22, 2022 · Updated 4 years ago
cambridgeltl / post-specialisation
Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
☆12 · Apr 12, 2018 · Updated 8 years ago
OFAI / million-post-corpus
Annotated data set consisting of user comments posted to a German-language newspaper website
☆18 · Jun 28, 2018 · Updated 8 years ago
RaRe-Technologies / talks
Presentations & notebooks from our talks /workshops/meetups/etc
☆24 · Mar 23, 2018 · Updated 8 years ago
msg-systems / holmes-extractor
Information extraction from English and German texts based on predicate logic
☆393 · Jul 8, 2022 · Updated 3 years ago
rsennrich / SMORLemma
SMOR (Stuttgart Morphology) with alternative lemmatization component
☆13 · Aug 10, 2023 · Updated 2 years ago
tsproisl / SoMaJo
A tokenizer and sentence splitter for German and English web and social media texts.
☆152 · Dec 9, 2024 · Updated last year
pd3f / dehyphen
📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF
☆39 · Mar 8, 2022 · Updated 4 years ago
tastyminerals / ccrawl
Simple CORPORA list crawler
☆11 · Dec 2, 2016 · Updated 9 years ago
DuyguA / DEMorphy
German Morphological Analyzer
☆54 · Nov 12, 2021 · Updated 4 years ago
clp-research / slurk
Slurk (think “slack for mechanical turk”…) is a lightweight and easily extensible chat server built especially for conducting multimodal …
☆15 · Dec 8, 2023 · Updated 2 years ago
BenjaminDHorne / Language-Features-for-News
Language features used in the NELA Toolkit and other news studies
☆13 · Oct 14, 2020 · Updated 5 years ago
stopwords-iso / stopwords-de
German stopwords collection
☆88 · Oct 6, 2022 · Updated 3 years ago