German-NLP-Group/german-transformer-training

Plan and train German transformer models.

☆23

Alternatives and similar repositories for german-transformer-training

Users who are interested in german-transformer-training are comparing it to the repositories listed below.

stefan-it / gc4lm
GC4LM: A Colossal (Biased) language model for German
☆13 · May 2, 2021 · Updated 5 years ago

stefan-it / europeana-bert
BERT and ELECTRA models trained on Europeana Newspapers
☆39 · Dec 14, 2021 · Updated 4 years ago

t-systems-on-site-services-gmbh / german-elmo-model
This is a German ELMo deep contextualized word representation. It is trained on a special German Wikipedia Text Corpus.
☆28 · Dec 15, 2019 · Updated 6 years ago

LEL-A / GerAlpacaDataCleaned
German Alpaca Dataset (Cleaned + Translated)
☆26 · Apr 6, 2023 · Updated 3 years ago

stefan-it / german-gpt2
German GPT-2 model
☆32 · Aug 17, 2021 · Updated 4 years ago

malteos / semantic-document-relations
Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between…
☆31 · Jun 12, 2023 · Updated 3 years ago

shawwn / tpunicorn
Babysit your preemptible TPUs
☆86 · Dec 3, 2022 · Updated 3 years ago

jfilter / hgmaassen-retweets
Hans-Georg Maaßen and the Retweets
☆21 · Aug 26, 2019 · Updated 6 years ago

yamac-kurtulus / Windows-Docker-Images
Some Windows images for tools that I had to use in a Windows environment.
☆10 · Sep 27, 2020 · Updated 5 years ago

t-systems-on-site-services-gmbh / german-wikipedia-text-corpus
This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l…
☆23 · Feb 22, 2022 · Updated 4 years ago

oscar-project / goclassy
An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.
☆86 · Apr 21, 2021 · Updated 5 years ago

EdCo95 / text-summarization
Python code to automatically produce a summary of a piece of text.
☆11 · Sep 8, 2016 · Updated 9 years ago

OCR-D / ocrd_kraken
Wrapper for the kraken OCR engine
☆12 · Jul 12, 2025 · Updated last year

MaviccPRP / ger_ner_evals
☆37 · Nov 16, 2017 · Updated 8 years ago

gathertown / twitch-plays-gather
☆11 · Jul 25, 2024 · Updated last year

stefan-it / ukrainian-electra
Ukrainian ELECTRA model
☆12 · Mar 11, 2023 · Updated 3 years ago

TianchunH97 / fairseq-rl
Modified version of fairseq, including new implementations for criterions using reinforcement learning methods.
☆11 · Aug 14, 2019 · Updated 6 years ago

julien-nc / integration_suitecrm
Integration of SuiteCRM into Nextcloud
☆19 · Nov 12, 2021 · Updated 4 years ago

girvandip / SIA-influence-maximization
Implementation of two influence maximization algorithms: LDAG and NewGreedyIC (NGIC)
☆13 · May 28, 2019 · Updated 7 years ago

chongzhangFDU / TPP
This is the official repository of the EMNLP 2023 paper Reading Order Matters: Information Extraction from Visually-rich Documents by Tok…
☆18 · Mar 15, 2024 · Updated 2 years ago

natliblux / BnLMetsExporter
Command Line Interface (CLI) to export METS/ALTO documents to other formats.
☆13 · Apr 25, 2022 · Updated 4 years ago

openlegaldata / oldp-notebooks
Jupyter notebook showcases using the Open Legal Data API
☆26 · Dec 22, 2025 · Updated 6 months ago

balzer82 / PegidaSprache
Analysis of the Pegida Facebook corpus
☆10 · Jan 31, 2015 · Updated 11 years ago

openlegaldata / legal-reference-extraction
Legal Reference Extraction
☆49 · Jun 15, 2026 · Updated last month

hanxiao / demo-poems-ir
Poems retrieval demo built with GNES framework
☆14 · Oct 3, 2019 · Updated 6 years ago

microsoft / XGLUE
Cross-lingual GLUE
☆49 · Jun 15, 2023 · Updated 3 years ago

eval4nlp / SharedTask2021
☆17 · Nov 23, 2021 · Updated 4 years ago

tsproisl / SoMaJo
A tokenizer and sentence splitter for German and English web and social media texts.
☆153 · Dec 9, 2024 · Updated last year

OCR-D / page-to-alto
Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)
☆17 · Jun 5, 2026 · Updated last month

stefan-it / fine-tuned-berts-seq
Fine-tuned Transformers compatible BERT models for Sequence Tagging
☆40 · Jul 17, 2020 · Updated 6 years ago

hbz / lobid
Linking Open Bibliographic Data
☆17 · Jun 25, 2026 · Updated 3 weeks ago

JonasRieger / rollinglda
A rolling version of the Latent Dirichlet Allocation.
☆13 · Nov 27, 2023 · Updated 2 years ago

aidantee / D3NER
A CRF-biLSTM based Biomedical NER model in Bioinformatics 2018.
☆24 · Jul 31, 2018 · Updated 7 years ago

krangelie / bias-in-german-nlg
Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers.
☆16 · Sep 25, 2024 · Updated last year

ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Based on Python using NLP packages NLTK and spaCy.
☆17 · May 8, 2018 · Updated 8 years ago

bjut-hz / py-mate-tools
Python interface for mate tools
☆17 · Jan 23, 2018 · Updated 8 years ago

lavis-nlp / german_legal_sentences
A dataset of semantically related sentence pairs in the German legal domain
☆10 · Feb 26, 2021 · Updated 5 years ago

ArneBinder / GlomImpl
Implementation of the GLOM model for text
☆11 · Mar 4, 2021 · Updated 5 years ago

babaknaderi / TextComplexityDE
TextComplexityDE dataset consists of 1000 sentences in the German language with subjective complexity rating, collected from German learn…
☆12 · Apr 8, 2022 · Updated 4 years ago