DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models
☆159Dec 6, 2022Updated 3 years ago
Alternatives and similar repositories for berts
Users who are interested in berts compare it to the libraries listed below
- German GPT-2 model☆32Aug 17, 2021Updated 4 years ago
- BERT and ELECTRA models trained on Europeana Newspapers☆38Dec 14, 2021Updated 4 years ago
- Plan and train German transformer models.☆23Feb 22, 2021Updated 5 years ago
- Fine-tuned Transformers compatible BERT models for Sequence Tagging☆40Jul 17, 2020Updated 5 years ago
- Parsing only with Pretraining Networks☆16Jul 25, 2024Updated last year
- Named Entity Recognition☆19Feb 13, 2026Updated 3 weeks ago
- This is a German ELMo deep contextualized word representation. It is trained on a special German Wikipedia text corpus.☆28Dec 15, 2019Updated 6 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆28Oct 3, 2021Updated 4 years ago
- Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.☆1,752Dec 20, 2023Updated 2 years ago
- Ukrainian ELECTRA model☆12Mar 11, 2023Updated 2 years ago
- Repository for "Towards Robust Named Entity Recognition for Historic German"☆18Dec 11, 2020Updated 5 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions☆20Mar 27, 2023Updated 2 years ago
- ☆20Jan 9, 2026Updated last month
- This repository contains all manually labeled data from the GermEval-2018 shared task.☆29Sep 28, 2018Updated 7 years ago
- ☆13Apr 16, 2021Updated 4 years ago
- ☆13Dec 17, 2021Updated 4 years ago
- German Parliamentary Corpus (GerParCor)☆30Jan 14, 2026Updated last month
- This is a German text corpus from Wikipedia. It is cleaned, preprocessed, and sentence-split. Its purpose is to train NLP embeddings l…☆23Feb 22, 2022Updated 4 years ago
- Analyzing mBERT's multilinguality in a small laboratory setting☆13Jun 12, 2023Updated 2 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.☆86Apr 21, 2021Updated 4 years ago
- This is a prototype of a Python module for simple modification of document files.☆18Jan 8, 2022Updated 4 years ago
- Ten Thousand German News Articles Dataset for Topic Classification☆87Nov 7, 2022Updated 3 years ago
- Code for the AAAI 2023 paper “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”☆18Dec 6, 2022Updated 3 years ago
- Generate BERT vocabularies and pretraining examples from Wikipedias☆17May 11, 2020Updated 5 years ago
- RelEx - A simple framework for Relation Extraction built on AllenNLP☆15Jun 17, 2020Updated 5 years ago
- A Dataset of German Legal Documents for Named Entity Recognition☆174Oct 19, 2022Updated 3 years ago
- Temporarily remove unused tokens during training to save RAM and improve speed.☆23Jun 15, 2025Updated 8 months ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource…☆26Feb 16, 2026Updated 2 weeks ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018)☆17Jul 16, 2024Updated last year
- The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification, EMNLP-Findings 2020.☆18Aug 27, 2021Updated 4 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆28Apr 17, 2024Updated last year
- A software for transferring pre-trained English models to foreign languages☆19Mar 20, 2023Updated 2 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆39Mar 8, 2022Updated 3 years ago
- Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer" https://arxiv.org/pdf/2112.14569.pdf☆20Dec 28, 2021Updated 4 years ago
- Transparenzranking.de compares all of Germany's transparency regulations☆12Nov 22, 2023Updated 2 years ago
- NLP Examples using the 🤗 libraries☆40Feb 21, 2021Updated 5 years ago
- ☆28Sep 13, 2022Updated 3 years ago
- A Streamlit app to add structured tags to a dataset card☆22Jun 30, 2022Updated 3 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019☆25Jul 18, 2019Updated 6 years ago