stefan-it/europeana-bert

BERT and ELECTRA models trained on Europeana Newspapers

☆39

Alternatives and similar repositories for europeana-bert

Users who are interested in europeana-bert are comparing it to the repositories listed below.

stefan-it / gc4lm
GC4LM: A Colossal (Biased) language model for German
☆13 · May 2, 2021 · Updated 5 years ago
dbmdz / historic-ner
Repository for "Towards Robust Named Entity Recognition for Historic German"
☆18 · Dec 11, 2020 · Updated 5 years ago
dbmdz / berts
DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models
☆158 · Dec 6, 2022 · Updated 3 years ago
German-NLP-Group / german-transformer-training
Plan and train German transformer models.
☆23 · Feb 22, 2021 · Updated 5 years ago
stefan-it / german-gpt2
German GPT-2 model
☆32 · Aug 17, 2021 · Updated 4 years ago
gooofy / transformer-lm
Transformer language model (GPT-2) with sentencepiece tokenizer
☆10 · Oct 15, 2019 · Updated 6 years ago
UB-Mannheim / GTCheck
Check your modified Ground Truth files with visual support!
☆10 · Jan 31, 2024 · Updated 2 years ago
JKamlah / tesseractXplore
tesseractXplore a tesseract ease of use gui with full control
☆26 · Nov 10, 2021 · Updated 4 years ago
impresso / CLEF-HIPE-2020
Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at…
☆21 · Aug 1, 2024 · Updated last year
hnesk / browse-ocrd
An extensible viewer for OCR-D mets.xml files
☆23 · May 30, 2024 · Updated 2 years ago
bertsky / ocrd_publaynet
convert PubLayNet data into METS/PAGE-XML
☆10 · Mar 17, 2020 · Updated 6 years ago
qurator-spk / mods4pandas
Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis
☆15 · Aug 21, 2025 · Updated 11 months ago
tonianelope / Multilingual-BERT
Investigating multilingual language models (BERT) by using them for NER in German and English
☆14 · Apr 30, 2019 · Updated 7 years ago
dbmdz / clef-hipe
Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions
☆20 · Mar 27, 2023 · Updated 3 years ago
ulb-sachsen-anhalt / ulb-zeitungsprojekt-hp1
Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen"
☆12 · Dec 17, 2021 · Updated 4 years ago
t-systems-on-site-services-gmbh / german-wikipedia-text-corpus
This is a german text corpus from Wikipedia. It is cleaned, preprocessed and sentence splitted. It's purpose is to train NLP embeddings l…
☆23 · Feb 22, 2022 · Updated 4 years ago
LEL-A / GerAlpacaDataCleaned
German Alpaca Dataset (Cleaned + Translated)
☆26 · Apr 6, 2023 · Updated 3 years ago
qurator-spk / sbb_ner
Named Entity Recognition
☆19 · Feb 13, 2026 · Updated 5 months ago
andbue / nashi
Some bits of javascript to transcribe scanned pages using PageXML
☆17 · May 27, 2026 · Updated 2 months ago
dhlab-epfl / dhSegment-text
Fork of dhSegment for experiments on visual and textual feature combination.
☆15 · Jan 30, 2021 · Updated 5 years ago
OCR-D / page-to-alto
Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)
☆17 · Jun 5, 2026 · Updated last month
UB-Mannheim / AustrianNewspapers
NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911)
☆18 · Oct 31, 2025 · Updated 8 months ago
manueltonneau / covid-berts
BERT models pretrained on the CORD-19 Kaggle dataset
☆15 · Jun 8, 2020 · Updated 6 years ago
tsproisl / SoMaJo
A tokenizer and sentence splitter for German and English web and social media texts.
☆153 · Dec 9, 2024 · Updated last year
impresso / named-entity-tutorial-dh2019
Tutorial on NE processing for Digital Humanities - DH Utrech 2019
☆24 · Jul 18, 2019 · Updated 7 years ago
mounicam / hashtag_master
HashtagMaster: Segmentation tool for hashtags
☆12 · Oct 27, 2020 · Updated 5 years ago
iFede94 / ArtDL
☆10 · Dec 22, 2022 · Updated 3 years ago
krksgbr / glyphcollector
☆64 · Jan 4, 2023 · Updated 3 years ago
yamac-kurtulus / Windows-Docker-Images
Some Windows images for tool images that I had to use in a Windows Environment.
☆10 · Sep 27, 2020 · Updated 5 years ago
hipe-eval / HIPE-2022-data
Data for the HIPE 2022 shared task.
☆23 · May 15, 2026 · Updated 2 months ago
t-systems-on-site-services-gmbh / german-elmo-model
This is a german ELMo deep contextualized word representation. It is trained on a special German Wikipedia Text Corpus.
☆28 · Dec 15, 2019 · Updated 6 years ago
UB-Mannheim / Fibeln
Transkriptionen von Fibeln (19. Jahrhundert)
☆11 · Oct 31, 2025 · Updated 8 months ago
gitabtion / ConvBert-PyTorch
🤗 An unofficial PyTorch implementation of ConvBert based on huggingface/transformers.
☆17 · Oct 6, 2022 · Updated 3 years ago
Living-with-machines / genre-classification
Jupyter book showing how to build an ML powered book genre classifier
☆13 · Oct 16, 2024 · Updated last year
WHaverals / CERberus
CERberus -- guardian against character errors
☆30 · Jul 3, 2026 · Updated 3 weeks ago
edobobo / p-lightning-template
☆36 · Mar 26, 2022 · Updated 4 years ago
filak / hOCR-to-ALTO
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
☆60 · Mar 20, 2026 · Updated 4 months ago
mauvilsa / nw-page-editor
Simple app for visual editing of Page XML files
☆32 · Sep 25, 2025 · Updated 10 months ago
OCR-D / ocrd_pagetopdf
OCR-D wrapper for prima-pagetopdf
☆10 · Oct 30, 2025 · Updated 8 months ago