mainlp/germanic-lrl-corpora

Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource Languages and Dialects" (Blaschke et al., NoDaLiDa 2023).

☆28

Alternatives and similar repositories for germanic-lrl-corpora

Users who are interested in germanic-lrl-corpora are comparing it to the repositories listed below.

dmg-illc / JUDGE-BENCH
☆40 · Jul 24, 2025 · Updated last year
mainlp / CrossRE
CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022)
☆49 · Aug 20, 2024 · Updated last year
cadia-lvl / icelandic-NLP-resources
Overview of Icelandic NLP resources at a glance
☆18 · Jun 20, 2024 · Updated 2 years ago
jjzha / cartography-al
Code base for the EMNLP 2021 Findings paper: Cartography Active Learning
☆14 · Jun 3, 2025 · Updated last year
Babelscape / echoes-from-alexandria
This repository provides the source code used to automatically generate the book summarization datasets described in the paper titled "Ec…
☆10 · Apr 14, 2025 · Updated last year
cisnlp / GlotWeb
[WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages
☆17 · Apr 14, 2026 · Updated 3 months ago
zentrum-lexikographie / dwdsmor
SFST/SMOR/DWDS-based German Morphology
☆21 · Jun 25, 2026 · Updated last month
emorynlp / seq2seq-corenlp
☆13 · Feb 7, 2023 · Updated 3 years ago
stefan-it / gc4lm
GC4LM: A Colossal (Biased) language model for German
☆13 · May 2, 2021 · Updated 5 years ago
cisnlp / mPLM-Sim
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
☆11 · Jan 19, 2024 · Updated 2 years ago
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12 · May 17, 2024 · Updated 2 years ago
mainlp / awesome-human-label-variation
A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, …
☆102 · Apr 15, 2024 · Updated 2 years ago
hectormartinez / ud_unsup_parser
☆22 · Jun 22, 2022 · Updated 4 years ago
boschresearch / adversarial_meta_embeddings
Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"
☆13 · Dec 14, 2021 · Updated 4 years ago
LuisaMaerz / KnowMAN
KnowMAN: Weakly Supervised Multinomial Adversarial Networks
☆12 · Nov 9, 2021 · Updated 4 years ago
machamp-nlp / machamp
Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/
☆91 · Jun 3, 2026 · Updated last month
Kaleidophon / awesome-experimental-standards-deep-learning
Repository collecting resources and best practices to improve experimental rigour in deep learning research.
☆27 · Mar 30, 2023 · Updated 3 years ago
UKPLab / nessie
Automatically detect errors in annotated corpora.
☆48 · Sep 8, 2023 · Updated 2 years ago
iLanguage / iLanguage
A semi-unsupervised language independent morphological analyzer useful for stemming unknown language text, or getting a rough estimate of…
☆22 · Nov 28, 2017 · Updated 8 years ago
cambridgeltl / ACL2022_tutorial_multilingual_dialogue
Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022
☆14 · May 21, 2022 · Updated 4 years ago
czcorpus / InterText_editor
Editor for aligned parallel texts (personal desktop application).
☆20 · Jan 15, 2026 · Updated 6 months ago
mittagessen / curt
☆15 · Jul 11, 2022 · Updated 4 years ago
Gersigno / Hangman-Javascript
A simple JavaScript "pendu" (Hangman) game with an HTML/CSS interface and different difficulties (French words dictionary)
☆12 · Apr 5, 2025 · Updated last year
coastalcph / rungsted
Fast structured perceptron sequential labeler
☆15 · Dec 8, 2015 · Updated 10 years ago
laurieburchell / open-lid-dataset
Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)
☆77 · Apr 1, 2025 · Updated last year
cisnlp / GlotScript
[LREC 2024] 🖋 Resource and Tool for Writing System Identification
☆22 · Mar 29, 2026 · Updated 3 months ago
simonasnow / MultilingualPerspectivistNLU
☆10 · May 30, 2024 · Updated 2 years ago
hfst / compmorph-course
Jupyter notebooks for course "Computational Morphology with HFST".
☆21 · Oct 5, 2022 · Updated 3 years ago
cisnlp / multypo
A Multilingual Keyboard Layout-Based Typo Generator
☆17 · Nov 23, 2025 · Updated 8 months ago
gautierdag / tokenizer-bench
Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"
☆22 · Feb 14, 2024 · Updated 2 years ago
mideind / GreynirCorrect
Spelling and grammar correction for Icelandic
☆18 · Dec 12, 2025 · Updated 7 months ago
noklesta / The-Oslo-Bergen-Tagger
Morphosyntactic tagger for Norwegian Bokmål and Nynorsk
☆29 · Jun 20, 2023 · Updated 3 years ago
cisnlp / GlotCC
[NeurIPS 2024] 🕸 GlotCC Dataset and Pipeline
☆21 · Apr 6, 2025 · Updated last year
dmg-illc / uid-dialogue
A repository for the EMNLP 2021 paper "Is Information Density Uniform in Task-Oriented Dialogues?" and for the CoNLL 2021 paper "Analysin…
☆10 · Jun 17, 2024 · Updated 2 years ago
chatnoir-eu / web-content-extraction-benchmark
Web Content Extraction Benchmark
☆28 · Dec 16, 2025 · Updated 7 months ago
CoEDL / vad-sli-asr
A pipeline to isolate and transcribe one language in mixed-language speech
☆20 · Oct 25, 2022 · Updated 3 years ago
jkallini / mission-impossible-language-models
Code repository for the paper "Mission: Impossible Language Models."
☆56 · Sep 25, 2025 · Updated 10 months ago
impresso / named-entity-tutorial-dh2019
Tutorial on NE processing for Digital Humanities - DH Utrecht 2019
☆24 · Jul 18, 2019 · Updated 7 years ago
JungeAlexander / cocoscore
CoCoScore: context-aware co-occurrence scores for text mining applications
☆20 · Mar 30, 2019 · Updated 7 years ago