malteos/clp-transfer

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

☆30

Alternatives and similar repositories for clp-transfer

Users who are interested in clp-transfer are comparing it to the repositories listed below.

leonweber / pedl
Search the biomedical literature for protein interactions and protein associations
☆11 · Nov 24, 2023 · Updated 2 years ago
konstantinjdobler / focus
[EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
☆37 · Jun 7, 2025 · Updated last year
uds-lsv / NoisyNER
A dataset for realistic evaluation of noisy label methods
☆15 · Dec 3, 2023 · Updated 2 years ago
pawel-bujnowski / smiler
SMiLER - Samsung MultiLingual Entity and Relation Extraction dataset
☆18 · Feb 11, 2021 · Updated 5 years ago
alexa / ramen
A software for transferring pre-trained English models to foreign languages
☆20 · Mar 20, 2023 · Updated 3 years ago
AnesBenmerzoug / langsfer
A library for language transfer methods and algorithms.
☆16 · Feb 6, 2026 · Updated 5 months ago
mprompting / xlmrprompt
☆11 · Jun 23, 2022 · Updated 4 years ago
uds-lsv / TOKEN-is-a-MASK
Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models"
☆14 · Aug 19, 2022 · Updated 3 years ago
flairNLP / familiarity
Label shift estimation for transfer difficulty with Familiarity.
☆10 · Feb 4, 2025 · Updated last year
dennlinger / klexikon
Klexikon: A German Dataset for Joint Summarization and Simplification
☆17 · Oct 5, 2022 · Updated 3 years ago
StephAO / olfmlm
☆18 · Nov 25, 2022 · Updated 3 years ago
UKPLab / acl2024-triple-encoders
triple-encoders is a library for contextualizing distributed Sentence Transformers representations.
☆15 · Sep 3, 2024 · Updated last year
cisnlp / MEXA
[ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated last year
krangelie / bias-in-german-nlg
Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers.
☆16 · Sep 25, 2024 · Updated last year
wietsedv / xpos
Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022)
☆19 · May 17, 2022 · Updated 4 years ago
joernhees / userdocker
Allow admins to grant restricted docker/nvidia-docker commandline access to users.
☆19 · Jun 11, 2021 · Updated 5 years ago
DFKI-NLP / MobIE
[Konvens21] This repository contains the DFKI MobIE Corpus, a dataset of 3,232 German-language documents that have been annotated with fi…
☆12 · Sep 17, 2024 · Updated last year
gautierdag / tokenizer-bench
Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"
☆22 · Feb 14, 2024 · Updated 2 years ago
nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆20 · Feb 2, 2025 · Updated last year
McGill-NLP / latent-translation
Code for the paper "Modelling Latent Translations for Cross-Lingual Transfer"
☆17 · Nov 22, 2021 · Updated 4 years ago
bminixhofer / zett
Code for Zero-Shot Tokenizer Transfer
☆145 · Jan 14, 2025 · Updated last year
visinf / cos-cvae
Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)
☆37 · May 16, 2022 · Updated 4 years ago
stefan-it / ukrainian-electra
Ukrainian ELECTRA model
☆12 · Mar 11, 2023 · Updated 3 years ago
ClimSocAna / tecb-de
German Text Embedding Clustering Benchmark
☆19 · Mar 15, 2024 · Updated 2 years ago
bltlab / mot
Multilingual Open Text
☆26 · May 8, 2025 · Updated last year
trusthlt / eacl24-german-legal-questions
Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24
☆16 · Mar 2, 2024 · Updated 2 years ago
mrpeerat / SCT
SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)
☆16 · Jul 27, 2024 · Updated last year
malteos / awesome-prompt-optimization
A curated collection of resources for prompt engineering, optimization, and automatic prompt generation across text, image, video, and mu…
☆18 · Sep 24, 2025 · Updated 10 months ago
openlegaldata / legal-reference-extraction
Legal Reference Extraction
☆49 · Jun 15, 2026 · Updated last month
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 · Aug 15, 2023 · Updated 2 years ago
JoelNiklaus / LawInstruct
This repository is a collection of legal instruction datasets
☆28 · Jul 12, 2024 · Updated 2 years ago
malteos / awesome-anonymization-for-llms
A collection of resources for PII detection, anonymization, privacy-preserving techniques, and GDPR compliance in Large Language Model (L…
☆19 · Sep 24, 2025 · Updated 10 months ago
jjzha / cartography-al
Code base for the EMNLP 2021 Findings paper: Cartography Active Learning
☆14 · Jun 3, 2025 · Updated last year
gucci-j / light-transformer-emnlp2021
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
☆34 · Nov 21, 2021 · Updated 4 years ago
UKPLab / acl2020-interactive-entity-linking
☆33 · Sep 7, 2023 · Updated 2 years ago
ArneBinder / pytorch-ie
PyTorch-IE: State-of-the-art Information Extraction in PyTorch
☆76 · Sep 24, 2025 · Updated 10 months ago
discourse-lab / DiscourseSegmenter
A collection of various discourse segmenters
☆10 · Jun 30, 2017 · Updated 9 years ago
HKUNLP / multilingual-transfer
Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability"
☆15 · Jun 13, 2023 · Updated 3 years ago
mshukor / eP-ALM
[ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.
☆27 · Oct 27, 2023 · Updated 2 years ago