Software for transferring pre-trained English models to foreign languages
☆19 Mar 20, 2023 Updated 2 years ago
Alternatives and similar repositories for ramen
Users who are interested in ramen are comparing it to the libraries listed below
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 Sep 17, 2021 Updated 4 years ago
- ☆10 Dec 17, 2020 Updated 5 years ago
- Reasoning-based Evaluation and Ranking of Translations. ☆19 Jul 18, 2025 Updated 7 months ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 Jan 25, 2023 Updated 3 years ago
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24 ☆14 Mar 2, 2024 Updated 2 years ago
- Research into identifying and correcting incorrect labels in the CoNLL-2003 corpus. ☆12 May 11, 2021 Updated 4 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆28 Oct 3, 2021 Updated 4 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆88 Sep 12, 2024 Updated last year
- Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper. ☆32 Sep 26, 2023 Updated 2 years ago
- This repository contains the code for applying One-Token Approximation to a pretrained language model using subword-level tokenization. ☆11 May 7, 2020 Updated 5 years ago
- docker for HF wav2vec2-sprint ☆13 Mar 26, 2021 Updated 4 years ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 Jul 18, 2024 Updated last year
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 Dec 5, 2022 Updated 3 years ago
- ☆19 Sep 16, 2025 Updated 5 months ago
- ☆16 Dec 14, 2022 Updated 3 years ago
- German Text Embedding Clustering Benchmark ☆18 Mar 15, 2024 Updated last year
- ☆13 Dec 17, 2021 Updated 4 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Jul 28, 2022 Updated 3 years ago
- ☆13 Nov 11, 2022 Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 May 2, 2021 Updated 4 years ago
- ☆15 Jun 14, 2024 Updated last year
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 Apr 30, 2023 Updated 2 years ago
- Analyzing mBERT's multilinguality in a small laboratory setting ☆13 Jun 12, 2023 Updated 2 years ago
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition ☆15 Jun 28, 2023 Updated 2 years ago
- Exploring semantic similarities between contextualized embeddings ☆14 May 18, 2021 Updated 4 years ago
- German GPT-2 model ☆32 Aug 17, 2021 Updated 4 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 Mar 14, 2022 Updated 3 years ago
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022) ☆19 May 17, 2022 Updated 3 years ago
- A web interface to understand language-specific BERT-models ☆18 Apr 16, 2024 Updated last year
- CD20200004 from 01/01/2021 to 31/12/2023 - LIG UGA - Python Notebook and Models for the MT Lab @ ALPS 2022 ☆13 Apr 1, 2024 Updated last year
- PyTorch source code of NAACL 2021 paper "Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Tran… ☆18 Oct 18, 2022 Updated 3 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 Apr 17, 2024 Updated last year
- Fine-tuned Transformers compatible BERT models for Sequence Tagging ☆40 Jul 17, 2020 Updated 5 years ago
- Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer" https://arxiv.org/pdf/2112.14569.pdf ☆20 Dec 28, 2021 Updated 4 years ago
- ☆20 Jan 16, 2024 Updated 2 years ago
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. ☆24 Sep 24, 2023 Updated 2 years ago
- This is a German ELMo deep contextualized word representation. It is trained on a special German Wikipedia Text Corpus. ☆28 Dec 15, 2019 Updated 6 years ago
- Annotated corpus + evaluation metrics for text anonymisation ☆71 Jan 19, 2026 Updated last month
- Hans-Georg Maaßen and the Retweets ☆21 Aug 26, 2019 Updated 6 years ago